Logbook CameraCommissioning ORM Oct18
Enter comments in reverse time order
Glossary at the end!
Date | Actor/Author | Action summary | Comments | Documents | |
---|---|---|---|---|---|
2018-12-04 | Léa, Seiya, Shunsuke, Satoshi | Caco-ECC error/undefined state |
- First power-up: ECC went to error state whereas Caco was fine.
- For one of the power-ups, in which we stayed 1h30 in ready mode, ECC went to error and Caco to undefined (we don't know in which order). ECC went to state 1 after disabling the _error_heart_bit, but then there was no current in the busbar -> hard reset. Then it was impossible to get Caco back into a normal mode, even after ECC went to state 1 -> kill and restart Cacolauncher...
- After the second hard reset the fan was off, so we had to do another hard reset... Then everything was OK. | ||
2018-12-04 | Léa, Seiya, Shunsuke, Satoshi | TIB 255 issue fixed |
- We forgot to call the reset() method between each initialization of the modules. Now it works fine. | ||
2018-12-04 | Léa, Seiya, Shunsuke, Satoshi | start up |
3) All modules ON and all pixels ON (data taking, but then TIB state 255)
4) All modules ON and all pixels ON (data taking, but then TIB state 255)
5) All modules ON and all pixels ON (Caco went to undefined and ECC to error state, we don't know in which order). For this start-up we stayed one and a half hours in ready mode
7) All modules ON and all pixels ON
| ||
2018-12-04 | Léa, Seiya, Shunsuke, Satoshi | start up |
1) SwitchOn() from Caco: Caco went to state 2 (normal), but ECC went to state 4 -> hard reset
2) Caco and ECC fine but one module OFF: 10.1.6.28 -> OFF/ON again
6) ECC went to error state and Caco to undefined state (we don't know in which order...). Before the hard reset, we tried to go to state ready directly from ECC, but the current in the bus bars was 0 -> hard reset. Then the fan speeds were at 0, so hard reset again. Then it worked fine. | ||
2018-12-03 | Dirk | CamerasToACTL v1.7 |
Installed on tcs03 and tcs04 from repository (https://cta.cppm.in2p3.fr/repo/x86_64/) and tested with/by Léa and Seiya. | ||
2018-12-03 | Léa,Seiya | increase of the bus bars current |
Due to the BP reset while the EVB was still connected to the modules, the current increased to 40. It is now known that the current increases when the DAQ is connected and no clock is distributed... We still don't know why; under study! | ||
2018-12-03 | Léa, Seiya | Data Taking |
- TP synchronised with legacy and EVB data - Too many files created compared to the number of ZFW instances | ||
2018-12-03 | Léa,Seiya | TIB/UCTS |
- TIB went to state 255 even after reset, so we shut down and switched off the camera... - Again, TIB went to 255 five seconds after reaching state 5, all rates at 1444O. - TIB went to 255 directly from state 4. The impression after one day is that after 3 cycles of TIB going to 5 and then being reset, it goes to 255 and we have to switch off the camera. | ||
2018-12-03 | Léa | Fix UCTS configuration |
- A virtual machine was using the IP 10.4.8.4 of the UCTS... this is why it was not possible to configure it. - I changed it to the IP it should take in the future, 10.1.4.4, and now it works.
| ||
2018-12-03 | Léa | Power up |
- 1) All modules ON - 2) All modules ON - 3) All modules ON - 4) All modules ON - 5) All modules ON - 6) Module 10.1.6.40 OFF - 7) All modules ON - 7) All modules OFF (I think due to the previous increase of the current in the bus bars) - 8) All modules ON | ||
2018-11-30 | Léa | TIB/UCTS |
- TIB remains in state 2 even when UCTS is configured | ||
2018-11-29 | Léa | Data Taking |
- Procedure of TP synchronised in all the modules - EVB configuration file in /home/dragon/EVB/20181130 - First try: EVB connected but one module busy (10.1.6.10) -> initialised the modules again - Second try: EVB connected, no module busy, but TIB remains at state 2 even though the UCTS configuration seems OK -> I switched the camera OFF and ON... - Third try: same as before. Then tried disconnecting the cable from the WR switch to TCS07. Same problem: TIB remains at state 2 when UCTS is configured.
| ||
2018-11-29 | Léa | Power up |
- Fourth startup: ECC and Caco work well, all modules ON - Fifth startup: ECC and Caco work well, all modules ON - Sixth startup: ECC and Caco work well, all modules ON | ||
2018-11-30 | Léa | Power up |
- Second startup: ECC and Caco work well but module 10.1.6.28 was OFF, so I started again - Third startup: ECC and Caco work well, all modules ON. But due to a human error (mine), ECC went to error state and then there was no current in the pulse bar -> hard reset | ||
2018-11-30 | Léa | Power up |
- From Caco, switchON(): it went to its state 2, then good communication with the ECC. Then GetCameraStandby(): ECC was fine and went to state ready, all modules ON, but Caco was in an undefined state, so I did a sleep(); Caco recovered its state ready (state=3) and ECC was still ready. I called sleep() a second time to start from a clean environment and everything was fine: Caco went to state safe, and so did ECC. | ||
2018-11-29 | Léa | Power up |
- After the second hard reset, fan ON and ECC went to ready from Caco. Day finished (-: - All modules ON, configuration for TP synchronisation in all the modules seems fine. EVB segfault in GOTOREADY s | ||
2018-11-29 | Léa | Fan Off |
Following the hard reset after ECC went to error, the fans were down again, as yesterday morning... It is a heart_beat problem between the PDB and ECC - Second hard reset | ||
2018-11-29 | Léa | Power up |
1) Power up from Caco, powerON and GetCameraStandby(): ECC to Ready and all modules ON. Monitoring issue, so went back to safe until it is fixed
2) From Caco, SwitchON, then ECC goes to error state 4 with _error_heart_bit at 4 without any clear reason. Caco was OK in state 2
3) Fixed the _error_heart_bit issue of ECC and tried again to switch on from Caco. Same issue: Caco state fine but ECC went to error state 4 due to _error_heart_beat at true.
4) Fixed _error_heart_beat and tried to switch on directly from ECC. Works fine, ECC went to Ready, but no current in the pulse bar; only the 4th one had current. I did a hard reset | ||
2018-11-29 | Léa | WR switch |
RJ45 port installed on port 9 of the WR switch for the connection to tcs07 | ||
2018-11-28 | Daniel K., Léa | test of cluscolauncher |
We tried the connection between Caco and Clusco: all fine. The current monitoring was not active because not the same files were updated. Will be fixed soon and then tested again. | ||
2018-11-28 | Daniel K., Léa | fans stopped |
This morning around 8:45am the fans stopped running, before we arrived on site. When we arrived we noticed the ECC was still in safe state (we expected error state, but that was not the case). We checked the rest of the ECC variables and everything looked fine. Using a multimeter we checked that the 400 V was properly arriving at the PDB inside the camera. We contacted the ECC experts, who asked for screenshots of the ECC datapoints regarding the PDB for later evaluation of the problem. Then we hard reset the ECC and the fans started just fine.
| ||
2018-11-27 | Daniel K., Cristobal (remote) | fix of a compilation problem for ClusCo |
Small fix for compilation, tested and merged to the master branch. Compilation on site works again. | ||
2018-11-27 | Daniel K. | test of new ECC version |
Follow-up, more extensive tests of the control of the individual intelligent relays with the new version of the ECC. No improvement. A detailed description of the tests performed will be emailed to the experts. As a consequence the old ECC version was reinstalled for now. | ||
2018-11-27 | Daniel K., Otger (remote) | installation of new version of libcluster |
Following the successful test of last week, the libcluster fixes were merged into the master branch and installed on site. | ||
2018-11-26 | Daniel K. | test of new ECC version |
After some small fixes of the data point "Error description" and of the fan control, the new ECC version was tested. The control of the individual intelligent relays (main update with this version) was unstable. As a consequence the old ECC version was reinstalled for now. | ||
2018-11-23 | Daniel K., Yuki, Seiya | too high temperature |
The status of ECC monitoring went from "green" to "red" around 16:30. The temperature change we were monitoring was quite different from usual. It may be related to the water pressure of the chiller: it is above 1 and stable as usual, but it was too low in the morning and rose during the day. |
Media:bptemp.JPG Media:Tempbad.JPG Media:ECCTemp.png Media:CameraPressure14-16.png | |
2018-11-23 | Daniel K., Yuki, Seiya | some network interface of osaka sometimes not running |
Some network interfaces of the osaka server do not start running at first every day... We activated p1p2 manually.
|
||
2018-11-23 | Daniel K., Yuki, Seiya | bad behaviour of mezzanine |
After configuration of the modules (init7 & pulse_injection_all), three modules showed bad mezzanine behaviour.
|
||
2018-11-23 | Daniel K., Yuki, Seiya | take data for TP synchronization |
We took data with ClusCo monitoring for the test pulse synchronization. 1) 300Hz, ROI=1024, trigger was generated by mod265, 3000events
2) 300Hz, ROI=40, trigger was generated by all modules, 3000events
|
||
2018-11-23 | Daniel K., Yuki, Seiya | validation test of ClusCo |
- the strange value of humidity - SiTCP reset
|
||
2018-11-23 | Daniel K., Yuki, Seiya | new version of ECC |
We implemented the new version of ECC. After reboot of ECC the fan didn't start working.
As a result we decided to replace it with the current version of ECC. After reboot of ECC, all functions worked well. |
||
2018-11-23 | Daniel K., Yuki, Seiya | monitoring plots were not updated |
ECC monitoring plots were not updated after 9:30am. We could still get the various values (temperature etc.) in the OPCUA client; only the monitoring plots were not updated. After the reboot of ECC for the ECC version update, the monitoring plots started to be updated again. |
||
2018-11-22 | Yuki, Seiya | take data for TP synchronization study |
I discussed with Taka, then I tried to take data as below;
We took data with the following conditions and managed to synchronize test pulse at all modules finally. 0)
1) 22Hz, ROI=1024, trigger was generated by mod265, 1000events
2) 300Hz, ROI=1024, trigger was generated by mod1, 3000events
|
||
2018-11-21 | Seiya | home directory of osaka server was full |
The home directory of osaka (/home) became full today.
Osaka ~ > df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/scientific-root 50G 22G 29G 43% /
devtmpfs 252G 0 252G 0% /dev
tmpfs 252G 0 252G 0% /dev/shm
tmpfs 252G 50M 252G 1% /run
tmpfs 252G 0 252G 0% /sys/fs/cgroup
/dev/sdb 15T 8.5T 5.3T 62% /mnt/cs1
/dev/sda1 497M 272M 226M 55% /boot
/dev/mapper/scientific-home 504G 504G 20K 100% /home
tmpfs 51G 12K 51G 1% /run/user/42
tmpfs 51G 4.0K 51G 1% /run/user/1000
tmpfs 51G 0 51G 0% /run/user/1001
tmpfs 51G 0 51G 0% /run/user/1002
Almost all of the files (~80%) are data taken by LegacyDAQ for the tests, in /home/dragon/IACMiniCamSetup/DragonDaqM:
Osaka DragonDaqM > du -sh .
417G
So I moved the data taken by LegacyDAQ to /mnt/cs1/store/DragonDaqData temporarily. (We could transfer those data to the Lustre system (/fefs/ on tcs) later.) | ||
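A minimal sketch of how such a situation could be spotted earlier, using only the Python standard library. The 90% threshold and the function names are assumptions for illustration, not part of the existing setup; the value may also differ slightly from the Use% column of df.

```python
import shutil


def usage_percent(path):
    """Return how full the filesystem holding `path` is, in percent."""
    total, used, _free = shutil.disk_usage(path)
    return 100.0 * used / total


def nearly_full(path, threshold_percent=90.0):
    """True if the filesystem holding `path` is at or above the threshold.

    The default threshold is an arbitrary choice for this sketch.
    """
    return usage_percent(path) >= threshold_percent


if __name__ == "__main__":
    # "/home" is the partition that filled up in the entry above.
    for mount in ("/", "/home"):
        print(mount, f"{usage_percent(mount):.0f}%", nearly_full(mount))
```

Run before a long LegacyDAQ acquisition, a check like this would flag /home before the run fills it.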
2018-11-21 | Daniel K., Seiya | take data with LegacyDAQ for EVB tests |
Julien wants to use raw data of the full camera for EVB debug tests. We took data with LegacyDAQ using a random trigger (300Hz), i.e. the digital pedestal trigger generated by the TIB. These files are in /mnt/cs1/store/DragonDaqData/Data20181121. I wanted to take 30 min of data (300Hz*(60*30)=540,000 events), but the disk of the osaka server became full during the test. The size of each file is ~219MB, which is equivalent to ~168,000 events and ~10 min of data.
| ||
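The event-count arithmetic in this entry can be checked with a few lines. This is only a worked example of the numbers quoted above (300 Hz, 30 min, ~168,000 events per file); the function names are made up for the illustration.

```python
def expected_events(rate_hz, duration_s):
    """Number of triggers expected at a constant rate."""
    return rate_hz * duration_s


def duration_minutes(n_events, rate_hz):
    """Time needed to record n_events at a constant rate, in minutes."""
    return n_events / rate_hz / 60.0


planned = expected_events(300, 30 * 60)        # planned 30 min run
per_file_min = duration_minutes(168_000, 300)  # what one ~219MB file covers

print(planned)                  # 540000
print(round(per_file_min, 1))   # 9.3
```

So the ~10 min quoted per file corresponds to about 9.3 min at exactly 300 Hz.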
2018-11-21 | Seiya | how to run again the network interface |
Some network interfaces of the osaka server sometimes stop running.
p2p2: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
inet 10.1.6.192 netmask 255.255.255.128 broadcast 10.1.6.255
inet6 fe80::a236:9fff:fef0:ccd6 prefixlen 64 scopeid 0x20<link>
ether a0:36:9f:f0:cc:d6 txqueuelen 1000 (Ethernet)
RX packets 68478703 bytes 95858361688 (89.2 GiB)
RX errors 1 dropped 9 overruns 0 frame 1
TX packets 30112278 bytes 1622848602 (1.5 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
When this happens, we should do the following to get it running again;
| ||
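The symptom in the output above is the missing RUNNING flag: flags=4099 decodes to UP, BROADCAST and MULTICAST only. A small sketch of that decoding, using the standard Linux interface flag bits (only the four bits relevant here are listed; the function names are illustrative):

```python
# Linux interface flag bits (subset of the IFF_* values in <linux/if.h>).
IFF_FLAGS = {
    0x1: "UP",
    0x2: "BROADCAST",
    0x40: "RUNNING",
    0x1000: "MULTICAST",
}


def decode_flags(value):
    """Return the names of the known flag bits set in `value`."""
    return [name for bit, name in sorted(IFF_FLAGS.items()) if value & bit]


def is_running(value):
    """True if the RUNNING bit is set."""
    return bool(value & 0x40)


print(decode_flags(4099), is_running(4099))  # interface as seen in the log
print(decode_flags(4163), is_running(4163))  # same interface with RUNNING set
```

This only diagnoses the state; it does not bring the interface back up.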
2018-11-21 | Otger, Daniel K., Seiya | ECC went to error state |
I used ClusCo@tcs01 for the monitoring; all the plots except "Amp. Temp" were indeed updated. After that, I did init7 from ClusCo@cacooperator and waited for the update of the "Amp. Temp" plot. At that time, ClusCo@tcs01 showed a timeout, and I realized that I could not ping these modules, that the relay current had gone to 0, and that the ECC state had gone to error (4). I powered up again and the ECC status went to 2 (ready) as usual, but the relay current was 0.
Taka explained why the relay current was still 0 as below; When ECC goes to Error state, relay modules are also in a strange state. You need to reset relay modules as well. However, even if you go to "safe" state in ECC, relays are still powered (not bus bars, but relay modules). That means, "safe" does not reset relays.
Lea explained why ECC went to error as below; Maybe what is possible also is that you lost the slow control connection during few seconds and then get it back without realising. Then If the modules are ON and that we lost the slow control connection, ECC goes to error and the relay current will remain at 0 as Taka explained.
We did a hardware reset three times (15:00, 15:50, 16:45), but the situation was the same. This ECC error state seems to be caused by the loss of the heartbeat of CaCo. We got by without CaCo (using ECC directly) for data taking today.
1228269 [PublishTask-com.prosysopc.ua.client.UaClient@166f6c4f] ERROR com.prosysopc.ua.client.UaClient - Exception in ServerStatusListener
java.lang.ClassCastException: cat.ifae.cta.opcua.dataaccess.uaobjects.OPCUAVariable$DataInformation cannot be cast to java.lang.Integer
at cat.ifae.cta.cameracontrol.server.base.clients.ecc.OPCUAECCControl$ECCVariableStatus.update(OPCUAECCControl.java:25)
at java.util.Observable.notifyObservers(Observable.java:159)
at cat.ifae.cta.opcua.dataaccess.basicobjects.BasicCallbackVariable$ObservableVariable.setValue(BasicCallbackVariable.java:36)
at cat.ifae.cta.opcua.dataaccess.uaobjects.OPCUAAssembly._newStateWarn(OPCUAAssembly.java:533)
at cat.ifae.cta.opcua.dataaccess.uaobjects.OPCUAAssembly.consumeMessage(OPCUAAssembly.java:526)
at cat.ifae.cta.opcua.dataaccess.uaobjects.OPCUAServerStatusListener.statusChanged(OPCUAServerStatusListener.java:59)
at cat.ifae.cta.opcua.dataaccess.uaobjects.OPCUAServerStatusListener.onStateChange(OPCUAServerStatusListener.java:33)
at com.prosysopc.ua.client.UaClient.a(Unknown Source)
at com.prosysopc.ua.client.UaClient.updateServerStatus(Unknown Source)
at com.prosysopc.ua.client.UaClient$a.run(Unknown Source)
at java.lang.Thread.run(Thread.java:745)
1228371 [PublishTask-com.prosysopc.ua.client.UaClient@166f6c4f] WARN com.prosysopc.ua.client.Subscription - Server sent a previously acknowledged sequence number 0 for Subscription 47786
1228372 [PublishTask-com.prosysopc.ua.client.UaClient@166f6c4f] INFO org.opcfoundation.ua.transport.tcp.io.SecureChannelTcp - 47856 Closed
1228372 [PublishTask-com.prosysopc.ua.client.UaClient@166f6c4f] INFO org.opcfoundation.ua.transport.tcp.io.TcpConnection - /10.1.4.66:4841 Closed
1228373 [TcpConnection/Read] INFO org.opcfoundation.ua.transport.tcp.io.TcpConnection - /10.1.4.66:4841 Closed (expected) | ||
2018-11-21 | Otger, Daniel K., Seiya | dhcpd server for TIB restart |
DHCPd server for TIB stopped due to the shutdown of tcs01 yesterday, so we activated the server as below,
[ifae@tcs01 ~]$ sudo service dhcpd status
Redirecting to /bin/systemctl status dhcpd.service
● dhcpd.service - DHCPv4 Server Daemon
Loaded: loaded (/usr/lib/systemd/system/dhcpd.service; disabled; vendor preset: disabled)
Active: inactive (dead)
Docs: man:dhcpd(8)
man:dhcpd.conf(5)
[ifae@tcs01 ~]$ sudo service dhcpd start
Redirecting to /bin/systemctl start dhcpd.service
[ifae@tcs01 ~]$ sudo service dhcpd status
Redirecting to /bin/systemctl status dhcpd.service
● dhcpd.service - DHCPv4 Server Daemon
Loaded: loaded (/usr/lib/systemd/system/dhcpd.service; disabled; vendor preset: disabled)
Active: active (running) since Wed 2018-11-21 09:14:08 WET; 2s ago
Docs: man:dhcpd(8)
man:dhcpd.conf(5)
Main PID: 453 (dhcpd)
Status: "Dispatching packets..."
CGroup: /system.slice/dhcpd.service
└─453 /usr/sbin/dhcpd -f -cf /etc/dhcp/dhcpd.conf -user dhcpd -group dhcpd --no-pid
Nov 21 09:14:08 tcs01 dhcpd[453]: All rights reserved.
Nov 21 09:14:08 tcs01 dhcpd[453]: For info, please visit https://www.isc.org/software/dhcp/
Nov 21 09:14:08 tcs01 dhcpd[453]: Not searching LDAP since ldap-server, ldap-port and ldap-base-dn were not specified in...ig file
Nov 21 09:14:08 tcs01 dhcpd[453]: Wrote 0 deleted host decls to leases file.
Nov 21 09:14:08 tcs01 dhcpd[453]: Wrote 0 new dynamic host decls to leases file.
Nov 21 09:14:08 tcs01 dhcpd[453]: Wrote 4 leases to leases file.
Nov 21 09:14:08 tcs01 dhcpd[453]: Listening on LPF/ens1f0/a0:36:9f:eb:51:34/10.1.0.0/16
Nov 21 09:14:08 tcs01 dhcpd[453]: Sending on LPF/ens1f0/a0:36:9f:eb:51:34/10.1.0.0/16
Nov 21 09:14:08 tcs01 dhcpd[453]: Sending on Socket/fallback/fallback-net
Nov 21 09:14:08 tcs01 systemd[1]: Started DHCPv4 Server Daemon.
Hint: Some lines were ellipsized, use -l to show in full. | ||
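In both status outputs above the unit is reported as "disabled", which in systemd means it is not started automatically at boot; that is consistent with dhcpd being dead after the tcs01 shutdown. The sketch below pulls the relevant fields out of such status text. It parses pasted text only and does not call systemctl; the function name and the sample strings are just for illustration.

```python
import re


def parse_unit_status(text):
    """Extract enablement and activity from `systemctl status` text."""
    loaded = re.search(r"Loaded:\s+\S+\s+\(([^;]+);\s*([^;)]+)", text)
    active = re.search(r"Active:\s+(\S+)\s+\(([^)]+)\)", text)
    return {
        "unit_file": loaded.group(1) if loaded else None,
        "enabled_state": loaded.group(2).strip() if loaded else None,
        "active": active.group(1) if active else None,
        "sub_state": active.group(2) if active else None,
    }


BEFORE = ("Loaded: loaded (/usr/lib/systemd/system/dhcpd.service; disabled; "
          "vendor preset: disabled) Active: inactive (dead)")
AFTER = ("Loaded: loaded (/usr/lib/systemd/system/dhcpd.service; disabled; "
         "vendor preset: disabled) Active: active (running) since Wed "
         "2018-11-21 09:14:08 WET; 2s ago")

print(parse_unit_status(BEFORE))
print(parse_unit_status(AFTER))
```

After the manual start the service is active but still disabled, so the same situation would be expected after the next reboot unless the unit is enabled.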
2018-11-20 | Daniel K., Seiya | ClusCo monitoring restart | The ClusCo monitoring map was not updated after the shutdown of tcs01. We contacted Carlos and Carlos and they restarted it. Now it works. | ||
2018-11-20 | TCS01 shutdown | One of the memory cards of tcs01 is damaged and will be replaced
by an authorized technician today starting 9am La Palma time. We will shutdown the server before that and once the card is exchanged we start it up again. | |||
2018-11-19 | Seiya, Daniel K. | cannot connect with some modules | With configuration 2 (100Hz, ROI=1024) we could not connect to some modules (IP 10.1.6.148-173) and they stayed busy (busy state=1).
After the re-initialization, this problem disappeared. | ||
2018-11-19 | Seiya, Daniel K. | Test pulse data with DragonDaqM(LegacyDAQ) | We took test pulse data with the following conditions;
1) 300Hz, ROI=1024, trigger was generated by mod265 (for reproducing the problem)
2) 100Hz, ROI=1024, trigger was generated by mod265 (suggested by Taka)
3) 300Hz, ROI=1024, trigger was generated by mod265 (suggested by Taka)
4) 100Hz, ROI=1024, trigger was generated by mod265
| ||
2018-11-19 | Seiya, Daniel K. | 24V supply problem | We powered up the camera with the usual procedure, but only one busbar (the 4th one) worked and the others didn't. We tried the procedure again, but the result was the same (only the 4th busbar worked). So we switched the camera breaker off and on around 15:00. The fan didn't start at first, so I switched the breaker off and on again and the fan started to work. After that we could power up the whole camera.
| ||
2018-11-13 | Mitsunari, Daniel K. | Software deployment | All the setup (except the uaexpert for ecc, tib and ucts) to control, monitor and take data with the camera was moved to the LST_CALP iMac (+ 1 screen) of the commissioning container. | ||
2018-11-12 | Mitsunari, Daniel K. | Test pulse data with DragonDaqM | Test pulse data were taken by DragonDaqM triggering by the module 264, which did not have a test pulse on 11-09. | ||
2018-11-12 | Mitsunari, Satoshi | Connect tcs07 to White Rabbit | The WR switch management port and the Management switch (mgtsw2 port 42) are connected by an Ethernet cable. Mitsunari tried to change the IP of the WR switch to 10.200.10.140, which is in VLAN 1001, but failed. The WR interface file dot-config was not found, contrary to the WR manual. Even when we created the file ourselves, it was lost after rebooting. | ||
2018-11-12 | Mitsunari, Daniel K., Carlos Diaz | Software deployment | Installing and compiling caco, cacoconsole, cacogui on tcs01 under /home/ifae/development. Compiling /home/ifae/clusco on tcs01 and adapting monitoring from CIEMAT. Setting up one additional screen for monitoring to the imac (monitoring computer), adding two forms (one for powering on the camera, one for shutting it down) to be filled by the operators. | ||
2018-11-09 | Mitsunari, Daniel K. | Test pulse data with EVB | Test pulse data were taken by EVB waiting PPS reaching all modules for 2 s. For the read depth 40, DAQ seemed to be successful. For the read depth 1024, however, the data were not stored. | ||
2018-11-09 | Mitsunari, Daniel K. | Test pulse data with DragonDaqM | Test pulse data were taken by DragonDaqM waiting PPS reaching all modules for 2 s. The waveform data of six modules besides the central one were checked, and five modules had test pulses though the other module (No. 0) did not. | ||
2018-11-03 | Mitsunari | Test pulse injection timing | Test pulse data were taken with an L1 threshold at which all modules can produce a camera trigger. According to the data, the timing of the test pulse injection is spread over ~70 ms. Test pulse injection rate: 1 Hz, Read depth: 40, Sampling speed: 1 GHz | ||
2018-11-03 | Mitsunari | Test pulse injection timing | Test pulse data were taken with an L1 threshold at which all modules can produce a camera trigger. According to the data, the timing of the test pulse injection is spread over ~70 ms. Test pulse injection rate: 1 Hz, Read depth: 1, Sampling speed: 5 GHz | ||
2018-11-02 | Mitsunari | Test pulse data with EVB | Data for investigating the test pulse issue were taken with EVB, but this seems to have failed. This should be inspected. Pulse rate: 300Hz, Read depth: 1024, Event number: ~9000, /fefs/onsite/data/20181102 | ||
2018-11-01 | Mitsunari | Large data with random trigger | Data of ~10^5 events were taken with the pedestal random trigger, EVB, a read depth of 40 slices, and a delay of 3528 ns. The data are stored in /fefs/onsite/data/20181101.
| ||
2018-11-01 | Mitsunari | Avoiding TIB State 255 | The TIB state can go to 5 without a reset at state 255 by a combination of resetting TIB at state 0 and configuring the Dragons without resetting the BPs.
Mitsunari repeated this procedure four times and succeeded for all of them. DAQ also seemed to be successful at the last trial. (At the first three trials, DAQ failed because of another reason.) | ||
2018-10-31 | Mitsunari | TIB State 255 problem | init7 without BP reset at the beginning was tested. The first trial failed, namely, the state turned out to be 255. However, in the second trial, when TIB was reset just after turning on the camera, the TIB state went directly to 5. This behavior should be confirmed later. | ||
2018-10-31 | Mitsunari | Check for test pulse synchronization | It should be confirmed whether the TenMHz counter value is identical among the modules for each test pulse event. Data for the check were taken by DragonDaqM at 300Hz. The L1 threshold was set so that only the central module sent triggers. The data were stored in /home/dragon/IACMiniCamSetUp/DragonDaqM/Data20181031. The TenMHz counter appeared to be synchronized, but it should be confirmed. | ||
2018-10-31 | Oscar, Mitsunari | PDB Fixation |
PDB fixation: the front plate is now fixed through a screw and nut attached to the back plate with a metal adhesive (Pattex Nural 21), plus an additional nut to fix the front plate. We started the modules twice with a one-hour break in between. Both times all Dragons and BPs went up. | ||
2018-10-30 | Taka, Mitsunari, Julien, Dirk | Random trigger runs with EVB |
Two runs (#30, #31) taken at various trigger rates as documented in Run Catalog and Slack. Corrected pixel map implemented (spiral numbering).
| ||
2018-10-29 | Oscar, Taka, Mitsunari | Power up |
The Dragon with IP 10.1.6.28 (3rd column from the left, seen from outside, 5th module from below) was put on the busbar powered by relay 1 instead of 0. In exchange, the module in the 4th column, 5th from below, was put on relay 0 instead of relay 1. The camera was powered up only once and all modules and BPs went up.
| ||
2018-10-27 | Taka, Mitsunari | Random Trigger |
We took random trigger data. Following the instructions from Lea, the random trigger could be easily produced. With DragonDaqM: 300 Hz injection -> 300 Hz DAQ rate, 1 kHz -> 783 Hz, 3 kHz -> 1162 Hz, 6.5 kHz -> 1303 Hz. With EVB, we first tried 6.5 kHz. EVB then crashed because of a full buffer, but the busy state of the modules was 03, which means EVB was connected and the modules were busy. To recover from this state, we had to reboot the Dragons. A few minutes later, Carlos Diaz called us: the current consumption at the bus bars was ~10 A higher than usual. Normally it is 25-27 A, but after rebooting the Dragons it was 35 A. We shut down the 24V. After 10 min or so, Carlos allowed us to restart. All Dragons could be reached from cacoserver, but not from Osaka. ip link set p*p* down/up didn't help. We rebooted Osaka. Then Osaka could ping all (but one) modules. However, EVB didn't work. Later we learned from Dirk and Julien that we had to do sudo modprobe -r ixgbe; sudo modprobe ixgbe
| ||
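The DragonDaqM numbers quoted in this entry can be turned into a recorded fraction per injection rate. The snippet below is plain arithmetic on those four pairs; the variable and function names are illustrative only.

```python
# (injected rate in Hz, recorded DAQ rate in Hz), as quoted in the entry above.
RATES_HZ = [(300, 300), (1000, 783), (3000, 1162), (6500, 1303)]


def recorded_fraction(injected_hz, recorded_hz):
    """Fraction of injected triggers that ended up in the DAQ."""
    return recorded_hz / injected_hz


for injected, recorded in RATES_HZ:
    frac = recorded_fraction(injected, recorded)
    print(f"{injected:5d} Hz -> {recorded:5d} Hz ({100 * frac:.0f}% recorded)")
```

The recorded rate rises only slowly above 1 kHz (783 Hz, 1162 Hz, 1303 Hz), so the fraction recorded drops from about 78% at 1 kHz to about 20% at 6.5 kHz.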
2018-10-27 | Oscar, Laia , Taka, Mitsunari | Power up |
After checking that the Dragon and BP regulators can stand an input voltage above 30 V, we increased the voltage provided by the Power Supplies to 27.5 V (the same for all 8 Power Supplies). With this configuration, the voltage while ramping up rises to 20.3 V and then only decreases to 19.8 V for about 1 ms. This should be completely fine for the Dragons. We powered up the camera with the ECC 10 times. All BPs went up every time. Only one Dragon (always the same) does not power up the first time after a ~1 hour break (tried two times); after this first power-up all Dragons power up. | ||
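As a rough check of the margin gained, the minimum voltage during the ramp can be compared with the 18 V level mentioned in the 2018-10-25 and 2018-10-26 entries further down. The "before" value is an approximation derived from those entries (a drop of about 1.5 V from about 20 V with the supplies at 24.98 V); the names are illustrative.

```python
UNDERVOLTAGE_LIMIT_V = 18.0  # level the dip should stay above, per the 2018-10-26 entry


def dip_margin(min_voltage_v, limit_v=UNDERVOLTAGE_LIMIT_V):
    """Distance between the lowest voltage during the ramp and the limit, in volts."""
    return round(min_voltage_v - limit_v, 3)


before = dip_margin(20.0 - 1.5)  # approximate dip with supplies at 24.98 V
after = dip_margin(19.8)         # dip measured with supplies at 27.5 V

print(before, after)  # 0.5 1.8
```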
2018-10-26 | Taka, Mitsunari | TIB state machine. |
We tried to solve the "State 255" problem in TIB. Luis Angel suggested configuring the modules at state 2. We followed his instructions, but we reached state 255. So we tried module configuration at state 0. Same result. We tried module configuration at state 4, resulting in the same state 255. We also tried to bring the test pulse position to the center of the readout window, but we could not see the test pulse at all. The delay setting in TIB or backplane is not correct.
| ||
2018-10-26 | Oscar, Laia , Taka, Mitsunari | Power up |
The drop in the voltage is due to a current limit in the circuitry of the relay. Increasing the voltage of the power supplies should raise the value of the dip in the voltage so that it does not reach 18 V. We measured the transients for relay 0 again, with the Power Supply at 24.98 V as reference. We increased the voltage of the Power Supplies to 25.25 V; the dip is about 100 mV higher. | ||
2018-10-25 | Taka, Mitsunari, Yusuke | Event Mixing |
We understood the origin of EventMixing. It is due to the slow control command "Dragon - Start" after "Enable Trigger" in TIB. "Enable Trigger" should have been after "Dragon Start". This is dangerous actually. Mistake will be noticed only during analysis.
| ||
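Since a wrong command order is otherwise only noticed during analysis, a run script could check the order before starting. The sketch below is hypothetical: it assumes the operator's sequence is available as a list of command names, and uses the names from this entry as plain strings.

```python
START = "Dragon Start"     # slow control command, named as in the entry above
ENABLE = "Enable Trigger"  # TIB command, named as in the entry above


def order_is_safe(commands):
    """True if "Dragon Start" is issued before "Enable Trigger".

    A sequence that never enables the trigger is considered safe;
    enabling the trigger without any "Dragon Start" is not.
    """
    if ENABLE not in commands:
        return True
    if START not in commands:
        return False
    return commands.index(START) < commands.index(ENABLE)


print(order_is_safe(["init7", START, ENABLE]))  # True
print(order_is_safe(["init7", ENABLE, START]))  # False: the event-mixing case
```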
2018-10-25 | Oscar , Laia | Power up |
No water was found inside the camera. We measured the voltage at the output of the Redundancy modules: 24.98 V. We connected a current sensor between the master bus bar and relay 0. We powered up relay 0 and measured the transient for both current and voltage: - The voltage shows a drop of around 1.5 V once it reaches 20 V, which is recovered afterwards (4 ms); it then keeps increasing until about 24.5 V - The current increases steadily, with a small slope change when the voltage drop happens. It also shows a drop of about 30% when the voltage reaches 24.5 V, from which it recovers after about 80 ms. The voltage reduction for 4 ms brings the voltage very close to 18 V, and sometimes it may go slightly below. The same is observed in relay 1. | ||
2018-10-21 | Taka, Mitsunari, Yusuke | Timing Calibration. |
We tried to see the test pulse in the center of window. But we did not succeed. DAQ was with EthDisp from Taka's macbook through slow control network. We need to understand the delay in TIB and backplane. Since it was already 5:50 pm, (though we announced that we use camera until 5:00 pm) we had to shutdown. We kept 230 and 400V on, chiller on, only 24V off. | ||
2018-10-21 | Taka, Mitsunari, Yusuke | Event Mixing Test |
To confirm again the event mixing problem, we took data with the LegacyDaq. After init7.uic, we injected the test pulse in the central module with 300 Hz. TIB could see the rate properly. We took 20000 events. After that, we tried to take data with EVB, but it was not successful. EVB could not connect to all modules. We had the same problem a few times in a row. One of the reasons was dead ports in Osaka. Sometimes, ports in Osaka sleep without obvious reason. This is actually critical problem. We need to investigate further. Finally we gave up to take data with EVB. | ||
2018-10-21 | Taka, Mitsunari, Yusuke | TIB/UCTS study |
After power up, we tried to initialize TIB. But state didn't reach "5". After state 4, if we enable trigger, state went to 255. We knew that the RJ45 cable on the WR was damaged by the rack door. We changed it to new cable. We also used a different port in WR (port 8->5).And we reset TIB. Then with the standard procedure, state reached 5. We were happy. Just to be sure, we changed back to the damaged cable and retried. Then state was again 5. So, the reason was not the cable. But "Reset" of TIB was the key. After initializing PMT modules, TIB didn't work well. It didn't send back the trigger. Since temperature was too high (BP 35 deg.) We had to switch off the 24V. During this break, we changed the WR port from 5 to 8. After power up, we repeated the procedure. Again, TIB didn't send back the trigger. But, TIB reset helped. So, currently, startup recipe is that 0->1->2->3->4->255->TIB Reset->0->1->2->3->4->5->configure modules -> TIB Reset -> 0 -> 1 ->2 ->3 ->4 ->5.
| ||
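The startup recipe at the end of this entry is three ramps through states 0 to 4 with a TIB reset between them. Written out as data, it is easier to compare against what was actually done; the helper below only rebuilds the sequence from the entry and its names are illustrative.

```python
RAMP = [0, 1, 2, 3, 4]  # TIB states passed through on every attempt


def startup_recipe():
    """Return the startup sequence described in the entry above as a list of steps."""
    steps = []
    steps += RAMP + [255, "TIB Reset"]                     # first ramp ends in 255
    steps += RAMP + [5, "configure modules", "TIB Reset"]  # second ramp reaches 5
    steps += RAMP + [5]                                    # final ramp after configuring
    return steps


print("->".join(str(step) for step in startup_recipe()))
```

The recipe contains two TIB resets and reaches state 5 twice, once before and once after configuring the modules.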
2018-10-21 | Taka, Mitsunari, Yusuke | Restart the Camera. |
Before powering up 400V for the first time since last Tuesday, we examined the camera visually. The camera is properly parked. There was water condensation on the camera body. The platform is not perfectly closed: there was a 2 cm gap between left and right, but it is not dangerous for us. At 11:45, we applied 400 V by switching on the breaker at the Drive container. After 15 min of stabilization, we started 24V from ECC (state ready). Then we realized that TIB and UCTS did not respond to ping. It was because dhcpd on tcs01 was dead. Also, uctsd on Osaka was dead. We restarted dhcpd and uctsd and switched the 24V off and on. Then TIB and UCTS could be booted.
| ||
2018-10-15 | Dirk | UCTSd dead. |
● uctsd.service - Execute the UCTS OPC-UA server
Loaded: loaded (/etc/systemd/system/uctsd.service; static; vendor preset: disabled)
Active: failed (Result: exit-code) since Di 2018-10-16 13:22:28 WEST; 2h 59min ago
Process: 152844 ExecStart=/home/dragon/ucm_temp/ucts_opcua_server.sh (code=exited, status=134)
Main PID: 152844 (code=exited, status=134)
Okt 16 13:22:28 osaka ucts_opcua_server.sh[152844]: (MOS) : Info : 2018-10-13.12:30:47 : Connected to Server : opc.tcp://osaka:48010
Okt 16 13:22:28 osaka ucts_opcua_server.sh[152844]:
Okt 16 13:22:28 osaka ucts_opcua_server.sh[152844]: (MOS) : Info : 2018-10-13.12:30:47 : Verification of MOS version with lappweb
Okt 16 13:22:28 osaka ucts_opcua_server.sh[152844]:
Okt 16 13:22:28 osaka ucts_opcua_server.sh[152844]: ********************
Okt 16 13:22:28 osaka ucts_opcua_server.sh[152844]: Press CTRL-C to shutdown server
Okt 16 13:22:28 osaka ucts_opcua_server.sh[152844]: /home/dragon/ucm_temp/ucts_opcua_server.sh: line 9: 152847 Aborted (core dumped) ./MOS_Device -d /MOS/plugins/Plugin_UCTS/UCTS.xml
Okt 16 13:22:28 osaka systemd[1]: uctsd.service: main process exited, code=exited, status=134/n/a
Okt 16 13:22:28 osaka systemd[1]: Unit uctsd.service entered failed state.
Okt 16 13:22:28 osaka systemd[1]: uctsd.service failed.
Restarted. | ||
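The status=134 in this output follows the usual shell convention of 128 plus the signal number: 134 - 128 = 6, which is SIGABRT on Linux, matching the "Aborted (core dumped)" line. A small sketch of that decoding (the function name is illustrative; signal numbers are platform dependent, Linux is assumed here):

```python
import signal


def describe_exit_status(status):
    """Explain a shell-style exit status (128 + N means killed by signal N)."""
    if status > 128:
        try:
            return f"killed by {signal.Signals(status - 128).name}"
        except ValueError:
            return f"exited with code {status}"  # no such signal on this platform
    return f"exited with code {status}"


print(describe_exit_status(134))
print(describe_exit_status(0))
```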
2018-10-15 | Taka | MOXA Switch connected | SLOW control connection intact. Drive network can be used from remote tomorrow. | ||
2018-10-16 | Léa, Dirk, Julien, Taka, Saiya, Mitsunari | Module disconnection |
- Twice today, after about 25 minutes, about 15 modules were no longer powered although the ECC was in state 2 and there was current in the pulse bar. At the next switch ON, all modules were powered. | ||
2018-10-16 | Léa, Dirk, Julien, Taka, Saiya, Mitsunari | UaExpert disconnection |
- Again, we lost UaExpert, which was completely stuck. To get the monitoring back, the DataLogger outputs are now written to /home/cacooperator/CoolingSystem/20181016_003 and 20181016_004 | ||
2018-10-16 | Léa, Dirk, Julien, taka, Saiya, Mitsunari | TIB issues |
- The TIB goes from state 0 to 4, but when we enable the trigger it goes to state 255, as does the alarms vector | ||
2018-10-16 | Léa, Dirk, Julien, taka, Saiya, Mitsunari | Small run summary |
- 7 GoToSafe and GoToReady for the ECC due to too high temperatures, so the modules/BPs were switched OFF/ON:
1) All modules ON, 2 BPs OFF associated with modules 10.1.6.12 and 10.1.6.27
2) All modules ON, 2 BPs OFF associated with modules 10.1.6.24 and 10.1.6.27
3) All modules ON, 1 BP OFF associated with module 10.1.7.171
4) All modules and BPs ON
5) All modules and BPs ON
6) All modules and BPs ON
7) All modules and BPs ON
8) All modules and BPs ON
9) All modules and BPs ON
10) All modules and BPs ON
11) All modules ON, 2 BPs OFF associated with modules 10.1.7.147 and 10.1.7.149
12) All modules ON, 2 BPs OFF associated with modules 10.1.7.147 and 10.1.7.149
13) All modules ON
14) All modules ON
| ||
2018-10-15 | Dirk | Charging Walkie-Talkies | with our private mini-USB adapters, while waiting for the real charger to reappear | Alternative: $8.99 on Amazon
| |
2018-10-15 | Léa, Dirk, Julien, Taka, Saiya, Mitsunari | Slow control and UaExpert disconnection |
- Slow control connection lost in ready mode, after which there was no more current in the pulse bar. After GoToSafe and GoToReady there was still no current, with a negative value shown for the pulse bar. We had to switch the 233 V and 400 V off and on.
- We lost UaExpert, which was completely stuck. To get the monitoring back, the DataLogger outputs are now written to /home/cacooperator/CoolingSystem/20181015_005 and 20181015_006 | ||
2018-10-15 | Léa, Dirk, Julien, taka, Saiya, Mitsunari | Small run summary |
- 7 GoToSafe and GoToReady for the ECC due to too high temperatures, so the modules/BPs were switched OFF/ON:
- Cut busy propagation from BP, Dragon on local clock
1) 1 module OFF: 10.1.6.28, 2 BPs OFF associated with modules 10.1.6.24 and 10.1.6.27
2) 1 module OFF: 10.1.6.28, 3 BPs OFF associated with modules 10.1.6.24, 10.1.6.27 and 10.1.7.147
3) 1 module OFF: 10.1.6.28, 1 BP OFF associated with module 10.1.7.147
4) All modules ON, 2 BPs OFF associated with modules 10.1.7.147 and 10.1.7.149
5) All modules ON and all BPs ON
- Problem of internal/external trigger clock for the Dragon fixed: for DRS4, the reference clock is now the 10 MHz external clock.
- Configuration of UCTS and TIB; the last two runs were taken with the TIB, so with external clock, external trigger and busy propagation.
6) 1 module OFF: 10.1.5.16, 3 BPs OFF associated with modules 10.1.6.27, 10.1.7.146 and 10.1.7.149
7) 1 module OFF: 10.1.6.28, 1 BP OFF associated with module 10.1.6.28
8) All modules ON
| ||
2018-10-15 | Léa, Taka, Dirk, Daniel | SLOW Control lost |
While Camera was on, the SLOW control connection was interrupted in the Drive container to prepare connection of Drive/AMC network. Consequently the EMC went to SAFE. But also the UaExpert interface was stuck (which is the current base for Camera monitoring). The setup was then restored as well as we could, including DataLogger function. | ||
2018-10-15 | Léa, Taka, Julien, Seiya, Dirk | writing speed limitation in data taking |
1 ZFW: validated 300 MB/s writing speed. 8 ZFW: validated 8 × 300 MB/s writing speed. 16 ZFW: writing speed 16 × 150 MB/s. Possibly a problem due to the disk. To be investigated | ||
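To make the writing-speed figures above easier to compare, here is a minimal sketch that totals them. It assumes the quoted numbers are per-instance rates in MB/s and that instance rates simply add up; the names `configs` and `aggregate_rate` are illustrative and not part of any camera software.

```python
# Per-instance writing speed for each number of ZFW instances, as logged above.
# Assumption: all figures are in MB/s per instance.
configs = {1: 300, 8: 300, 16: 150}  # number of ZFW instances -> MB/s per instance


def aggregate_rate(instances, per_instance_mb_s):
    """Total writing speed if the per-instance rates simply add up."""
    return instances * per_instance_mb_s


totals = {n: aggregate_rate(n, rate) for n, rate in configs.items()}

for n, total in totals.items():
    print(f"{n:2d} ZFW: {total} MB/s total")
```

Under these assumptions the total does not grow between 8 and 16 instances, which would be consistent with a limit on the disk side rather than in the writers.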
2018-10-15 | Léa, Dirk, Julien | Slow control disconnection + disconnect from OPC-UA | |||
2018-10-14 | Léa, Dirk, Julien | Small run summary | - No TIB/UCTS
- Cut busy propagation from BP, Dragon on local clock
- 7 GoToSafe and GoToReady for the ECC due to too high temperatures, so 7 switch OFF/ON of the modules/BPs:
1) All modules ON, 2 BPs OFF associated with modules 10.1.6.24 and 10.1.6.27
2) All modules ON, didn't check the BPs
3) All modules ON, didn't check the BPs
4) All modules ON, 1 BP OFF associated with module 10.1.7.147
5) All modules ON, 2 BPs OFF associated with modules 10.1.7.147 and 10.1.7.171
6) All modules ON, 1 BP OFF associated with module 10.1.7.147
7) All modules ON, 1 BP OFF associated with module 10.1.7.147
|
||
2018-10-14 | Eric, Dirk | All (DATA) fibres straight now! |
|
||
2018-10-14 | Dirk | Direct measurement of TX lasers |
INFO: Direct measurements can be done without danger for the Photom-211: measurement range -70 to +5 dBm according to the datasheet. That is 0.1 nW to 3.16 mW. |
||
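For reference, optical power in dBm converts to milliwatts as P[mW] = 10^(dBm/10). The following is a minimal sketch of that conversion, applied to the datasheet range limits quoted above; the function names are illustrative only.

```python
import math


def dbm_to_mw(dbm):
    """Convert a power level in dBm to milliwatts: P[mW] = 10 ** (dBm / 10)."""
    return 10 ** (dbm / 10)


def mw_to_dbm(mw):
    """Convert a power in milliwatts back to dBm."""
    return 10 * math.log10(mw)


# Datasheet range limits of the power meter, in dBm.
for level in (-70, 5):
    mw = dbm_to_mw(level)
    print(f"{level:+d} dBm = {mw:.3e} mW = {mw * 1e-3:.3e} W")
```

The same helper can be used to read the fibre validation values given in dBm elsewhere in this logbook.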
2018-10-14 | Léa, Dirk, Julien | First full-cam data run up to 15kHz! | That is what we would have liked to see last week.
Now it's Champagne time. :-) |
||
2018-10-14 | Julien, Eric | Fibres DATAsp1-6 tested | Optically, between DC and Cam. | ||
2018-10-13 | Léa, Dirk, Julien | Run0015 | Still no UCTS(/TIB); fibre broken between DC and Cam. Running with half-cam and two additional missing modules (BP problem): 6.24, 6.27.
- r0015 all events (at runstart), 300Hz and 10kHz, but ZFW problems (testing with 16 instances). |
||
ALL | Door knob! | Falling apart from the CC door. Urgent action needed. (Bigger screw?) | |||
Taka, Seiya, Mitsunari | Fiber Check | We checked the optical connection between DC and Camera because some labels were lost due to UV damage. We checked Data2, Data 6, SlowControl and UCTS. Only UCTS had a problem (no splicing at Drive PP). The rest were OK. | |||
Taka, Seiya, Mitsunari | Labeling fibers | We labeled the optical fibers of the data (DATA 1 - 6) at the patch panels in the Drive container and in the IT container. The spare cables have not been done yet because the ribbon ran out. | |||
Seiya, Mitsunari | Connection validation | We validated 12 optical fiber connections (No. 1-6, 13-18) from the Drive container to the IT container. Strength is -35 to -38 dBm. | |||
2018-10-12 | Léa, Taka, Julien, Seiya, Dirk | Runs0012 sqq. | No UCTS/TIB today.
These runs have 3 modules missing (as identified in the preparatory phase: 6.21, 6.25, 6.28). According to a quick check, all EventNb=TriggerNb otherwise for all runs today. See RunCatalog for details. |
||
Dirk | Creation of logbook | ||||
Dirk, Taka, Julien, Seiya, Léa | Data acquisition |
- Problem with ClusCo on tcs01: the root propagation for the BPs for the trigger doesn't work. Using exactly the same script, it works on CacoOperator.
- We validated the new connections to the data switch for 3 fibers. Eric is fixing the ones that are missing or broken. So for now only the right part of the camera is used for data acquisition.
- No TIB/UCTS.
- A few runs were taken with no external trigger from the TIB. 3 modules did not appear busy but did not send any data. In those tests the busy from the CBP was cut. Those missing modules have to be investigated in more detail, but due to many slow control disconnection problems and high temperature in the camera this was not possible. Script used in ClusCo: init7_noextTrigger_Test.uic. To not cut the busy, the script name is: init7_noextTrigger_noBusyCut_Test.uic
- One attempt with no external trigger and clock, but with the CBP delivering the clock and PPS and using the 10 MHz clock as default clock for the Dragons: the L1 local trigger was not generated. We have now gone back to a configuration with the Dragons on their local clock, but this issue has to be investigated. Script used in ClusCo: init7_noextTriggerClock_Test.uic | |||
Dirk, Taka, Julien, Seiya, Léa | Too high temperatures in the Camera |
- Limits fixed at 27 for the inside air temperature and 35 for the BP temperature. Pressure also raised some alarms.
- During the day, due to high temperatures, we had to GoToSafe almost 11 times to let the camera cool, but the BP max temperature never exceeded 34 degrees. The inside air reached 26.5 at most. | |||
Dirk, Taka, Julien, Seiya, Léa | ECC lost connection (2 times!) |
- In the afternoon, the ECC slow control communication was lost 3 times due to the interruption between IT-Container and Drive-Container. Miscommunication with the AMC people... The first time, the temperature was already high in the camera and we had to switch OFF the 233 V and 400 V for safety reasons. The two other times, we got the ECC connection back quite fast and the ECC was in the same state as when the connection was lost, i.e. state 2 (ready). There was just no more current in the pulse bar, so we had to GoToSafe and GoToReady both times. After that the current was -247 in the pulse bar... Not understood for the moment.
- The second interruption happened when the Moxa switch was reconnected, probably not correctly configured. It was disconnected again. Presently this impacts AMC and drive operation, until the Moxa can be reconnected. | |||
Léa, Taka, Julien, Seiya, Dirk | Discovered SLOW control fiber lost; fibers changed, connection recovered | Interruption. Using UCTS section for replacement. | |||
2018-10-11 | Eric, Armand | Cable splicing | UCTS fibres ready and checked. | ||
Léa, Taka, Julien, Seiya, Dirk | DATA5-upstream broken | Located between DC-PP. and IC-PP. Eric is going to have a look on Friday, when working on the other (spare) fibres. | |||
Léa, Taka, Julien, Seiya, Dirk | Found correct order of DATA1-DATA6 | We eventually found that the fibres DATA1-6 were connected to the camera in (exactly) the wrong order, which led to a mismatch of switches/modules with respect to interfaces/addresses in osaka.
This is an item for our "lessons learned": the indoor fibres had been labelled (switch-interface), but stayed in Mirca. The new fibres had been assembled at ORM, and labels had to be "guessed" in one way or the other. |
Glossary
- CC = Commissioning Container (present LST1 Control Room)
- DC = Drive Container
- IC = IT-Container