Date: Fri, 17 Oct 2003 17:36:54 -0700 (PDT)
From: Philip Charlton
To: RDS MDC List
Subject: MDC finished

Hello,

This message is to close out the RDS MDC. I'll provide a more detailed report soon, but here is a brief summary.

Data reduction at LHO and LLO went reasonably well. A few minor bugs were found in the scripts for creating RDS data, and these were fixed as we went along. Many of the fixes added more information to the logging. Since the most recent restart of the scripts (2pm Tuesday Oct 14 until today) no further unrecovered errors or bugs have been identified.

Reduction rates at the sites were better than real-time. With two parallel threads, Level 1 reduction at LHO ran at 1.7 to 2 x real-time or better for most of the run, settling down to about 2.18 x real-time after a number of jobs had succeeded. Level 2 and Level 3 were about 10 and 11 x real-time, respectively.

At LLO, there were initially some problems with the rate. For the first couple of days we were only seeing reduction rates of 0.5 x real-time. This was found to be caused by mounting the filesystem with the -directio (direct I/O) option, which makes the archive filesystem faster under most circumstances but appears to slow down the frameAPI immensely. The same was found to be the case on the CIT system, but not on the LHO system, where the flag was not set. After removing this option, LLO Level 1 reduction ran at about 3.3 x real-time, while Level 2 and Level 3 were about 8 x real-time.

At CIT, after removing the -directio flag, data reduction for LHO Level 1 was around 1.7-2 x real-time and for LLO Level 1 around 2.8-3.3 x real-time. I also tested Level 2 and Level 3 reduction at CIT and these were about 10 x real-time.

No bugs or problems were found with LDAS that would be an impediment to producing RDS.

On the LDR side of things, we had decided to use a partially updated version of the package.
In the usage of LDR for the S2 data, it had been found that the older version of the GLOBUS toolkit used by LDR at the time was the most labor-intensive component. It required a significant amount of maintenance to keep it working smoothly. The new version of LDR will use a more recent and much improved version of GLOBUS, but since it wasn't ready at the start of the MDC we decided to use the original LDR with the new GLOBUS. Scott provided us with an installation package for Hari to install on Tuesday morning (Oct 7). Unfortunately there were several minor problems with installation and configuration of the systems, which held up commencement of LDR testing until Friday. Once they were cleared up we were able to begin publishing Level 3 data to the catalogues at LHO and LLO, and Level 1 data at CIT. Level 3 data then began moving from LHO (but not LLO, see below) to CIT, and from there to UWM, MIT and PSU.

The last information I had from Steffen at AEI was that there were still LDR configuration problems preventing him from getting data. He and Scott are in communication about fixing that. Mike Foster, Keith Bayer and Scott Koranda each reported that Hanford Level 3 data had been successfully copied to their sites. Data rates were around 1 to 4 MB/sec (the combined LHO and LLO AS_Q-only data rate is 0.1875 MB/sec).

We haven't been able to copy any data from LLO to CIT because firewalling has blocked several ports required by LDR. Hari and Shannon made some progress with this problem on Monday (Oct 13) but it hasn't been completely resolved yet. Hari is currently attending the IVDGL meeting in Chicago and will finish resolving it when he returns.

Unfortunately the MDC has not met the goal of directly demonstrating that Level 3 data can be gotten to the Tier 2 centres within a few hours of being generated, and we have not gotten data from LLO to CIT or the Tier 2 centres at all, which is the pathway that most requires testing due to the bandwidth limitations to LLO.
However, the testing so far shows that all the components (RDS production, LDR publication, LDR propagation) are individually capable of propagating Level 3 data within a few hours once the final configuration issues are sorted out. Scott and Hari are coordinating to do some specific testing of this mode of operation next week during E10. We plan to continue running through E10 to iron out the last couple of problems.

In summary:

* The data reduction procedure at the sites and CIT seems to be in order and running respectably faster than real-time.

* LDR is happily communicating along LHO -> CIT -> UWM, PSU, MIT, but not with AEI.

* No showstoppers found in LDAS.

* Hanford Level 3 data was copied from CIT to UWM, PSU, MIT at a rate better than real-time. No information yet on what the total rate would be from LHO -> CIT -> Tier 2 centres, since LDR was started so late in the MDC that it spent all its time catching up.

* There is no LDR communication between LLO and CIT due to firewalling. This is the highest priority task for Hari when he returns from the IVDGL meeting.

* No showstoppers found in LDR (not counting the above problem, since that's not an LDR fault).

* Data reduction and propagation of Level 3 data will continue through E10 (starting tonight), with the goal of seeing how quickly we can get Level 3 data to the Tier 2 centres.

* Propagation of Level 3 data within hours of creation from either site has not yet been demonstrated. We will be set up to do that by Tuesday of next week.

Thanks to all who participated.

Philip