RPS MetOcean
~~~~~~~~~~~~

Hi, my name is Greg Orange, from RPS MetOcean in Perth. I'm here to tell you
all something of our experiences implementing a new storage system, with DMF
as the centrepiece. Our IT department comprises four people and we've all
been involved at various stages of the project. I've been primarily
responsible for implementation and configuration.

To give a bit of background, RPS MetOcean is a consultancy which provides
oceanographic and meteorological services in support of coastal and ocean
engineering and environmental protection. Our major business is around
physical oceanography, along with some marine and local, land-based
meteorology. For over 25 years we've been collecting, analysing, interpreting
and helping others apply metocean data.

//Our main focus is on:
//* Oceanographic Measurements;
//* Metocean Monitoring Systems;
//* Coastal and Ocean Engineering;
//* Environmental Consultancy; and
//* Data Management

Our data is one of our biggest assets, something long recognised by senior
management. This means that when we come to make strategic decisions about
data storage, we're confident that we'll have support to choose the right
technology.

our data
~~~~~~~~

Our data covers a few broad types:

typical office documents
  these don't require any special treatment, and are fairly low volume

metocean data
  stored in binary netcdf format, from both measurement and computational
  modelling programs
  measured: converted raw instrument data, initially in a variety of formats,
    from our own instruments and 3rd parties (BOM, Australian Hydrographic
    Service)
  modelled: mostly output from our HPC cluster, 100 cores

source code
  for our data analysis tools, scripts, configuration data, model parameters

aggregate data management using both FS and DB, refer Jason Lohrey

TODO: why do we offer generic shares on major filesystems?

staff expectations
rates of growth (intro, chart, note the ceiling, just /jobs!) - rough size
survey sketch below
staff are getting pretty good at informing us of incoming volumes, making it
easier to plan
permanent retention, retrieve reasonably fast
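To produce figures like the growth chart and the small-file counts mentioned
later, a stat-only walk of the tree is enough - it never reads file contents,
so under DMF it won't recall anything from tape. A minimal sketch of such a
survey (the default path and the bucket boundaries are assumptions, not our
production setup):

    #!/usr/bin/env python
    # Sketch: bucket file sizes under a tree. Run it periodically and compare
    # the totals to watch growth. Stat-only, so no file data is read.
    import os
    import sys

    ROOT = sys.argv[1] if len(sys.argv) > 1 else "/jobs"   # assumed path
    BOUNDS = [32 << 10, 256 << 10, 1 << 20, 64 << 20]      # 32k, 256k, 1M, 64M
    LABELS = ["<=32k", "<=256k", "<=1M", "<=64M", ">64M"]
    counts = [0] * len(LABELS)
    total = 0

    for dirpath, _dirs, files in os.walk(ROOT):
        for name in files:
            try:
                size = os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                continue        # vanished or unreadable; skip it
            total += size
            for i, bound in enumerate(BOUNDS):
                if size <= bound:
                    counts[i] += 1
                    break
            else:
                counts[-1] += 1

    print("%d files, %.1f GB total" % (sum(counts), total / 1e9))
    for label, n in zip(LABELS, counts):
        print("%8s %10d" % (label, n))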
previous system
~~~~~~~~~~~~~~~

(old_architecture)
combined application and fileserver
separate backups, archive and main storage
IO and CPU contention, high load, limited scaling options
EMC SAN array grew from 1TB (double our requirements) in early 2004 to 16TB
(bulging) in late 2009
still happy with its performance, but scalability and cost have become
prohibitive

choose
~~~~~~

spent a few months evaluating options
had to start by identifying what we wanted to achieve. unfortunately that was
incomplete before we spoke to potential suppliers, so the process took longer
and some suppliers who probably should have dropped out sooner, didn't
sideways scalability - more servers accessing the same storage rather than
adding entire storage units
large capacity, but not all of it fast and expensive, i.e. HSM
permanent retention, disaster and "oops" protection
no other option suited quite as well as DMF. it's still not a perfect fit,
but we don't yet know whether that's SGI or us.

(SGI_soln_diag)
The decision to purchase a system from SGI was based mostly on the
completeness, maturity and integrated nature of the solution. The other
vendors were offering either solutions made up of several disparate systems,
incomplete solutions, or solutions that didn't match our requirements.
dealing directly with people who know the gear
during this time we thought we had a reasonable handle on how DMF operated,
and looking back at various times, some of us did. but our understanding
slowly changed, sometimes for the poorer (:

buy
~~~

choosing took a lot of time and effort, but it was nearly eclipsed by getting
approval for the purchase!
this was a frustrating time, because tsunami started crashing hard
we kept telling people "it's coming" so they weren't annoyed with us, but it
still wasn't good service
lead into delivery
(pingpongClear, pingpongClutter)

site install
~~~~~~~~~~~~

some challenges with delivery due to stairs
we were very gung-ho about DIY, but I later learnt from my Dad that during
his 35 years with IBM, they did the full delivery
saved by a magnificent rope setup, of which I wish we had photos
(stairClear, stairClutter)
tight-fitting server room
(tight*.jpg)
Susheel Gokhale & Paul Templeman did all the initial install & config
very efficient and effective. the resulting layout and cabling was as neat
and functional as we want our whole room to be
we learnt a lot about the system components during the installation &
training fortnight, but there's nothing like experience, which was to come...

post-install
~~~~~~~~~~~~

TODO shrink
very soon after the install: replacement SATA PSUs
Paul Templeman came onsite and helped; simple, painless
lots of hands-on learning about all the ISSP components
DMF does _NOT_ act as a three-tiered capacity extension; SATA is for cache
while our staff are getting used to data being nearline, we want all active
data on spinning disk
we're hoping to relax that config later
forgot to configure XVM failover on the application server for a while
no real problems, just suboptimal performance and an annoying amber light
for performance reasons we've configured our RAID volumes in a far less
flexible fashion than before
no defrag on the LSI SAN, so we had to delete any RAID volumes beyond the
space we wished to recapture
data migration without impact on the ailing existing server
  star (ACL support?)
  rsync (no ACL support)
  bacula!
cxfs bug causing kernel oops
  patch released and installed within three weeks of detection
  very happy with SGI's response

configure
~~~~~~~~~

not bad, felt hard at the time

cxfs
  filesystems posed some difficulty

dmf
  we are still deciding on the best migration thresholds
  2.8m of our 3.2m files are under 256k!
  inode quotas
  excellent for our netcdf files, because the first 32k or so can be kept on
  disk and some basic nc tools only touch that
  offsite data is important to us
  hard delete period, tape juggling, Susheel's useful dmmove migration
  suggestion
  we want all data offsite permanently, safe
  our solution: 3 tape pools - one onsite, one offsite, and another offsite
  for when the first offsite pool needs to be sparsed
  Susheel's solution: 2 tape pools - one onsite, one offsite. when sparsing,
  just query the catalogue DB for all BFIDs of files with data blocks on the
  sparse tape, dmmove to delete those files from the offsite tapes, then
  dmmove them back to offsite tapes from the onsite tapes (sketched below)
  short period of only one copy, which can be mitigated with a dmmove to a
  temporary location.
  we may use this breakdown of tiers - based on existing figures, but it
  will require fine tuning
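To make Susheel's two-pool suggestion concrete, here's a rough dry-run sketch
of the resparse sequence. The helpers only print the intended action - in
reality they would wrap dmcatadm queries and dmmove runs, whose exact
arguments depend on site config - so treat it as the order of operations, not
a working tool:

    #!/usr/bin/env python
    # Dry-run sketch of resparsing an offsite tape with only two tape pools.

    def bfids_on_tape(vsn):
        # Placeholder: query the DMF catalogue for files that still hold
        # data blocks on tape 'vsn'. Dummy values returned here.
        print("query catalogue: BFIDs with data blocks on %s" % vsn)
        return ["bfid-0001", "bfid-0002"]

    def drop_offsite_copy(bfids):
        # Placeholder for the dmmove step that removes the offsite tape
        # copy, leaving the onsite copy in place.
        for b in bfids:
            print("drop offsite copy of %s" % b)

    def rewrite_offsite_copy(bfids):
        # Placeholder for the dmmove step that writes a fresh offsite copy
        # from the onsite tapes.
        for b in bfids:
            print("rewrite offsite copy of %s from onsite" % b)

    def resparse(vsn):
        victims = bfids_on_tape(vsn)
        drop_offsite_copy(victims)
        # between these two steps only the onsite copy exists; a dmmove to a
        # temporary location first would close that window
        rewrite_offsite_copy(victims)
        print("%s is now empty and can be recycled" % vsn)

    if __name__ == "__main__":
        resparse("OFF001")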
backups
  tina
  onsite and offsite incrementals
  uncertainty
  current error with the bapi_fs tunable: backs up everything ):
  backup window
  similar to Lachlan (?), tina & openvault
  edit /etc/init.d/tina.tina to remove the qc_stinit reference

changes
~~~~~~~

all space (in /jobs) is available, so we need tighter project management of
storage
we might even introduce quotas
CXFS: people can no longer walk the entire fs tree with their cron jobs
relieved to note that expansion is now an order of magnitude lower in cost,
which nicely offsets our increasingly difficult budgetary arrangements
growth is less flexible on the top tier, but trivial on the tape tier
any sort of growth is now possible; no longer will we scrabble about for
creative ideas to expand
backups are still a concern

future
~~~~~~

HPC growth
more tape capacity, to better harness DMF's strengths
train the other sysadmins in administering the system
invent a new archive system
  probably move data to a different filesystem, change permissions
expecting SGI to provide good support into the future; in fact, as I wrote
this presentation I received an email from Susheel identifying another
issue, with an accompanying fix
keep track of stats (thanks Rob M) - justify, marketing to colleagues;
sketch below
valuable to get input from you guys, mailing list
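A tiny starting point for those stats, run from cron - the mount points and
log path are assumptions, and it only records the disk-resident side; tape
usage would come from DMF's own reporting:

    #!/usr/bin/env python
    # Sketch: append daily capacity figures for the main filesystems to a
    # CSV so growth can be charted later for budgeting and justification.
    import csv
    import os
    import time

    FILESYSTEMS = ["/jobs", "/home"]           # assumed mount points
    LOGFILE = "/var/local/storage-stats.csv"   # assumed log location

    def usage(path):
        st = os.statvfs(path)
        total = st.f_blocks * st.f_frsize
        used = total - st.f_bfree * st.f_frsize
        return total, used

    with open(LOGFILE, "a") as fh:
        writer = csv.writer(fh)
        today = time.strftime("%Y-%m-%d")
        for fs in FILESYSTEMS:
            total, used = usage(fs)
            writer.writerow([today, fs, total, used])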