Optical Archiving in the Cloud: Sustaining the Data Monster
Online applications such as Facebook and Twitter, paired with smartphones, tablets and other mobile devices, have given us anywhere, anytime access to our content and content-sharing capabilities. They have also enabled wonderfully convenient solutions to help us communicate and stay connected whenever and wherever we decide. But they have simultaneously created a multitude of nightmarish issues for the IT managers who are responsible for the care and feeding of these data monsters.
Because of appealing characteristics such as low cost per GB and anytime/anywhere access, the internet is increasingly being used to store data and run applications, replacing traditional on-premises servers with private, hybrid and public cloud computing. As a result of this trend, Gartner has predicted that more than one-third of global digital content will be stored in the cloud by 2016.
Key Benefits and Problems with the Cloud
There are many solid reasons to consider implementing a cloud-based data archive, including the ability to pay only for what you need to use, rather than being forced to upgrade in large capacity jumps as you add hardware to the overall archival system. In addition, because the archive is stored and maintained outside of the organization, fewer in-house IT staff are required and other associated data warehouse costs (including real estate, electricity and A/C) are reduced. Finally, implementing cloud-based data archive services provides authorized personnel with remote access to stored data and applications, regardless of where they are working. However, there is a downside. Some key problem areas to consider before porting over all of your organization's valuable data resources to the cloud include availability, security, performance and compliance.
Outages happen, everywhere. However, the effect of a power outage on a large cloud service company can impact hundreds of businesses and thousands of people. Although Amazon Web Services has solid contingency plans and a superior staff, even their services were interrupted four times in the 14 months spanning April 2011 to June 2012, raising the question: if 100 percent cloud access isn't possible, what percentage of downtime can you afford? The June 15th outage was traced to the failure of a generator cooling fan while the facility was on emergency power following a series of failures. It's no secret that the power requirements of these behemoth facilities are already over-taxing available power supplies. And, as our data requirements grow, power sustainability (and therefore data availability) become key issues that must be addressed.
Security is another ongoing concern. When using a public cloud service, companies must balance the competing factors of control, visibility and cost. There are services that offer private resources for added data security (ensuring only your data is stored to a section of hardware rather than having hardware space shared with another client); but even these services are not completely secure. In many cases, the hardware being designated to the client is not new, and therefore not completely scrubbed clean. Often, the hardware has previously been used as physical servers, or as hosts for as many as 16 virtual servers at a time, meaning each of those servers, and each client who passed through the system, could leave sensitive data behind and available for accidental (or even malicious) recovery.
Before you think, "not my problem," what happens to "your" hardware when your data gets ported over to new hardware or you change cloud services? If Company XYZ's accounting information was left behind when you took over that private cloud server, what sensitive data are you leaving? And before you decide to simply encrypt the data you store with a cloud provider, make sure your provider can handle it. In many cases, encrypted data can cause even more problems, showing up as garbage data rather than good data, and therefore not being properly managed by the provider.
It's also critical to realistically consider your organization's performance requirements. Utilizing cloud services ultimately means handing over a lot of control, and when a performance issue arises, you won't be able to troubleshoot and find server-related root causes; you will have to wait. As is true with any network, the lack of proper resources inevitably leads to poor application performance. Meager performance is generally the result of an application architecture that does not properly distribute its processes across available cloud resources. Performance is also impacted by limited bandwidth, disk space, memory and CPU cycles, as well as latency caused by poor network connections.
Cloud compliance has its own set of concerns. Beyond privacy and industry compliance (HIPAA, FERPA, PCI-DSS, FIPS and the like), there are geographic compliance issues as well. To keep costs down, many Cloud Service Providers (CSPs) maintain their public cloud systems on international soil, where rent and other associated costs are cheaper. However, regulations that ensure privacy in one country often conflict with regulations requiring disclosure in another. Before contracts and SLAs are signed, know where your data storage and processing will occur, and ensure that you are doing everything you can to meet all of the regulations in those locations to reduce future setbacks.
Exploding Data, Imploding Budgets
The volume of data being created is doubling every two years. That's not news. However, the fact that the digital universe is expected to reach 7.9 ZB in 2015 (from 1.8 ZB in 2011) is worth consideration. And, while the explosive growth in digital information continues to demand more efficient archive strategies, reduced IT budgets are another solid reason organizations are considering cloud services. Nevertheless, while it may be possible to provide the data migration, air-conditioning and power necessary to support today's data volumes, it will not be possible in 2015 unless some dramatic changes are made.
For a little over a decade, the cost of storing data has been reviewed, compared and considered in terms of hardware and cost per gigabyte. However, as data volumes have grown, it has become increasingly obvious that the real cost isn't capacity; it's operational. Total cost of ownership (TCO) estimates must include expenses such as the electricity to run the hardware, the cost (and environmental emissions) associated with cooling the warehouse-sized rooms full of servers necessary to store today's data volumes, and the personnel costs associated with long-term storage (maintenance, management, replication, backup, hardware replacement and data migration).
While it is true that migrating to a CSP can take many of these burdens off your corporate shoulders, it doesn't solve the problem. As mentioned earlier, overheated servers (whether yours or in the cloud) will go down, taking data access with them. Current power and cooling availability are at capacity limits, but our data continues to increase. Consequently, it's time to take a long, hard look at how we use our data, and how much of that data we really need to access instantly.
When researchers discovered that YouTube viewers decide within the first 10 seconds whether or not they will watch a video in its entirety, it not only changed how marketers organized their content, it changed mobile cache settings from 30 seconds to 15, saving time, capacity and, ultimately, money. The same is true for our stored data. We think we need instant access to every byte; but the truth is, at least 80 percent of an organization's stored data will seldom (if ever) be accessed. Of course, for compliance, all data must be stored and protected, while still maintaining reasonably fast recovery; but in most cases, an organization has hours or even days to produce files from long-term data stores. This 80 percent is ideally suited for archiving to a secure, scalable, low-cost (hardware and operational) solution.
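The 80/20 split above implies a simple tiering rule: keep recently touched data on fast storage and route the rest to the archive. A minimal sketch of that rule in Python, where the 90-day hot window, file names and access timestamps are illustrative assumptions rather than figures from this article:

```python
from datetime import datetime, timedelta

# Hypothetical tiering rule: files untouched for more than 90 days
# become candidates for the low-cost archive tier.
HOT_WINDOW = timedelta(days=90)

def tier_files(files, now):
    """Split (name, last_access) pairs into hot and archive lists."""
    hot, archive = [], []
    for name, last_access in files:
        (hot if now - last_access <= HOT_WINDOW else archive).append(name)
    return hot, archive

now = datetime(2012, 11, 14)
files = [
    ("q3_report.pdf", datetime(2012, 11, 1)),   # read two weeks ago: hot
    ("2009_backup.tar", datetime(2009, 6, 2)),  # untouched for years: archive
    ("old_logs.zip", datetime(2011, 1, 15)),    # untouched for years: archive
]
hot, archive = tier_files(files, now)
```

In a real deployment the classification would run against file-system access metadata rather than a hand-built list, but the partitioning logic is the same.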
Blu-ray: A Solution that Fits the Complete Archiving Picture
Today's requirements for long-term data archives are about more than simply where to put the data. Considerations include an assessment of accessibility requirements: when the data needs to be online and searchable, and whether standard access protocols are adequate for the data being archived. Cost is another strong determinant; however, depending on the storage solution, total cost of ownership (TCO) can have a long list of line items to tally. Ideally, the solution you choose will eliminate backups/migrations and their associated costs, be easy to manage, be energy efficient and, of course, have a low TCO.
In a comparison of the top three data archive technologies (traditional hard disk drives (HDD), tape and optical (BD)) for a 100 TB archive over a projected 20-year archival plan, BD emerges as the stronger contender with fewer compromises.
Power consumption requirements per HDD system hover between 1,000 and 2,000 W plus A/C, making it easy to see why CSPs and large data centers are struggling to keep systems up and running. The power requirements of HDD are as much as 25 times those of BD, in large part because BD can be powered down when data is not in use. In addition, BD drives do not require 24x7 electricity or A/C to keep the hardware cool, as is necessary with both HDD and tape technologies (which must be constantly on and running whether data is being accessed or not). As a result, the savings in electricity over 20 years can be in the range of millions of dollars.
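To make the power comparison concrete, here is a back-of-the-envelope sketch in Python. The 1,000-2,000 W draw and the 25x ratio come from the figures above; the duty cycle and the $0.10/kWh electricity tariff are illustrative assumptions, and cooling overhead is ignored:

```python
# Rough 20-year electricity cost for one archive system, using the
# article's wattage figures. Tariff and duty cycle are assumptions.
HOURS_PER_YEAR = 24 * 365
RATE_USD_PER_KWH = 0.10  # assumed electricity tariff

def energy_cost(watts, duty_cycle, years=20):
    """Electricity cost for hardware drawing `watts` at a given duty cycle."""
    kwh = watts / 1000 * HOURS_PER_YEAR * duty_cycle * years
    return kwh * RATE_USD_PER_KWH

# HDD: always spinning; use the mid-range of 1,000-2,000 W.
hdd_cost = energy_cost(watts=1500, duty_cycle=1.0)
# BD: roughly 25x less power, and powered down except when accessed
# (5 percent duty cycle assumed here).
bd_cost = energy_cost(watts=1500 / 25, duty_cycle=0.05)
print(f"HDD: ${hdd_cost:,.0f}  BD: ${bd_cost:,.0f}")
```

Even at this toy scale the always-on requirement, not the raw wattage alone, dominates the difference; multiplied across the thousands of systems in a data center, it is easy to see how the gap reaches the millions of dollars cited above.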
Because long-term data archiving is hardly a "set it and forget it" strategy, hardware longevity is another significant consideration. Over time, media and drives fail; consequently, preventative data-loss measures include preemptive replacement of both. The more frequently media and drives have to be replaced, the higher the cost, not only in hardware but in the human resources required for integration, data management and migration, and the higher the risk of data loss during each migration. Although specifications indicate HDD and tape should be replaced every 3 and 7 years respectively, most IT specialists replace their drives every 2 and 5 years. By comparison, BD can safely archive data for 30 years or more without needing replacement and without generating mountains of technology trash. It also frees the IT staff for bigger corporate challenges than merely swapping drives.
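The replacement cycles above translate directly into forced migration counts over the 20-year plan. A quick sketch, where the cycle lengths are the conservative figures cited above and the 20-year horizon comes from the comparison earlier in the article:

```python
import math

def migrations(plan_years, replacement_cycle_years):
    """Full data migrations forced by hardware replacement over the plan."""
    return math.ceil(plan_years / replacement_cycle_years) - 1

PLAN_YEARS = 20
hdd_migrations = migrations(PLAN_YEARS, 2)    # HDD replaced every 2 years
tape_migrations = migrations(PLAN_YEARS, 5)   # tape replaced every 5 years
bd_migrations = migrations(PLAN_YEARS, 30)    # BD rated for 30+ years
print(hdd_migrations, tape_migrations, bd_migrations)  # 9 3 0
```

Each of those nine HDD migrations is a window for data loss and a staffing cost; BD's zero-migration profile is where much of its operational saving comes from.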
For many organizations and CSPs alike, moving to a RAID/Blu-ray hybrid archive is the ideal answer to harmoniously balancing data access and storage requirements with conservative budgets and environmental conscience. Not only can such a hybrid ensure fast access to younger (and more frequently accessed) archived data, it still enables organizations and CSPs to take advantage of as much as a 40 percent reduction in both power consumption and CO2 emissions over a standard RAID solution; these are important considerations for those struggling to meet capacity, access and EPA requirements.
As data mining continues to mature, offering potentially major benefits in virtually every industry, it too yields vast amounts of data. In some cases, there may not be time or opportunity to fully glean every bit of data gold; in other cases, a quantity of like data may be necessary to adequately establish a conclusion or trend. However, the economy of a RAID/Blu-ray hybrid archive means the information can be efficiently and cost-effectively retained, mined, reviewed and retained again, indefinitely, whether privately or in the cloud.
Online applications, the integration of personal mobile devices into the workplace, and compliance regulations will continue to ensure healthy growth of the data monster. However, data archive solutions that include Blu-ray can help to wean the monster off its insatiable demands on the local power grid and reverse the overall impact on the environment, for a more sustainable ecological and economic future and a greener cloud.
Yasuhiro Tai is General Manager,