How to transform Big Data into Big results

Tech Page One

Looking beyond the hype: Big Data delivers real benefits

How do you define big data?


The world of IT perennially invents annoying buzzphrases, often put together by marketing teams, leaving the IT departments to wonder whether there’s substance behind the terminology.

Big Data is one of those: it sounds almost reassuring and cosy, like a big blanket, but the comforting nature of the phrase is rather misleading – this is hard-edged business decision making at its most frenzied.

For there to be real business benefit, companies have to be convinced that Big Data is more than just hype. This hasn’t been the fastest process – there were signs in the early stages that organisations were not investing in the technology, but that has all changed.

Big Data investment

According to Gartner’s Frank Buytendijk (in a report published in October 2015), the hype is actually masking what is really going on. “While everyone is talking about big data, adoption is simply happening across every industry and business function. According to a recent Gartner survey, 64 percent of respondents are already investing in big data or have it in their plans over the next 12 to 24 months.”

With two-thirds of businesses looking to go down that route, there’s a genuine reason to think that the technology is here to stay. But it’s still unclear as to what exactly Big Data means – it’s a phrase that still causes some head-shaking.

Big Data suffers from the same handicap that cloud computing did in the early stages of its existence: the lack of a standard definition for the technology. It’s easy to think of it as another name for a large dataset, but this is altogether too glib. A more realistic definition would be the gathering and analysis of a mixed set of data, structured and unstructured, to develop a cohesive business strategy. It’s the mixture of sales figures, social media comments and YouTube clips, all combining to deliver more accurate results.

The three Vs

Commentators on Big Data use the phrase “the three Vs” to describe how it works. The Vs stand for velocity, variety and volume – in other words, the speed with which the data can be accessed and interrogated, the different types of data being examined and the size of the database. Some commentators add a fourth element, veracity, but it should be taken as given that your data is accurate: anyone dealing with inaccurate figures is going to be in trouble no matter what the data is.

For businesses, we see the benefits when the Internet of Things comes into play. The rise of connected devices is going to produce an overwhelming amount of data, a mix of structured and unstructured. This is not just an issue facing particular industries, although some will be more affected than others; it will hit any organisation that has to handle large volumes of data. The data doesn’t necessarily have to come from customers: local authorities, for example, are gathering more data on their citizens to improve the provision of local services.

Working with healthcare analytics

To take one example of how the world is about to change, let’s look at the healthcare industry and the way that connected devices have changed the way that it operates – and the implications for handling data.

One of the key problems that doctors have faced is patients’ propensity for being less than truthful when asked about topics like alcohol consumption or exercise taken – the adage has always been to double anything that a patient says about drinks consumed. The arrival of portable devices has changed all that: doctors are now able to monitor how many drinks have been taken, how many steps have been made and so on.

Not only that, these devices can be used to keep tabs on medical conditions that are not self-inflicted –  heart-rate and diabetes monitors for example.  There are also monitors that can tell whether a patient has been taking his or her medicine or not: all vital information.

The real power comes when the information from these various monitoring devices is combined with the plethora of medical tests that have been carried out as part of regular check-ups: X-rays and blood tests, for example. All these records can be combined to build a valuable picture of an individual’s health.

There’s a problem with this information, however: much of the data that has been gathered is sitting within doctors’ surgeries and there’s no easy way to share it. It’s the textbook example of how Big Data can help: data from a variety of devices, in a variety of forms, can be accessed to provide pointers for medical staff, both for research purposes and for guiding health managers on where to concentrate individual services.

One of the industries that is already making heavy use of Big Data is retail. All the major supermarket chains would love to know what their customers are buying and, more importantly, what they’re going to buy. Again, decisions are not being made solely on sales figures but by tracking social media and even by using store video cameras to observe how customers are looking at goods. By gaining more data from customers, retail organisations are able to make more informed decisions in purchasing, displaying and promoting goods. According to McKinsey, organisations using data-driven decisions are 5 percent more productive and 6 percent more profitable than competitors who don’t.

But Big Data is more than just a means to improve decision making within organisations. According to Gartner’s Buytendijk, fewer than half of Big Data projects focus on decision making; the rest help businesses to look into new processes. These include work on:

  • Marketing and sales growth
  • Operational and financial performance improvement
  • Risk and compliance management
  • New product and service innovation
  • Direct/indirect data monetisation

So, there are plenty of advantages in following the Big Data path. The main question, therefore, is how an enterprise prepares for this.

The starting point is always going to be the datacentre. The first decision is whether you’re opting for some form of cloud-based approach. Be aware that you’ll be dealing with ever greater volumes of data so your system will need to handle this. A public cloud deployment will be more scalable and able to cope with the peaks more easily, but if there are security concerns about public cloud, then a hybrid approach is probably the way to go.

There will almost certainly be a requirement to bump up enterprise storage as the volumes of data increase. Cisco estimates that the data created by IoE devices will be 269 times greater than the amount of data being transmitted to data centers from end-user devices and 49 times greater than total data center traffic by 2019. The company also predicts that the data created by these devices will increase to 507.5 zettabytes (ZB) per year by 2019, up from 134.5 ZB per year in 2014.
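As a rough check on what those Cisco figures imply for planning, the sketch below works out the overall growth multiple and the implied compound annual growth rate between 2014 and 2019. Only the two zettabyte values come from the text; the helper name and the assumption of smooth year-on-year compounding are illustrative.

```python
def compound_annual_growth_rate(start: float, end: float, years: int) -> float:
    """Return the constant yearly growth rate that takes `start` to `end`."""
    return (end / start) ** (1 / years) - 1


# Figures quoted in the text (zettabytes per year).
ZB_2014 = 134.5
ZB_2019 = 507.5
YEARS = 2019 - 2014

growth_multiple = ZB_2019 / ZB_2014
cagr = compound_annual_growth_rate(ZB_2014, ZB_2019, YEARS)

print(f"Growth multiple 2014-2019: {growth_multiple:.2f}x")
print(f"Implied compound annual growth: {cagr:.1%}")
```

On these figures, data volumes grow nearly fourfold in five years, or roughly 30 percent a year if the growth were spread evenly, which gives a sense of the headroom storage planning has to allow for.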

Accordingly, you will need to tackle the infrastructure first: you will need to increase the compute, storage and networking capability within your organisation. But that’s only part of the story in preparing for Big Data deployment. There’s also a requirement to upgrade your software to handle the change.

You will need to look at three key areas: advanced analytics, data integration and data management.

Advanced analytics

This is at the heart of Big Data deployment. By using advanced analytics software, you will be able to identify customer preferences and anticipate behaviour. There are also security implications as you will be able to use the software to spot transaction patterns that could indicate fraud.
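One simple way to picture spotting unusual transaction patterns is to flag amounts that sit far from a customer’s typical spend. The sketch below is a minimal illustration using a z-score test from Python’s standard library; the sample amounts, the threshold and the function name are all assumptions, and real fraud detection relies on far richer models than this.

```python
from statistics import mean, stdev


def flag_unusual(amounts: list[float], threshold: float = 2.0) -> list[float]:
    """Return amounts more than `threshold` sample standard deviations from the mean."""
    if len(amounts) < 2:
        return []  # not enough history to estimate spread
    mu = mean(amounts)
    sigma = stdev(amounts)
    if sigma == 0:
        return []  # every amount is identical, so nothing stands out
    return [a for a in amounts if abs(a - mu) / sigma > threshold]


# Hypothetical spending history for one customer (illustrative values only).
history = [42.0, 38.5, 51.0, 47.25, 40.0, 44.75, 39.0, 980.0]
suspicious = flag_unusual(history)
print(suspicious)
```

A flagged amount is only a prompt for a closer look, not proof of fraud; the threshold controls how many transactions end up being reviewed.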

Data integration

This is where you look to tackle the range of different sources of data and bring them together. The healthcare company that merges doctors’ notes with data from connected devices; the department store that integrates sales data with social media chat and video footage – it all counts as data to be examined. This integration should also take into account the reality that data could be pulled from on-premise and cloud deployments. Not only is there a need to handle data from a multiplicity of sources but it needs to be handled quickly – remember that velocity is one of the defining characteristics of handling Big Data. The faster information is delivered to an enterprise, the faster it can react to it.
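To make the idea of integration concrete, the sketch below joins records from two hypothetical sources, clinic notes and readings from a connected device, on a shared patient identifier. The field names, sample values and the `integrate` helper are invented for illustration and do not describe any particular product.

```python
from collections import defaultdict

# Two hypothetical sources with different shapes (illustrative data only).
clinic_notes = [
    {"patient_id": "p1", "note": "Reports two drinks a week"},
    {"patient_id": "p2", "note": "Prescribed daily medication"},
]
device_readings = [
    {"patient_id": "p1", "steps": 4200},
    {"patient_id": "p1", "steps": 6100},
    {"patient_id": "p3", "steps": 9000},
]


def integrate(notes, readings):
    """Group both sources under each patient ID, keeping patients seen in either source."""
    merged = defaultdict(lambda: {"notes": [], "steps": []})
    for row in notes:
        merged[row["patient_id"]]["notes"].append(row["note"])
    for row in readings:
        merged[row["patient_id"]]["steps"].append(row["steps"])
    return dict(merged)


combined = integrate(clinic_notes, device_readings)
for patient_id, record in sorted(combined.items()):
    print(patient_id, record)
```

Keeping patients who appear in only one source is a deliberate choice in this sketch: a gap in one feed is itself useful information when the sources are examined together.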

Data management

By definition, Big Data systems produce a lot of data and there need to be robust systems in place to store, back up and secure that data. This needs to be done in a cost-effective way: you don’t want the cost of the administration outweighing any financial benefits.

Data warehouses – there’s still life in the technology

There is a school of thought that the emergence of Big Data means that companies no longer need to maintain a data warehouse.  This is misleading: what’s more effective is to build on your existing data warehouses rather than lose the investment in existing technology.  For example, Dell and Microsoft have collaborated on building a Data Warehouse Fast Track solution making it easier for companies to leverage their existing investment in this area.

So, companies have plenty of ways to introduce Big Data analytics into their corporate environments, but the overriding question is whether such an investment is worth it.

The customer view

The use of Big Data can literally be a matter of life and death. The University of Iowa (UI) Hospitals and Clinics uses the technology to make real-time predictions about the probability of a patient developing a surgical site infection. By making informed judgments about who is susceptible to such infections, surgeons have been able to reduce infection rates by 58 percent.

While the appearance of surgical site infections sounds like a throwback to the Victorian age, it remains a serious problem, causing long-term hospitalization and sometimes, tragically, death.

The director of the division of gastrointestinal, minimally invasive and bariatric surgery, Dr. John Cromwell, predicted that a high percentage of surgical site infections were preventable by using predictive analytics.

It had the dual effect of not only saving lives but reducing costs too – the hospital calculates that post-surgery infection was already costing the healthcare industry $10 billion every year. The deployment of real-time predictive analysis has greatly reduced that cost.

“Using these tools and other methods, we’ve been able to reduce surgical site infections by about 58 percent,” says Cromwell. “That’s a revolutionary concept in gastrointestinal surgery.”

To achieve this, the medical team has had to take a new approach and look more closely at data that would not normally be considered as part of the recovery process. “We’re able to take information from electronic medical records (EMRs) and other enterprise sources, including real-time data from the operating room, to determine whether patients are likely to get a surgical site infection,” adds Cromwell. “This allows us to modify and individualize the type of care that we’re delivering in the operating room.”

This change of emphasis has entailed a change in the underlying infrastructure. The medical team needed more processing power as well as a new set of analytic tools. Jose Maria Monestina, senior application developer at UI Hospitals and Clinics, was given responsibility for implementing this technology.

This was not a straightforward task: the records ran on a variety of different systems and were held on disparate databases, so Monestina had to bring all this data into a common data set with embedded analytical tools. “This process has allowed us to deliver predictive analytics in a real-time environment to improve healthcare and reduce costs,” says Monestina.

The problem with this approach was that the hospital’s existing infrastructure could not support the increased volumes of data that the medical team was now handling, both in terms of storage and in delivering it to clinicians. The hospital therefore had to upgrade to a more powerful system, without losing the ability to work with the existing IT architecture.

Dell has delivered on that and has changed the way that the team works. It now uses an analytics platform that aggregates the data, prepares it for modeling and then deploys that model. “It’s all in one package,” explains Monestina. “You can store the data model in a server and then reuse it. You can share the data models among different persons within your research group. Part of that deployment is the power of mobility. You are not bound to a specific PC or a server. You can run those models using a mobile application or a web browser and access the results.”

Everything has now changed. The surgical team can do more than just analyze disparate data (EMRs, registry and patient satisfaction data): it can also merge that data with live patient data in the operating room to make data-driven decisions about individual treatment.

“Big data and predictive analytics are transforming outcomes at virtually every point in patient care,” says Cromwell. “We see so many other areas where this could be useful, including drug delivery, population health, managing patient flow and every other aspect of medicine that allows us to deliver high-quality healthcare.”



Maxwell Cooter

Max is a freelance journalist who has covered a wide variety of IT subjects. He was the founder editor of Cloud Pro, one of the first dedicated cloud publications. He also founded and edited IDG's Techworld and prior to that was editor of Network Week. As a freelancer, he has contributed to IDG Direct, SC Magazine, Computer Weekly, Computer Reseller News, Internet magazine, PC Business World and many others. He has also spoken at many conferences and has been a commentator for the BBC, ITN and computer TV channel CNBC.
