With Big Data comes Big Analysis

People love the idea of live dashboards. These magically deliver insights previously only available after months of analysis of stale data; suddenly the information's visual, and easy to understand even if you're not a data scientist. The reality, however, is that reporting dashboards are often basic (yet pretty) graphic representations of rather simple data. The reason is not a lack of imagination but the technical complexity involved in delivering real-time analysis of the massive number of events taking place in Big Data environments.

Just to give you an idea of the amount of data we're handling: at peak, Plexure's platform processes around 8,000 activities a second for over an hour at a time. That's a massive 43 GB of data per hour, all of which can be analyzed to personalize shopper experiences. By comparison, during the 2014 Christmas Eve rush, NZ payment network Paymark processed 155 payments per second (which really just goes to show that transaction data is a tiny part of the data you could, and should, be collecting on customers). In a nutshell: way too much data to dump in a spreadsheet.
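To put those numbers in perspective, here is a quick back-of-the-envelope calculation. It is only a sketch: it assumes the 8,000 activities a second hold steady for the full hour and that "GB" means decimal gigabytes, neither of which is spelled out above, so treat the per-event size as a rough estimate rather than a measured figure.

```python
# Figures quoted in the article
EVENTS_PER_SECOND = 8_000        # peak activities per second
GB_PER_HOUR = 43                 # data volume per hour at peak
PAYMARK_PER_SECOND = 155         # Paymark payments per second, Christmas Eve 2014

# Assumptions (not stated in the article): a constant rate, decimal gigabytes
SECONDS_PER_HOUR = 3_600
BYTES_PER_GB = 10**9

events_per_hour = EVENTS_PER_SECOND * SECONDS_PER_HOUR
bytes_per_event = GB_PER_HOUR * BYTES_PER_GB / events_per_hour
ratio_to_paymark = EVENTS_PER_SECOND / PAYMARK_PER_SECOND

print(f"Events per hour:      {events_per_hour:,}")
print(f"Bytes per event:      {bytes_per_event:,.0f}")
print(f"Multiple of Paymark:  {ratio_to_paymark:.1f}x")
```

Under those assumptions that works out to roughly 29 million events an hour at about 1.5 KB each, or around 50 times the Paymark peak rate.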

Technologies such as Hadoop use distributed computing to crunch through this volume of data, but the results are still far from real time and the processing is expensive, even on fully cloud-based implementations.

Microsoft's alternative to this brute-force approach is Azure Stream Analytics, which analyzes the inbound stream of events in real time rather than making calculations based on already stored data. This means live running totals can be continuously updated without the processing cost of retrieving millions of data points from different storage systems and then running filters and counts on that data. Microsoft Power BI can then be connected to this summarized view of the data to display the live metrics to business users.
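The difference between the two approaches can be sketched in a few lines of plain Python. This is not the Stream Analytics API or query language, just an illustration of the idea. The event fields (`ts`, `type`), the 60-second window and the sample events are all made up for the example.

```python
from collections import Counter, defaultdict

WINDOW_SECONDS = 60  # hypothetical window size for the running totals


class StreamingCounter:
    """Keeps per-window running totals, updated once per inbound event."""

    def __init__(self, window_seconds=WINDOW_SECONDS):
        self.window_seconds = window_seconds
        self.totals = defaultdict(Counter)  # window start -> counts by event type

    def ingest(self, event):
        # Constant work per event: find its window and bump one counter.
        window_start = event["ts"] - event["ts"] % self.window_seconds
        self.totals[window_start][event["type"]] += 1

    def snapshot(self, window_start):
        # What a dashboard would read: a small, already summarized view.
        return dict(self.totals.get(window_start, {}))


def batch_count(stored_events, window_start, window_seconds=WINDOW_SECONDS):
    """The brute-force alternative: rescan every stored event on each query."""
    counts = Counter()
    for event in stored_events:
        if window_start <= event["ts"] < window_start + window_seconds:
            counts[event["type"]] += 1
    return dict(counts)


# Hypothetical sample events (timestamps in seconds)
events = [
    {"ts": 5, "type": "app_open"},
    {"ts": 42, "type": "offer_view"},
    {"ts": 59, "type": "offer_view"},
    {"ts": 61, "type": "app_open"},
]

counter = StreamingCounter()
for event in events:
    counter.ingest(event)

print(counter.snapshot(0))
print(batch_count(events, 0))
```

Both approaches give the same totals, but the streaming version does a fixed amount of work per event and never re-reads stored data, whereas the batch version has to scan everything again each time the dashboard refreshes.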

This technology is not designed to replace historical analysis of stored data, but it can take over a significant amount of the heavy lifting, especially when it comes to live dashboard information. Compare the ease of understanding several thousand rows of tabulated spreadsheet data with a few (automagically generated and updated) graphs, and it's not hard to see why operations and C-level staff are so into dashboards when it comes to BA of BD for BI...