How Big Data is Transforming the Analyst
The Changing Face/Phase of an Analyst
With the advent of enabling trends such as Big Data, Digitization and Real-Time technology, the analytics landscape has undergone quite a transformation over the last couple of years. Previously, a mix of strong programming capabilities and statistical knowledge would have put someone in good stead to serve as an analyst. And a good one at that!
But times have changed.
The change in landscape has placed additional demands on the analyst. It has necessitated a change in both mindset and skillset. So what exactly does the analyst have to adapt to in this changing landscape?
To stay competitive, the Analyst needs to learn to:
- Live with Noise
- Be More Articulate
- Wear a Creative Hat
Learn to Live with Noise
In the past, as analysts, we were always taught to ensure that we had clean data before carrying out any analysis. In fact, typically about 60-70% of the effort is spent on data manipulation and cleaning activities. In the Big Data era, with so much data around, there is bound to be a lot of noise. Unfortunately, the task of data cleaning is no longer as straightforward.
For one, you have unstructured text with all sorts of nuances. Previously, the sources of text were fairly conventional, which limited the range of writing styles. Today, however, text sources are plentiful, each with its own variations. The text in an email takes a different form from that in an SMS, a Facebook entry, or a tweet for that matter. We try to communicate a lot, including moods :) with as few words or symbols as possible…LOL. All this just adds to the noise in the structure that we are trying to analyze.
Secondly, in this new era, the sources of data available for analysis are also one-to-many. Unlike the structured world of the past, where we looked for unique keys within each table to act as connectors, today's data sources do not have such distinct identifiers. Yet there is so much value in putting these data sources together.
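To make the point about text noise concrete, here is a minimal sketch that treats emoticons and shorthand as signal to be kept rather than dirt to be scrubbed. The lookup tables `EMOTICONS` and `SLANG` and the function `extract_signals` are illustrative assumptions, deliberately tiny, and not a recommended vocabulary.

```python
import re

# Assumed, deliberately tiny lookup tables; a real project would need far larger ones.
EMOTICONS = {":)": "positive", ":-)": "positive", ":(": "negative", ":-(": "negative"}
SLANG = {"lol": "laughing", "omg": "surprised"}


def extract_signals(message):
    """Return (plain words, mood tags) so emoticons and slang are kept as signal."""
    moods, words = [], []
    for token in message.split():
        # Emoticons are matched before punctuation is stripped, or they would vanish.
        if token in EMOTICONS:
            moods.append(EMOTICONS[token])
            continue
        # Drop punctuation attached to the token, then normalize case.
        word = re.sub(r"[^\w']+", "", token).lower()
        if word in SLANG:
            moods.append(SLANG[word])
        elif word:
            words.append(word)
    return words, moods


words, moods = extract_signals("Train delayed again :( LOL...")
```

The design choice worth noting is the order of operations: a conventional cleaning step that strips punctuation first would throw the mood away along with the noise.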
Let me give you an example:
Optimizing the travel commute is a key problem facing many transportation service providers. It is not unusual for someone to use multiple modes of public transport for his commute. A person could start off from home on a bus, and then switch to a tube, which takes him to his work. Besides the travel time on each of these rides, the waiting time for the bus and the train also adds to his total commute time. From a data point of view, there is a whole set of data on the tube travel and a considerable amount of information on the bus travel. But the question is: how do we bring these data sources together to plan the bus/tube schedules so that we optimize overall commute time?
Frankly, it is not going to be straightforward. We might not be able to link the tube data and the bus data by way of unique identifiers. However, we can all appreciate that there is a lot of value in trying to connect these data sources in whatever fuzzy way we can to make some sense out of them.
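One fuzzy way of connecting the two sources can be sketched as follows, assuming we pair a tube entry with the closest earlier bus alighting at a nearby stop. The records, the `nearby_station` lookup, the 10-minute window and the function `link_transfers` are all hypothetical and chosen only for illustration; this is not how any particular operator links its data.

```python
from datetime import datetime, timedelta

# Hypothetical records: neither feed shares a passenger ID with the other.
bus_alightings = [
    {"stop": "High St", "time": datetime(2024, 1, 8, 8, 2)},
    {"stop": "High St", "time": datetime(2024, 1, 8, 8, 20)},
    {"stop": "Park Rd", "time": datetime(2024, 1, 8, 8, 5)},
]
tube_entries = [
    {"station": "Central", "time": datetime(2024, 1, 8, 8, 6)},
    {"station": "Central", "time": datetime(2024, 1, 8, 8, 45)},
]
# Assumed lookup: which bus stops are within walking distance of which station.
nearby_station = {"High St": "Central"}


def link_transfers(alightings, entries, stop_to_station, max_gap=timedelta(minutes=10)):
    """Pair each tube entry with the closest earlier bus alighting at a nearby stop."""
    links = []
    used = set()  # each alighting can explain at most one tube entry
    for entry in sorted(entries, key=lambda e: e["time"]):
        best = None
        for i, alighting in enumerate(alightings):
            if i in used or stop_to_station.get(alighting["stop"]) != entry["station"]:
                continue
            gap = entry["time"] - alighting["time"]
            # Only accept alightings shortly before the entry; keep the smallest gap.
            if timedelta(0) <= gap <= max_gap and (best is None or gap < best[1]):
                best = (i, gap)
        if best is not None:
            used.add(best[0])
            links.append((alightings[best[0]], entry, best[1]))
    return links


links = link_transfers(bus_alightings, tube_entries, nearby_station)
```

Some of these links will be wrong, and some genuine transfers will be missed. The estimated transfer waits are noisy as a result, but in aggregate they may still be good enough to inform schedule planning.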
Hence the new age analyst needs to appreciate the fact that noise is pretty much an inherent part of his analytical work ahead. We can still garner a lot of actionable information without being crippled by spending an enormous amount of time to arrive at a perfect data set. Noise can be bliss after all. :)
Be More Articulate
The art of being able to tell a story out of numbers is today recognized as one of the key skill sets of an analyst. Unless an analyst is able to string together a take-away message from his analysis, it is difficult to impress upon the end user the value of his work. So what is going to change now? The analyst has to come up with better stories!
As mentioned above, the problems that can be solved with Big Data are not as straightforward as before. With more moving parts and more complicated problems, greater skill is required to digest the output and share results in a simple manner for others to appreciate. This is not easy, especially when it comes to justifying the ‘living with the noise’ that one can come to expect.
Further, embarking on a complex project may require more convincing since there is a greater risk of failure and more unknowns. Success requires an analyst to be more articulate.
Wear a Creative Hat
The analyst needs to learn to be more creative, more so now than ever before.
Previously, the types of problems being solved were pretty much determined by the availability of data sources internal to the organization. However, with digitization and big data capabilities, external data sources have become abundant and varied as well. Information on social media is but a simple example. Though data is out there in abundance, piecing it together to support a business case is not trivial. Take the transportation example: all the data seems to be there, but piecing this information together to derive something meaningful requires some ingenuity.
The advent of real-time technologies has definitely added demands on the analyst to be more creative as well. Algorithms that take time to run in the background before churning out a recommendation cannot be used effectively in a real-time setting. Take, for example, market-basket analysis, which looks at the co-occurrences of items, and its application to a shopper in a supermarket. Imagine a situation where a recommendation is given to a customer based on the combination of items going into his ‘basket’ in real time. Technically speaking, tracking the ins and outs of a shopping cart is not impossible. However, algorithmically mining the combination of items going into his cart to recommend the next couple of items still requires time. Complicated algorithms will work but won’t be practical. Under these circumstances, tweaking algorithms so that the solution ‘works’, not necessarily in the most optimal of ways but rather in the most practical of ways, requires creativity.
Even as the demands on an analyst become a notch higher, I feel that this is the Golden Era for the analyst. The advent of the various enabling trends (Big Data, Digitization and Real-Time technology) has placed the analyst in a uniquely advantageous position. His background, which allows for an appreciation of both technology and techniques, puts him in good stead to not only appreciate but also contribute to business solutions. As such, I see this moment as a wonderful opportunity to seize and grow the analytics function.
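One practical tweak of this kind can be sketched as follows, assuming we are willing to trade full rule mining for simple pairwise co-occurrence counts: the counting is done offline, and the real-time step is reduced to a cheap lookup. The sample baskets and the functions `build_pair_counts` and `recommend` are hypothetical illustrations, not a standard market-basket implementation.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical past baskets; in practice these would come from transaction history.
past_baskets = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"milk", "cereal"},
    {"butter", "jam"},
]


def build_pair_counts(baskets):
    """Offline step: count how often each pair of items was bought together."""
    pair_counts = defaultdict(Counter)
    for basket in baskets:
        for a, b in combinations(sorted(basket), 2):
            pair_counts[a][b] += 1
            pair_counts[b][a] += 1
    return pair_counts


def recommend(cart, pair_counts, k=2):
    """Online step: add up precomputed counts for the cart; no mining at request time."""
    scores = Counter()
    for item in cart:
        for other, count in pair_counts.get(item, {}).items():
            if other not in cart:  # never suggest what is already in the cart
                scores[other] += count
    return [item for item, _ in scores.most_common(k)]


pair_counts = build_pair_counts(past_baskets)
suggestions = recommend({"bread", "butter"}, pair_counts)
```

The lookup ignores higher-order combinations, so it is certainly not the most optimal recommender, but the work done per cart update is small enough to keep pace with a shopper.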