Risk Analysis and Legacy Data: How to make your Big Data Defensible

Posted on March 27, 2014


Every company and government organisation has defined goals, targets and strategies for its success or growth, with established timelines for each proposed deliverable. But along the way, all kinds of adverse events can take place that may jeopardize these strategies and prevent the achievement of these goals and targets. We cannot prevent or foresee all possible risks, and so risk assessment has evolved as a discipline to determine where the biggest and most threatening risks exist: those with the largest potential impact on, and damage to, the organisation.
Wikipedia defines risk assessment as “the determination of quantitative or qualitative value of risk related to a concrete situation and a recognized threat (also called hazard). Quantitative risk assessment requires calculations of two components of risk: the magnitude of the potential loss, and the probability that the loss will occur.” Based on this definition, there are even mathematical models to calculate the actual probabilities, magnitude and loss of calamities and disastrous events.
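As a minimal sketch of that quantitative definition, the two components can be multiplied into an expected loss and used to rank risks. The scenarios and figures below are entirely hypothetical, chosen only to illustrate the calculation:

```python
# Quantitative risk assessment sketch: expected loss is the product of
# the probability that a loss occurs and the magnitude of that loss.
# All scenario names and figures below are hypothetical.

def expected_loss(probability: float, magnitude: float) -> float:
    """Risk score: probability of the loss times its magnitude."""
    return probability * magnitude

# Hypothetical scenarios: (name, annual probability, potential loss in EUR)
scenarios = [
    ("Regulatory fine", 0.05, 2_000_000),
    ("Data breach", 0.10, 500_000),
    ("IP theft", 0.02, 5_000_000),
]

# Rank the risks by expected loss, largest first
for name, p, loss in sorted(
    scenarios, key=lambda s: expected_loss(s[1], s[2]), reverse=True
):
    print(f"{name}: expected annual loss EUR {expected_loss(p, loss):,.0f}")
```

Even this toy ranking shows why the definition matters: a low-probability event (IP theft at 2%) can still carry the highest expected loss.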
But there are also many cases where this is not possible, for instance in medical applications or in the case of terrorist attacks or nuclear disasters, where it is impossible to measure the total impact and the consequent financial damage.
Now, where do the consequences of eDiscovery and regulatory investigations fit on this scale? What are our risks today? Where can we find them? And, finally, how can we assess our risks and prevent them?
For today’s companies, risks lie in regulatory investigations and subsequent civil litigation related to export control, fraud, competition violations, bribery, privacy and data protection errors, human rights, employee treatment, environmental damages, information security, intellectual property theft, and so on.
Many of these risks are captured or hidden in all the electronic data we accumulate. Thirty years ago, all our communication was volatile and hard to capture (paper letters, paper financial documentation, analogue phone conversations, paper travel and payment trails, and so on). This has all changed: these days almost everything we do leaves some form of electronic evidence somewhere. We key in everything we communicate, think, feel, believe and expect, and store it as electronic information. We share large parts of this information with others by email and social networks, and they may store copies of our information in their personal electronic archives as well. And we keep all this information forever, as storage costs get cheaper every year.
Then the regulator comes in … or you are suddenly involved in civil litigation, and all your thoughts, intentions, objectives and conversations from the last 20 years are suddenly subject to search, review and, in the worst case, public disclosure. In addition, all electronic data under your custody will become part of the investigation or litigation. The more data you have, the higher the cost. This is what we call the Dark Side of your Big Data!
For this reason, risk analysis programs should also include an assessment of the quantity and broad content of legacy data. With today’s computer technology it is very easy to analyze legacy data and search for keywords or linguistic patterns related to all kinds of risk, regardless of file format, content type (text, audio, image, or video), location, or language.
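To make the keyword-search idea concrete, here is a minimal sketch of scanning a folder of legacy text files for risk-related terms. The keyword list, file extension, and folder layout are illustrative assumptions, not a reference to any specific eDiscovery product; real tools would also handle audio, image, and video content, multiple languages, and richer linguistic patterns:

```python
# Hedged sketch: count hits for risk-related keywords across a legacy
# text archive. Keywords and the *.txt assumption are illustrative only.
import re
from collections import Counter
from pathlib import Path

# Hypothetical risk-related terms an assessment might look for
RISK_KEYWORDS = ["bribe", "kickback", "confidential", "cartel"]

def scan_file(path: Path, keywords=RISK_KEYWORDS) -> Counter:
    """Count case-insensitive keyword occurrences in one text file."""
    text = path.read_text(errors="ignore").lower()
    return Counter(
        {kw: len(re.findall(re.escape(kw), text)) for kw in keywords}
    )

def scan_folder(folder: Path) -> Counter:
    """Aggregate keyword hits across all .txt files in a legacy archive."""
    totals = Counter()
    for path in folder.rglob("*.txt"):
        totals.update(scan_file(path))
    return totals
```

A report of the aggregated counts per archive is one simple way to flag where in the legacy data the biggest concentrations of potentially risky content sit.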
Ask yourself if you really need to keep all this old data: are there regulatory requirements to keep it all? If not, ask yourself if you will ever re-use the knowledge captured in the data: can your organization benefit in the long term from keeping it, or can it only harm you?
This is what we call Intelligent Information Governance. Apply the principles so you and your organization can make your big data defensible!