Automatic Fraud Triangle Analytics made possible with Text-Mining and Content Analytics

Posted on January 21, 2014


Economic crimes such as corruption and fraud are difficult to detect and prevent. Still, the financial and reputational consequences, together with the growing public and political demand for harsh action against corporations whose employees break the law, are forcing companies to review their security and compliance policies to limit the extent to which fraud can take place.
In many companies, fraud is more often detected through anonymous tips or by accident than through proactive internal audits. One of the challenges is the complexity of “Big Data” (see next chapter) and the fact that almost 80% of enterprise content today is unstructured and therefore seems hard to examine.
In his Fraud Triangle, criminologist Dr. Donald R. Cressey identifies three components that are present where fraud exists: (1) incentive or pressure, (2) opportunity, and (3) rationalization. Together these form the three angles of the so-called fraud triangle. Breaking this Fraud Triangle is the key to fraud deterrence: if an organization removes one of the elements in the triangle, the likelihood of fraudulent activities is greatly reduced.
Most anti-fraud activities focus only on structured data, mostly financial administration and ERP systems. However, structured data makes up only 10% to 20% of all data. The components that lead to the identification of the angles of the triangle are often hidden in the vast volume of unstructured data: e-mails, user documents, presentations, and web content.
Fraud Triangle Analytics (FTA) supports corporate security officers and internal auditors in internal investigations of their Big Data collections, so that fraud can be prevented and detected as early as possible. Big Data can be researched, extracted, and presented in a transparent structure, so the results of an investigation can be used to automatically detect potentially fraudulent activities and prevent them in the future. In addition, with the right technology, one can also deal properly with confidential, private, or privileged information and with data protection concerns.
By combining Fraud Triangle Analytics with text-mining and content analytics, indications and evidence of incentive, pressure, opportunity, and rationalization can be detected not only with keywords, but also by looking for specific lexical, syntactic, and semantic patterns that indicate possible fraudulent activities. Modern eDiscovery platforms such as ZyLAB’s can recognize these patterns in almost any kind of data, regardless of file format, location, or language, and store the results in a database.
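To make the keyword part of this idea concrete, here is a minimal Python sketch that counts phrase hits per angle of the fraud triangle. It is not how ZyLAB's platform works internally. The term lists, `FRAUD_TRIANGLE_TERMS`, and `score_text` are hypothetical names invented for illustration, and a real lexicon would be curated by investigators. It covers simple keyword matching only, not the syntactic or semantic patterns mentioned above.

```python
import re

# Hypothetical keyword lists, for illustration only. They are not taken
# from any vendor lexicon; a real one would be curated by investigators.
FRAUD_TRIANGLE_TERMS = {
    "pressure": ["deadline", "make the number", "quota", "under pressure"],
    "opportunity": ["override", "off the books", "nobody will notice"],
    "rationalization": ["deserve", "everyone does it", "pay it back"],
}


def compile_patterns(terms_by_angle):
    """Build one case-insensitive, whole-phrase regex per angle."""
    patterns = {}
    for angle, terms in terms_by_angle.items():
        alternation = "|".join(re.escape(term) for term in terms)
        patterns[angle] = re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)
    return patterns


PATTERNS = compile_patterns(FRAUD_TRIANGLE_TERMS)


def score_text(text):
    """Return the number of keyword hits for each angle of the triangle."""
    return {angle: len(pattern.findall(text)) for angle, pattern in PATTERNS.items()}


sample = (
    "We are under pressure to make the number. "
    "Nobody will notice, and I will pay it back."
)
print(score_text(sample))
# {'pressure': 2, 'opportunity': 1, 'rationalization': 1}
```

Scores like these could then be stored per document, together with metadata such as sender and date, for later analysis.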
By using simple analytics and reporting tools such as MS Excel on top of this database, the three dimensions of the fraud triangle can be analyzed easily in relation to time, custodians, projects, locations, and other metadata selections.
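The same kind of pivot can be sketched in a few lines of Python instead of a spreadsheet. The example below assumes a hypothetical record layout of (custodian, document date, angle, hit count); the records themselves are made-up sample data, and `pivot_by_custodian_month` is an illustrative name rather than part of any product.

```python
from collections import defaultdict
from datetime import date

# Hypothetical hit records: (custodian, document date, angle, hit count).
hits = [
    ("alice", date(2013, 11, 4), "pressure", 3),
    ("alice", date(2013, 11, 20), "rationalization", 1),
    ("alice", date(2013, 12, 2), "opportunity", 2),
    ("bob", date(2013, 11, 7), "pressure", 1),
]


def pivot_by_custodian_month(records):
    """Sum hit counts per angle for every (custodian, year-month) pair."""
    table = defaultdict(
        lambda: {"pressure": 0, "opportunity": 0, "rationalization": 0}
    )
    for custodian, day, angle, count in records:
        key = (custodian, day.strftime("%Y-%m"))
        table[key][angle] += count
    return dict(table)


summary = pivot_by_custodian_month(hits)
for (custodian, month), counts in sorted(summary.items()):
    print(custodian, month, counts)
```

Grouping by project or location instead would only require a different key in the pivot.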
Depending on certain user-defined thresholds, alerts can then be triggered automatically to internal investigators. They can examine the identified facts, validate the potential evidence and indications of (upcoming) fraud found in the unstructured data, and take (pre-emptive) action if needed.
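One possible alert rule, sketched in Python under stated assumptions: a custodian is flagged for review only when all three angles of the triangle reach their thresholds at the same time. The threshold values, the sample scores, and the name `custodians_to_review` are hypothetical, and other rules (for example a weighted total) would be just as plausible.

```python
# Hypothetical per-angle thresholds; in practice these are user-defined.
THRESHOLDS = {"pressure": 2, "opportunity": 1, "rationalization": 1}


def custodians_to_review(scores_by_custodian, thresholds=THRESHOLDS):
    """Return custodians whose score meets the threshold on every angle."""
    flagged = []
    for custodian, scores in scores_by_custodian.items():
        if all(
            scores.get(angle, 0) >= minimum
            for angle, minimum in thresholds.items()
        ):
            flagged.append(custodian)
    return sorted(flagged)


# Made-up scores, for illustration only.
scores = {
    "alice": {"pressure": 3, "opportunity": 2, "rationalization": 1},
    "bob": {"pressure": 1, "opportunity": 0, "rationalization": 0},
    "carol": {"pressure": 5, "opportunity": 0, "rationalization": 2},
}
print(custodians_to_review(scores))
# ['alice']
```

Requiring all three angles mirrors the idea that fraud exists where all components of the triangle are present. A flag here is only a prompt for a human investigator, not a finding.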
A comprehensive white paper with more details on how to use text-mining to implement fraud triangle analytics can be downloaded from the ZyLAB website.