Online Exclusive

Predicting bad behavior

How advanced analytics can help to manage third-party fraud risk

“Mr. Marks, by mandate of the District of Columbia Precrime Division, I'm placing you under arrest for the future murder of Sarah Marks and Donald Dubin that was to take place today, April 22 at 0800 hours and four minutes.”

This quote is from the 2002 film “Minority Report.” The film envisions a world in 2054, where a specialized police division, “Precrime,” apprehends criminals based on the foreknowledge of three psychics called “precogs.”

Though fraud-risk teams worldwide don’t have psychics on staff (that we’re aware of), they can rely on predictive analytics to ascertain the future and assess the probability that third parties will commit fraud — or already have committed undiscovered frauds.

Organizations increasingly outsource many business activities (customer service, audit, security, facilities management, IT), but they can’t transfer their responsibilities. Regulators hold corporations responsible not only for their own actions but also for those of their vendors and suppliers. Look at Capital One, Discover Card and American Express. Together they’ve paid a total of $525 million to “affected customers” to settle complaints of deceptive selling and predatory behavior by their third-party suppliers. (See Managing third-party risk in a changing regulatory environment, McKinsey & Company, by Dmitry Krivin, Hamid Samandari, John Walsh and Emily Yueh, May 2013. Also see the Consumer Financial Protection Bureau website.) Therefore, we must find ways to assess risk without full access to the internal financial information of third parties (e.g., business partners).

The past is the future, the future is the past

Predictive analytics (PA) is the branch of advanced analytics that organizations use to make predictions about unknown future events. PA includes many techniques from data mining, statistics, modeling, machine learning and artificial intelligence to analyze current data to make predictions about the future. We can use patterns in historical and transactional data to identify risks and opportunities. PA models capture relationships among many factors to assess risk and assign scores to potential future situations. Organizations can apply PA to interpret big data for their benefit.

PA allows organizations to anticipate outcomes and behaviors based on data — not on hunches or assumptions. One of the most robust PA techniques is an artificial neural network (ANN).

This is your brain. This is your brain on a neural network

An ANN is an information-processing system that mimics the ways biological nervous systems, such as the brain, process information. It comprises several connected processing elements (i.e., neurons) working together to solve specific problems. ANNs, like people, learn by example and feedback, i.e., they learn through training.

Consider how you learn to shoot a free throw in basketball or putt a golf ball. You take a shot, see the result, and then adjust the importance you assigned to selected variables (speed, direction, line) and try again. In other words, you adjust the weights (importance) of the factors affecting your success. Eventually, your performance improves, although few of us become experts.

Similarly, an ANN has a training phase. A researcher or an analyst provides a data set that includes the desired solutions. The ANN then runs and produces a solution, which it checks against the desired output to calculate the error, i.e., the difference between its solution and the desired solution. Next, the ANN goes back, adjusts the weights of each input and tries again. This cycle repeats until its output is acceptably close to the desired output. This process is called “back propagation,” a fancy way of saying that it works backwards from its solution and adjusts until it gets it right. The system is then considered to be “trained.”
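For readers who want to see the training loop in miniature, here is a minimal sketch in plain Python. It isn't the authors' model: the tiny network size, the learning rate, the number of passes and the four-row toy data set are all assumptions chosen for illustration. It only shows the cycle described above: produce an answer, measure the error, work backwards and adjust the weights.

```python
import math
import random


def sigmoid(x):
    """Squash any number into the range 0..1."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, w_hidden, w_out):
    """Run one example through the network; return hidden values and output."""
    x = inputs + [1.0]  # the constant 1.0 acts as a bias input
    hidden = [sigmoid(sum(w * v for w, v in zip(ws, x))) for ws in w_hidden]
    output = sigmoid(sum(w * v for w, v in zip(w_out, hidden + [1.0])))
    return hidden, output


def mean_squared_error(data, w_hidden, w_out):
    """Average squared gap between the network's answers and the desired ones."""
    return sum((forward(x, w_hidden, w_out)[1] - y) ** 2 for x, y in data) / len(data)


def train(data, n_hidden=3, epochs=2000, rate=0.5, seed=0):
    """Train a one-hidden-layer network with back propagation."""
    rng = random.Random(seed)
    n_inputs = len(data[0][0])
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_inputs + 1)]
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    errors = []
    for _ in range(epochs):
        errors.append(mean_squared_error(data, w_hidden, w_out))
        for x, target in data:
            hidden, output = forward(x, w_hidden, w_out)
            # Error at the output, scaled by the slope of the sigmoid.
            delta_out = (output - target) * output * (1.0 - output)
            # Work backwards: assign each hidden neuron its share of the error.
            delta_hidden = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                            for j in range(n_hidden)]
            # Nudge every weight in the direction that reduces the error.
            for j, v in enumerate(hidden + [1.0]):
                w_out[j] -= rate * delta_out * v
            for j in range(n_hidden):
                for i, v in enumerate(x + [1.0]):
                    w_hidden[j][i] -= rate * delta_hidden[j] * v
    errors.append(mean_squared_error(data, w_hidden, w_out))
    return w_hidden, w_out, errors


# Hypothetical toy data: two inputs and the desired answer for each example.
toy_data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
            ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]

w_hidden, w_out, errors = train(toy_data)
print(f"error before training: {errors[0]:.4f}, after: {errors[-1]:.4f}")
```

In practice, training stops when the error is small enough or stops improving, not when the output matches the desired answers exactly.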

Comparing humans and computers, the computer has two advantages:

  1. It can run millions of examples in the time it takes a human to run one.
  2. Once it learns, it can execute exactly the same way every time. It’ll never make a bad execution because it was distracted, tired, injured, etc.

An ANN is configured for a specific application, such as pattern recognition or data classification, through a learning process. We can use ANNs to extract patterns and detect trends that are too complex to be noticed by either humans or other computer techniques. A trained neural network essentially becomes an expert in whatever it’s trying to analyze. This expert can then provide projections when we give it new data.

In a simple ANN, a researcher or an analyst supplies input variables to the neural network. (See the diagram of a simple neural network.) Then, an algorithm builds non-linear functions estimating associations among all variables. There might be multiple associations (also termed activation functions). Finally, the ANN uses those associations to predict a desired variable (also termed the output).

The beauty of an ANN is its ability to incorporate virtually all possible associations and interactions among input variables to make predictions more accurate. The drawback is the difficulty of cause-effect explanation (i.e., showing how any particular input variable affects the prediction). Also, an ANN requires a significant amount of input data, i.e., a large sample size.

A simple neural network

JMP’ing and mining to conclusions

Companies committing financial fraud have some commonalities. Previous research revealed that an investigator can detect fraud by analyzing a company’s financial statements: balance sheets, income statements and cash-flow statements. (See The Detection of Earnings Manipulation, Financial Analysts Journal, by Messod Beneish, September/October 1999, Vol. 55, Issue 5, and Detecting and Predicting Financial Statement Fraud: The Effectiveness of The Fraud Triangle and SAS No. 99, by Christopher Skousen, Kevin Smith and Charlotte Wright, Nov. 6, 2008.)

You can use various items and ratios you’ve obtained from financial statements (e.g., accounts receivable and amortization) as input variables. All financial data for publicly traded companies are available through free services (e.g., the SEC website). You also can use standard statistical software (e.g., SAS JMP or SAS Enterprise Miner) as primary tools to implement neural networks for third-party fraud prediction.

Standard testing techniques can measure a model’s quality. You separate historical financial data from companies both with and without known fraud into two parts. You use the first part to train the model, i.e., to find associations between financial variables that distinguish fraudulent from non-fraudulent companies. You use the second part to test the model, i.e., to apply the associations found in the previous step and generate a fraud score (usually between zero and 100). Companies with a score exceeding some predetermined benchmark (e.g., more than 50) are categorized as fraudulent.
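The split-and-score step can be sketched in a few lines of Python. Everything here is hypothetical: the company names, the 70/30 split, the fixed random seed and the example scores are assumptions for illustration, and the scores stand in for whatever a trained model would produce. Only the benchmark of 50 comes from the example above.

```python
import random


def split_train_test(companies, test_fraction=0.3, seed=42):
    """Shuffle the labeled companies and split them into two parts."""
    shuffled = list(companies)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def categorize(score, benchmark=50):
    """Label a company as fraudulent when its score exceeds the benchmark."""
    return "fraudulent" if score > benchmark else "non-fraudulent"


# Hypothetical labeled history: (company name, known fraud?)
history = [(f"Company {i}", i % 4 == 0) for i in range(20)]
train_set, test_set = split_train_test(history)
print(len(train_set), "companies to train on,", len(test_set), "to test on")

# Hypothetical fraud scores (0 to 100) that a trained model might assign.
example_scores = {"Vendor A": 82.0, "Vendor B": 50.0, "Vendor C": 17.5}
labels = {name: categorize(score) for name, score in example_scores.items()}
print(labels)
```

Note that a score of exactly 50 isn't flagged here because the rule is “more than 50.” Where to set the benchmark is a judgment call that trades missed frauds against false alarms.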

Finally, you compare the companies categorized as fraudulent against real historical data. The percentage of correctly predicted fraudulent companies gives a measure of quality for the model. To date, using ANN, we’ve been able to correctly categorize fraudulent companies with 85 percent accuracy.
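As a rough illustration of that comparison, the sketch below computes the share of known fraud cases that a model flagged. The exact formula behind the figure above isn't spelled out here, so treat this reading as an assumption. The five test-set results are made up.

```python
def share_of_fraud_caught(predicted, actual):
    """Percentage of known fraud cases that the model categorized as fraudulent."""
    flags_for_known_fraud = [p for p, a in zip(predicted, actual) if a]
    if not flags_for_known_fraud:
        raise ValueError("the test set contains no known fraud cases")
    return 100.0 * sum(flags_for_known_fraud) / len(flags_for_known_fraud)


def false_alarms(predicted, actual):
    """Number of companies flagged as fraudulent that have no known fraud."""
    return sum(1 for p, a in zip(predicted, actual) if p and not a)


# Hypothetical test-set results, one entry per company.
predicted = [True, True, False, True, False]   # what the model said
actual = [True, False, False, True, True]      # what history shows

print(f"known fraud caught: {share_of_fraud_caught(predicted, actual):.1f}%")
print(f"false alarms: {false_alarms(predicted, actual)}")
```

It's worth tracking false alarms alongside the share of fraud caught, because a model that flags every company would catch all the fraud and still be useless.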

Besides financial data, you can use other types of data as input variables. For example, the data may be unstructured (e.g., text) as found on websites and social media. You then apply text analytics and text mining algorithms to search for patterns or keywords specific to known fraud cases. After the ANN evaluates the text, it generates a text data score, which you include in the model as another input variable.
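To make the idea of a text data score concrete, here is a deliberately simple sketch. It swaps the ANN for plain keyword matching, and both the keyword list and the scoring rule (the share of listed keywords that appear in the text) are hypothetical. A real text-mining model would be far richer.

```python
import re

# Hypothetical keywords; a real list would come from known fraud cases.
FRAUD_KEYWORDS = {"kickback", "subpoena", "bribery", "restatement"}


def text_score(text, keywords=FRAUD_KEYWORDS):
    """Score text from 0 to 100 by the share of listed keywords it mentions."""
    if not keywords:
        return 0.0
    words = set(re.findall(r"[a-z]+", text.lower()))
    return 100.0 * len(keywords & words) / len(keywords)


post = "The supplier faces a subpoena over an alleged kickback."
print(text_score(post))
```

The resulting number can sit alongside the financial ratios as one more input variable.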

The puzzle has more than one piece

In addition to ANNs, you can use traditional measures such as M-Score and F-Score and analytic methods such as decision trees to evaluate a third party’s fraud risk. (See Value Investing: The Use of Historical Financial Statement Information to Separate Winners from Losers, The University of Chicago Graduate School of Business Selected Paper 84, by Joseph Piotroski, 2000.) And, as with all analytic models, professionals with domain expertise should evaluate the output: in this case, individuals with experience in fraud examinations and prevention as well as those with knowledge of, and relationships with, the third parties in question.
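For reference, the M-Score usually refers to the model in the Beneish paper cited earlier. The sketch below uses the coefficients of the commonly cited eight-variable version. The cut-off of -1.78 is a widely quoted rule of thumb, not a figure from this article, so treat it as an assumption. The example inputs are made up.

```python
def beneish_m_score(dsri, gmi, aqi, sgi, depi, sgai, tata, lvgi):
    """Eight-variable Beneish M-Score.

    dsri: days' sales in receivables index   gmi:  gross margin index
    aqi:  asset quality index                sgi:  sales growth index
    depi: depreciation index                 sgai: SG&A expense index
    tata: total accruals to total assets     lvgi: leverage index
    """
    return (-4.84 + 0.920 * dsri + 0.528 * gmi + 0.404 * aqi + 0.892 * sgi
            + 0.115 * depi - 0.172 * sgai + 4.679 * tata - 0.327 * lvgi)


# Assumed rule of thumb: scores above this value suggest possible manipulation.
M_SCORE_CUTOFF = -1.78


def looks_like_manipulator(m_score, cutoff=M_SCORE_CUTOFF):
    return m_score > cutoff


# Hypothetical company with no year-over-year change and zero accruals.
steady = beneish_m_score(1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0)
print(round(steady, 2), looks_like_manipulator(steady))
```

Each index compares the current year with the prior year, so a value of 1.0 means no change. A score above the cut-off is a prompt for closer examination, not proof of fraud.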

Also, when you deal with adversarial, low-volume, preventable (ALVP) events such as cyberattacks, you must regularly reevaluate and refresh analytical models because the bad guys are constantly adapting to circumvent detection. With ALVP events, the existence of detection systems alters the nature of the risks.

According to the ACFE’s 2018 Report to the Nations, organizations lose approximately 5 percent of their revenues to fraud. As such, they have a fiduciary responsibility to use all available, economically justified fraud prevention techniques. Advanced PA techniques such as neural networks are effective, inexpensive, widely available and, with a little bit of training, easy to implement.

Allan Sammy, CPA, CGA, CIA, is the director of Data Science and Audit Analytics at Canada Post. Prior to joining Canada Post, he held positions as the director, Fraud Risk Management at the Ontario Lottery and Gaming Corporation and as a forensic accountant with Deloitte. Prior to joining Deloitte, he was a commercial crime investigator with the Royal Canadian Mounted Police. He has a master’s degree in predictive analytics from Northwestern and has more than 20 years of experience in risk management, investigations and internal audit.

Vladimir Yasenovskiy, Ph.D., is a manager of Business Analytics at the Ontario Lottery and Gaming Corporation (OLG). He has a doctoral degree from University of Alberta and more than 10 years of experience in applying predictive analytics for the gaming industry. Prior to joining OLG, he worked in analytics at the Alberta Gaming and Liquor Commission. Contact him at