Featured Article

Blazing a trail for the Benford's Law of words, part 1

Analyzing letters of the alphabet

Fraud examiners have used Benford's Law for years to detect anomalies and possible fraud in organizations' ledgers and journals. Now the author is researching a method to analyze letters of the alphabet in business documents — "Letter Analytics" — to find fraud patterns. His new approach could expedite fraud examinations.

As fraud examiners, we analyze three types of data: numbers, dates and text. We've been trained to "follow the money" (those number fields), which is sage advice because it's the simplest path to identifying positive returns for the organizations that employ us. However, when we look at the world of information — be it written, spoken or on the Internet — we need to move from our habitual desire to look only at the 0s and 1s and realize that the majority of data is text. Otherwise, our scope of examination can be missing 50 percent to 85 percent of available data.

Imagine for a moment an accounts payable audit in which we obtain a vendor masterfile, a purchase order file and an invoice payment file. As calculated below, the textual data averages nearly 70 percent of the data received for examination ((85 percent + 70 percent + 50 percent) / 3, or about 68 percent), with the "money"/numeric fields comprising nearly 10 percent of the data:

  • The vendor masterfile (85 percent text) is filled with names, addresses, vendor types, etc.; some reference numbers, such as vendor numbers and D-U-N-S numbers from Dun & Bradstreet; and dates, usually for vendor creations and inactivations.
  • The purchase order file (70 percent text) has quantity, price and extended amount fields (the "money" fields); reference numbers for transactions; some dates associated with the opening and closing of purchase orders; and textual descriptions of the purchased products, departments that purchased the goods and people who entered the purchases.
  • The invoice payment file (50 percent text) has invoice payment, check payment, discount and tax amount fields (the "money" fields); reference numbers for transactions; some dates of invoices and checks; and textual information such as the suppliers' invoices, invoice descriptions, payment notes, departments that purchased the goods and people who entered the purchases.
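
The averaging described before the list can be checked with a few lines of Python. This is only a sketch: the three percentages are the ones quoted in the bullets, the variable names are hypothetical, and the simple unweighted mean assumes each file counts equally regardless of how many fields or records it holds.

```python
# Share of each file that is text, in percent, as quoted in the bullets above.
# The dictionary and variable names are illustrative, not from the article.
text_share = {
    "vendor masterfile": 85,
    "purchase order file": 70,
    "invoice payment file": 50,
}

# Unweighted mean across the three files (assumes equal weight per file).
average_text_share = sum(text_share.values()) / len(text_share)

print(f"Average text share: {average_text_share:.1f} percent")
```

The result is a little over 68 percent, which is the "nearly 70 percent" cited above. A weighted average by field or record count could give a different figure.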

I can understand why organizations don't analyze most text fields. They can easily and simply handle numbers via calculators and plug-and-play 24/7 analysis programs. But they need complex tools and approaches to slice and dice textual data and refine the inherent knowledge.

Let's look at some of the approaches we utilize in audit software by data type in Figure 1 below.

(Figure 1: Comparison of data types.)

Many of you deftly use software for analyzing numbers and dates, but text analytics tools probably aren't part of your standard toolkit. To add them, you would have to purchase those applications and hire a trained technician to run them.

