Fraud rings, possibly involving Scott Peterson, are accumulating more than $10 billion in cash, and they are about to focus their efforts at your P&C carrier.

For the last six months, we’ve been warning about the massive unemployment fraud epidemic.

The amount of money being pulled in by international fraudsters is very likely north of $10 billion.

Because these fraudsters have been focused on unemployment insurance, they may have provided breathing room for P&C insurance companies as they have focused elsewhere.

But, make no mistake, when the unemployment insurance fraud dries up, they will focus their attention on traditional insurance companies. And, if you study how this unemployment fraud occurs, you see that although they may be headquartered outside the USA, they operate inside the USA with an army of foot soldiers.

Here is the very bad news. They have pulled in so much money that they now have the cash to fund very sophisticated fraud rings against P&C carriers.

A lot of P&C executives are unaware of these very large and powerful international fraud rings and their efforts to extract money from the carriers. They rely on their claims teams to flag potential fraud claims based on gut feelings and then use SIU to tackle those claims. Typically SIU teams do a great job, but without leads to the fraud rings, entire schemes can go undetected.

Here are some articles about the magnitude of the cash these fraud rings are accumulating.

California: $11 billion to $30 billion. $400 million went to prison gangs operating in the US. Even Scott Peterson is alleged to have been involved! Another study pegs the value at $10 billion.

Colorado: At least $10 million with no prosecutions.

Washington: At least $300 million.

Other: $242 million in Massachusetts, $200 million in Michigan, $18 million in Rhode Island, $8 million in Arizona and $6 million in Wisconsin. Nationwide estimate, $36 billion.

I was the president of a P&C carrier that was targeted by fraud. I learned the hard way how important leads are. If you would like to try our services, please visit us at fraudspotters.com

Top 2020 Fraud

This article describes some of the top insurance fraud schemes from 2020 that could have been detected by algorithm.

New Orleans Staged Truck Accidents

This fraud involved a very large group of criminals who purposely collided vehicles into semi-trailer trucks to sue for personal injuries. It is an amazing demonstration of the level of danger some fraudsters are willing to tolerate to earn a payday. One of the participants of the scheme decided to change sides and testify against the fraud ring. He was assassinated. Here is another account, another, and another.

Our algorithms work differently from those of most of our competitors. We have proprietary techniques that look for non-random behaviors. If our datasets contain enough of these fraudulent claims, we pick up the non-random effect and flag the claims for SIU teams to investigate. Once the pattern is identified, we immediately flag subsequent claims fitting the profile so that the SIU team can begin surveillance.

Typically a ring like this gets into a ‘rhythm’ of how they commit the fraud, and it is these non-random patterns that our algorithms detect. These can involve a particular time of the week, location, set of doctors, set of lawyers, pattern of medical bills, etc. In this specific instance, the location of the accidents is the key. The fraudsters chose a specific jurisdiction (New Orleans) where they felt that investigators and legal authorities would not catch them. Non-random factors such as the jurisdiction of these accidents are what our algorithms are designed to catch.
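To make the idea of a "non-random" location pattern concrete, here is a minimal sketch. It is not our proprietary algorithm. It assumes you know what share of claims each jurisdiction would normally account for (the `expected_share` values below are invented), and it uses a plain binomial tail probability to ask how surprising the observed count is. The function names, the data, and the 0.001 cutoff are all hypothetical.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Probability of seeing k or more successes in n trials with success rate p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def flag_concentrated_locations(claim_counts, expected_share, alpha=0.001):
    """Return locations whose claim count is improbably high given their expected share.

    claim_counts:   {location: number of claims observed}
    expected_share: {location: share of claims expected if nothing unusual is going on}
    alpha:          hypothetical cutoff for "too unlikely to be chance"
    """
    total = sum(claim_counts.values())
    flagged = {}
    for location, count in claim_counts.items():
        p_value = binomial_upper_tail(count, total, expected_share[location])
        if p_value < alpha:
            flagged[location] = p_value
    return flagged


# Invented example: 50 similar claims, 30 of them in a jurisdiction
# that would normally account for about 20% of such claims.
claim_counts = {"Jurisdiction A": 30, "Jurisdiction B": 10, "Jurisdiction C": 10}
expected_share = {"Jurisdiction A": 0.2, "Jurisdiction B": 0.4, "Jurisdiction C": 0.4}

flagged = flag_concentrated_locations(claim_counts, expected_share)
for location, p_value in flagged.items():
    print(f"{location}: p-value {p_value:.2e}")
```

In practice the same test could be run on day of week, doctor, or law firm instead of location; the expected shares would have to come from the carrier's own history.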

NFL Players Overbilled Insurance Companies

In this case, eight former NFL players were involved in submitting fake medical records. Ninety-two claims were submitted, asking for reimbursements totaling $723,826. You can read about it here.

This fraud ring is a great example of the type of fraud that our algorithms detect, as it highlights common behavior associated with medical bill fraud. While this case involved large sums of money with high profile “patients,” personal injury protection fraud schemes more commonly involve a very large number of smaller-value claims (each totaling $10,000 or less, for example).

What typically happens is a medical provider, working with “patients”, bills insurance companies for medical “treatment” that did not occur. In this particular former NFL case the “patients” were a poor choice for the fraud ring because they were high profile and high dollar and thus more likely to attract the attention of human medical reviewers.

What is less detectable, and more common, is for medical providers to recruit low profile individuals and submit small nuisance claims. Insurance company executives often believe that these small claims do not add up to large dollar amounts.

For a time, when I was the president of the 10th largest auto insurer in Florida, I believed these small claims were just a nuisance and a part of doing business, not realizing how frequent they were and how many dollars were siphoned off in this manner.

Fraudspotters has sophisticated algorithms that read all the medical bills, profiling the pattern of CPT codes used. Fraudsters often rely on the fact that insurance company executives are not willing to pay doctors to review every small nuisance claim. Indeed many executives feel they are making a good business decision to quickly settle these claims, not realizing that fraudsters may be behind hundreds of similar claims directed precisely at them! In fact, many insurance companies establish “fast track” payment methods whereby if certain claims meet the criteria, the claims bypass human review. Once a fraudster realizes which types of claims are fast tracked, they submit the same type of claim many times over. The individual reviewers who are just following “check box” routines for the fast track do not realize this is occurring, but our algorithms can. Unfortunately, insurance company executives often think they are following best practice by using services from our competitors who are “pinging” the patients against a database to see if they were involved in past fraud. If the “patients” are not detected in these fraud databases, and the claim is small and uncomplicated, the claim is fast tracked for review. Basically, all the fraud rings need to do is find people who are not already in insurance company fraud databases and fabricate small claims.
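The repeated fast-track claim pattern described above can be illustrated with a very small sketch. This is not the Fraudspotters algorithm; it only groups small claims by provider and by the set of CPT codes billed, then reports groups that repeat. The field names (`claim_id`, `provider`, `cpt_codes`, `amount`), the sample claims, and the `min_repeats` threshold are assumptions made for the example. The $10,000 ceiling echoes the "small-value claim" example earlier in this article.

```python
from collections import defaultdict


def repeated_billing_patterns(claims, max_fast_track_amount=10_000, min_repeats=3):
    """Group fast-track-sized claims by (provider, CPT code set) and return repeated groups."""
    groups = defaultdict(list)
    for claim in claims:
        if claim["amount"] <= max_fast_track_amount:
            # Sort the codes so the same set of codes matches regardless of billing order.
            key = (claim["provider"], tuple(sorted(claim["cpt_codes"])))
            groups[key].append(claim["claim_id"])
    return {key: ids for key, ids in groups.items() if len(ids) >= min_repeats}


# Invented claims; the CPT code strings are only illustrative.
claims = [
    {"claim_id": 1, "provider": "Clinic X", "cpt_codes": ["97110", "97140"], "amount": 1800},
    {"claim_id": 2, "provider": "Clinic X", "cpt_codes": ["97140", "97110"], "amount": 1750},
    {"claim_id": 3, "provider": "Clinic X", "cpt_codes": ["97110", "97140"], "amount": 1900},
    {"claim_id": 4, "provider": "Clinic Y", "cpt_codes": ["99213"], "amount": 250},
    {"claim_id": 5, "provider": "Clinic X", "cpt_codes": ["97110", "97140"], "amount": 25000},
]

suspicious = repeated_billing_patterns(claims)
for (provider, codes), claim_ids in suspicious.items():
    print(provider, codes, claim_ids)
```

A reviewer looking at claims 1, 2, and 3 one at a time would see nothing unusual; the pattern only shows up when the claims are grouped.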

As I said above, this fraud ring was too high profile to avoid human inspection, but other rings are siphoning off billions of dollars undetected, essentially employing the playbook of this former NFL player fraud scheme but with lower profile claims.

Hudson “Runner”

Although the scheme itself did not occur in 2020, Luis Aguirre, the Hudson “Runner”, was sentenced in 2020. This case continues the narrative I established above with the NFL case about recurring medical fraud. This is exactly the type of scheme that insurance company executives do not realize is occurring and are in fact enabling with “fast track” payments for small claims where the claimant is not found in an insurance fraud database. This conspiracy involved claims which, while individually very small, added up to $3.5m.

I know for a fact that there are presently similar fraud schemes ongoing in South Florida. I know who is doing it and how to detect it. Ironically, there are even former Insurance SIU people who have created fake clinics to take advantage of the fast-track protocols.

Adjuster and Medical Provider Rack up $1.6m in Workers Compensation

In this scheme, an employee working for the company took advantage of the fast-track authority to continuously process small claims over a twelve year period, totaling $1.6m. This activity would have been very easy for our algorithms to detect. This is a classic scheme, and we have a specific algorithm exactly for this situation.

It is ironic that some insurance executives probably believe they are making good business decisions by fast tracking these claims. For a time, I believed this too. But what these executives need to do is run their claims through algorithms, such as ours, to screen for these patterns. This article is worth a read, and so is this one.

Other Relevant Fraud

If you would like a free evaluation for fraud, click here.

Dominion Effect FAQ

In the interest of transparency, I created this page to record the questions that are asked of me, along with my responses.

Didn’t Georgia already audit the machines?

They did some sort of audit. But people dispute that these audits are effective. For example, while these types of videos are making their way around the web, a significant portion of society will doubt the results of Georgia’s audit. My link is to a Trump tweet so you can see that Trump is pushing this himself, and that this information is widely seen. I’m not trying to make a partisan statement by linking to Trump.

Here is a video of the same man succinctly explaining why he believes the Georgia recount was not valid.

Isn’t this just a partisan issue?

No. All sides believe in the importance of election integrity. For example, the NY Times, which is considered center or left, depending on your point of view, but not right, had this to say about Dominion in June of 2020.

Are there really enough machines in Wisconsin to have changed the outcome there?

If you go to verifiedvoting.org, select Dominion, 2020, Wisconsin, and download the data, you’ll see that it reports 527 precincts with 640,215 registered voters on Dominion machines. The state only has a 20k vote difference between Biden and Trump. And, in my paper, the Dominion effect was calculated on a county basis, not a precinct basis. To the extent counties are split on which machine they used, my paper is underestimating the Dominion effect: the effect is likely bigger on a precinct-by-precinct basis, but I don’t have the data to go to that level of detail.

But to answer the question: yes, based on published, public information, there are enough machines to change the election in Wisconsin.

Why don’t you show results for 2012 and 2016?

I did a fair amount of analysis on those election cycles with mixed results. It is challenging to tease out the effects. For example, suppose Dominion is deployed in 2012. Does it increase vote counts for Democrats in 2012, 2016 and 2020? Does it make the counts go up an additional increment each year? Are there years where the Dominion Effect is not in effect, so to speak, and the votes regress to what they should have been? It becomes harder to form a clear hypothesis when we mix the years. The hypothesis I posit in the paper is clear. I only test counties that had no Dominion in 2008 (by excluding New York) versus counties in 2020 that either did or did not have Dominion by that time.

Is it possible that there are pre-trends that can explain this result?

As I mentioned in the main article, I find the multivariable weighted least squares model to be the most convincing. The reason is that the factor for Dominion by itself, when not controlled, does have pre-trends.

However, when using a weighted multivariate analysis, there is no pre-trend associated with the Dominion counties.

If you wish to see the two spreadsheets associated with this analysis, click here and here. If you wish to access a CSV file, click here.

If you wish to see how the data was constructed, click here.

Are you sure that “Dominion” isn’t just a proxy for “Democratic Voter”?

We can test for this. The easiest way is to remove the Dominion flag for the 657 counties that have Dominion and replace it with a flag for the 657 most Democratic counties. This runs the test as if a voting machine company had been adopted by the 657 most Democratic counties. Let’s call this company “MachinesForDems”. The model says that the “MachinesForDems Effect” is not the same as the Dominion Effect. The coefficient for “MachinesForDems” in the weighted model is -0.22% (not positive) and the two p-values are 0.07% (traditional) and 68.04% (robust, not significant).
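For readers who want to build a similar placebo flag themselves, here is a minimal sketch of the "top N" construction. The field name `dem_share_2008`, the county records, and the helper `top_n_flag` are hypothetical; the measure I actually used to rank counties is in the linked data, not in this snippet. The example uses N = 2 on five invented counties, where the test above uses N = 657.

```python
def top_n_flag(counties, key, n):
    """Return {county id: 0 or 1}, with 1 marking the n counties that have the largest `key`."""
    ranked = sorted(counties, key=lambda county: county[key], reverse=True)
    top_ids = {county["id"] for county in ranked[:n]}
    return {county["id"]: int(county["id"] in top_ids) for county in counties}


# Invented records: five counties with a hypothetical 2008 Democratic vote share.
counties = [
    {"id": "c1", "dem_share_2008": 0.41},
    {"id": "c2", "dem_share_2008": 0.72},
    {"id": "c3", "dem_share_2008": 0.55},
    {"id": "c4", "dem_share_2008": 0.33},
    {"id": "c5", "dem_share_2008": 0.64},
]

machines_for_dems = top_n_flag(counties, "dem_share_2008", n=2)
print(machines_for_dems)
```

The resulting 0/1 column can then stand in for the Dominion column in the regression, which is all the "MachinesForDems" test does.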

We can do further tests by creating another flag, “MachinesForReps” which is the top 657 most Republican leaning counties. We can put all three flags into the model, Dominion, MachinesForDems, and MachinesForReps. Interestingly the “MachinesForReps” IS significant, and can be used as a control variable, but it doesn’t affect the significance of Dominion. Dominion’s coefficient for the weighted model is 1.56% and its p-values are 0.00% (traditional) and 0.09% (robust).

If you would like to see how I created this data, click here.

If you would like to see a spreadsheet with this analysis, click here.

Have you really accounted for very large and very small counties?

In our model, we are already using these adjustments:

  • weighting by county size
  • a field called “RuralUrbanContinuumCode2013”

These should adjust for county size, but in an effort to address readers’ concerns, I ran the model with two new flags:

  • 657 counties with highest number of voters in 2008
  • 657 counties with lowest number of voters in 2008

The Dominion Effect is still 1.55% and the p-values are 0.00% (traditional) and 0.09% (robust). These p-values suggest a less than 1 in 1,000 chance of the result occurring randomly.

To further address this, I ran an additional model which also includes a field for the population per square mile. This model produces identical results of Dominion Effect of 1.55% and a p-value of 0.00% and 0.09%.

What about race? Why don’t you adjust for that?

We can do that too. To the model in the prior FAQ, I add a flag for the 657 counties that have the highest percentage of white non-Hispanic residents and another flag for the 657 counties with the highest percentage of black non-Hispanic residents. This is as if a voting company somehow got assigned to either the highest-percentage white or the highest-percentage black counties.

The Dominion Effect becomes 1.56% and the p-values are 0.00% and 0.15%.

Why don’t you adjust for age?

We can do that too. To the model used in the above FAQs, we can add a flag for the 657 counties with the highest percentage of residents over age 65 as of 2010. At this point, I’d like to show the results of this enormous weighted least squares regression.

Multiple Linear Regression: Weighted Least Squares, Two types of P-values
Variable Coefficient P-value P-value Consistent
Intercept -7.51% 0.00% 0.00%
RuralUrbanContinuumCode2013 -0.31% 0.00% 0.28%
ManufacturingDependent2000 -2.71% 0.00% 0.00%
HiAmenity 0.23% 23.48% 65.86%
HiCreativeClass2000 5.29% 0.00% 0.00%
Low_Education_2015_update 2.27% 0.00% 0.00%
PopChangeRate1019 0.17% 1.77% 0.00%
Net_International_Migration_Rate_2010_2019 0.14% 0.10% 55.07%
PopulationDensity 0.00% 88.32% 96.33%
LargePopCountiesTop657 2.06% 0.00% 0.00%
SmallPopCountiesTop657 0.25% 74.12% 56.44%
HighDemPerTop657 -0.60% 0.22% 23.64%
HighRepPerTop657 1.59% 0.00% 0.01%
OverAge65PercentTop657 -3.84% 0.00% 0.00%
WhiteNonHispanicTop657 -4.58% 0.00% 0.00%
BlackNonHispanicTop657 1.49% 0.00% 0.24%
Dominion 1.42% 0.00% 0.24%

This does slightly affect the Dominion Effect. It shrinks to 1.42% (a value that doesn’t change any conclusions in the main article) and the p-value remains significant at 0.00% (traditional) and 0.24% (robust).
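For readers unfamiliar with weighted least squares, here is a minimal, self-contained sketch of the technique. It is not the workbook calculation and it does not compute either type of p-value; it only shows how coefficients are obtained when each county is weighted by its size. The toy data, the column layout (intercept, a 0/1 flag, one control), and the function names are assumptions made for illustration. The outcome is constructed so that the true coefficients are known in advance.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def weighted_least_squares(rows, y, weights):
    """Minimize sum_i w_i * (y_i - x_i . beta)^2 via the normal equations (X'WX) beta = X'Wy."""
    k = len(rows[0])
    xtwx = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights)) for j in range(k)]
            for i in range(k)]
    xtwy = [sum(w * r[i] * yi for r, yi, w in zip(rows, y, weights)) for i in range(k)]
    return solve_linear_system(xtwx, xtwy)


# Toy data: each row is [intercept, machine flag (0/1), one control variable].
rows = [[1, 0, 1], [1, 1, 2], [1, 0, 3], [1, 1, 5], [1, 0, 4], [1, 1, 1]]
# Outcome built from known coefficients so the fit can be checked.
y = [-0.02 + 0.015 * r[1] + 0.01 * r[2] for r in rows]
# Hypothetical county sizes used as weights.
weights = [1000, 50000, 2000, 300000, 1500, 8000]

beta = weighted_least_squares(rows, y, weights)
print([round(b, 6) for b in beta])
```

With real data the outcome is noisy, so the weights matter: large counties pull the fitted coefficients toward themselves, which is the purpose of weighting by county size.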

If you would like to see an Excel workbook with this data and analysis, click here.

It should be noted that if you run enough models, inevitably some will produce higher coefficients and some will produce lower coefficients, but the important fact here is that the coefficient remains in the 1.0% to 1.6% range discussed in the article and the p-value remains significant.

Why don’t you focus on changes in demographic trends over time?

We can do that too.

Multiple Linear Regression: Weighted Least Squares, Two types of P-values
Variable Coefficient P-value P-value Consistent
Intercept -6.6% 0.00% 0.00%
NetMigrationRate0010 -0.1% 0.00% 0.00%
NetMigrationRate1019 0.3% 0.00% 0.00%
NaturalChangeRate0010 0.5% 0.00% 0.10%
NaturalChangeRate1019 0.3% 0.10% 11.20%
Immigration_Rate_2000_2010 0.2% 0.00% 10.10%
Net_International_Migration_Rate_2010_2019 0.3% 0.00% 32.00%
UnemployRate2007-UnEmployRate2019 -1.0% 0.00% 0.00%
Dominion 1.7% 0.00% 0.00%

What does the above model look like if you add basic demographic info?

Like this:

Multiple Linear Regression: Weighted Least Squares, Two types of P-values
Variable Coefficient P-value P-value Consistent
Intercept 2.9% 1.10% 22.70%
NetMigrationRate0010 -0.1% 0.00% 0.20%
NetMigrationRate1019 0.3% 0.00% 0.00%
NaturalChangeRate0010 0.7% 0.00% 0.00%
NaturalChangeRate1019 -0.1% 39.10% 65.40%
Immigration_Rate_2000_2010 0.3% 0.00% 7.40%
Net_International_Migration_Rate_2010_2019 0.0% 88.80% 96.90%
UnemployRate2007-UnEmployRate2019 -1.1% 0.00% 0.00%
PopDensity2010 0.0% 0.00% 25.70%
Under18Pct2010 -0.3% 0.00% 0.80%
Age65AndOlderPct2010 -0.3% 0.00% 0.00%
WhiteNonHispanicPct2010 0.0% 39.40% 71.20%
BlackNonHispanicPct2010 0.1% 0.00% 0.00%
Dominion 1.5% 0.00% 0.40%

Why don’t you test other machines?

Honestly, I was tired of working on this project and did not have flags for the other machines. However, someone who read this blog obtained flags, and I added them to the data. To the model shown in the above FAQ, I tested the various machines. Note, this was several different models, each testing the machines one at a time. When I put all of the machines at the same time into the model I encounter fitting problems.

Here are the results. Note each line is from its own model run. Dominion is the only significant machine. The Dominion “other” line is for “Sequoia (Dominion)” and “Premier/Diebold (Dominion)” machines. These machines are the most significant. Note that “20” indicates that the machine was used in 2020. For 2008, any machine could have been used, and it is too complicated to account for each permutation.

Multiple Linear Regression: Weighted Least Squares, Two types of P-values
Variable Coefficient P-value P-value Consistent
Democracy.Live-20 0.1% 56.8% 80.7%
Dominion.other-20 3.0% 0.0% 0.0%
Dominion.Voting.Systems-20 1.5% 0.0% 0.3%
Election.Systems…Software-20 0.3% 16.6% 49.1%
Hart.InterCivic-20 0.0% 98.4% 99.2%
Other-20 -0.1% 74.6% 89.3%

If you wish to obtain an Excel workbook which shows how the above results were calculated, along with the results for the prior two FAQs, click here.

What other variables should we evaluate?

I recently reran this model against about 100 demographic variables. As noted above, some produce larger coefficients and others produce smaller ones. One issue I run into is that if you add enough variables, at some point the demographic variables become too correlated with each other and we have problems with multicollinearity. From a practical point of view, what is occurring is that many variables are describing the same thing and the model is overfitting. The easiest example to explain is the case of a very heavily populated county. From a demographic point of view, these counties simultaneously have large groups of very educated people and large groups of very uneducated people. They also have very high income people and very low income people. They also have a high percentage of very young people. So, if you put the variables for size of county, high education, low education, high income, low income, and very young people in the same model, these variables compete with each other to flag the very large cities. This can cause overfitting and multicollinearity issues, which can render the models less reliable. I think the best thing for me to say about this is that the Dominion Effect, if real, appears to be somewhere between 1.0% and 1.6%. The p-values are typically significant. Only a full forensic audit would reveal the true nature of the situation.
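One simple way to spot the kind of overlap described above is to check pairwise correlations between candidate variables before adding them to a model. The sketch below is only a screening aid, not the procedure I used; the variable names, the invented values, and the 0.8 cutoff are assumptions. Pairwise correlation also misses cases where one variable is explained by a combination of several others.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def highly_correlated_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, r) for every pair of columns with |r| >= threshold."""
    names = list(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs


# Invented values for five hypothetical counties.
columns = {
    "population": [1, 2, 3, 4, 5],
    "high_income_share": [2, 4, 6, 8, 10],
    "median_age": [5, 1, 4, 2, 3],
}

pairs = highly_correlated_pairs(columns)
for a, b, r in pairs:
    print(f"{a} vs {b}: r = {r:.2f}")
```

When two variables are flagged like this, keeping both in the model mostly splits one effect between two coefficients, which is why the individual estimates become unstable.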

Are you sure the model isn’t just picking up state specific effects?

Someone suggested that I delete all states except for states that are split with some counties having Dominion and other counties in the state not using Dominion. These split states are: Arizona, California, Colorado, Florida, Illinois, Iowa, Kansas, Massachusetts, Michigan, Minnesota, Missouri, Nevada, New Jersey, Ohio, Pennsylvania, Tennessee, Virginia, Washington, and Wisconsin.

Using the original model, and only including these split states, the Dominion Effect in the weighted model is 1.54%. The power of the model is diminished slightly with so many fewer counties. The p-values for the weighted model are 0.00% (traditional) and 1.20% (robust).

This suggests that when we run the model only on split county states, the Dominion Effect remains.
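The split-state restriction is easy to reproduce on your own copy of the data. The sketch below assumes each county record carries a `state` field and a 0/1 `dominion` flag; those field names and the placeholder records are hypothetical and do not reflect any real state.

```python
from collections import defaultdict


def split_states(counties):
    """Return the states that contain both flagged and unflagged counties."""
    flags_by_state = defaultdict(set)
    for county in counties:
        flags_by_state[county["state"]].add(county["dominion"])
    return {state for state, flags in flags_by_state.items() if flags == {0, 1}}


def keep_split_states(counties):
    """Keep only the counties located in split states."""
    keep = split_states(counties)
    return [county for county in counties if county["state"] in keep]


# Placeholder records.
counties = [
    {"id": "a1", "state": "State A", "dominion": 1},
    {"id": "a2", "state": "State A", "dominion": 0},
    {"id": "b1", "state": "State B", "dominion": 1},
    {"id": "b2", "state": "State B", "dominion": 1},
    {"id": "c1", "state": "State C", "dominion": 0},
]

subset = keep_split_states(counties)
print([county["id"] for county in subset])
```

Restricting to split states means every comparison between flagged and unflagged counties can be made inside the same state, at the cost of a smaller sample.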

If you would like to see an Excel workbook with this analysis, click here.

If you would like to see how the voting size and partisan share fields are created, click here.

In your analysis, if you remove Georgia, the Dominion Effect is greatly diminished? Doesn’t this invalidate your work?

Not really. The results are still valid. The Dominion coefficient using unweighted values is 0.43% with p-values of 14.63% (traditional) and 12.74% (robust), and for weighted values it is 0.96% with p-values of 0.00% (traditional) and 6.05% (robust). I believe this indicates that Georgia is the strongest case for auditing. Because the unweighted coefficient becomes much lower without Georgia, to me this says that Georgia is a prime candidate for testing small counties; the Dominion Effect is likely strong there. Because the weighted coefficient stays relatively high, it seems to say that the “Dominion Effect” is stronger in big cities outside of Georgia but not as strong in small counties. Click here for an Excel workbook showing results without Georgia.

I am doing heavy analysis and encountering modeling errors. Is there a problem with the data?

Yes. Many people have analyzed the data and one valid criticism has emerged. There are five data points that have incorrect demographic data. If you do heavy demographic modeling, I recommend you remove these five data points: “Baltimore city”, “Saint Louis”, “St. Louis city”, “Carson City”, “Dona Ana”.
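If you are working from the CSV, dropping these rows takes one small filter. The column name `county_name` and the sample rows are assumptions; check the actual header in your copy of the file. Matching on name alone could also remove a same-named county elsewhere, so it is worth confirming that exactly five rows were removed.

```python
# The five names listed above.
EXCLUDED_NAMES = {
    "Baltimore city",
    "Saint Louis",
    "St. Louis city",
    "Carson City",
    "Dona Ana",
}


def drop_excluded(rows, name_field="county_name"):
    """Return the rows whose name is not in EXCLUDED_NAMES (name_field is a hypothetical column)."""
    return [row for row in rows if row[name_field] not in EXCLUDED_NAMES]


# Invented sample rows.
rows = [
    {"county_name": "Example County"},
    {"county_name": "Carson City"},
    {"county_name": "Dona Ana"},
    {"county_name": "Another County"},
]

cleaned = drop_excluded(rows)
print(f"removed {len(rows) - len(cleaned)} rows")
```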

Why are there older versions of this article on the web?

I wrote and updated this article by posting it on the web and sharing it with other statistical analysts. I did this prior to allowing it to be shared on social media. There were actually four different versions of this analysis. Version 3 had a data error in the denominator of the Y variable which over-emphasized the coefficients. A reviewer caught that, and it was fixed before this article was widely shared on social media. If you come across older cached versions of this article, just know they were pre-release drafts.

Furthermore, upon recently reviewing about 100 demographic variables, I think it is safer to say the Dominion Effect is somewhere between 1.0% and 1.6% instead of just saying 1.5%.

What are the biggest challenges to your model?

I have disclosed the biggest challenges in the FAQ above. I think the biggest issue is people wonder if I have proven causality. All I can say is, I am attempting to address any concerns people have about that in the FAQ above. It is very possible that some third factor is causing the results we are seeing, and it is not the Dominion machines. I have attempted to mitigate these concerns by controlling for other factors and disclosing pre-trend issues. My main point is, I’ve shown plausible data. Why resist audits that prove to the world there is nothing to see here?