Originally posted by Kento
Can it even work when most of the votes for the left would be done at the end of PA election day where they couldn't count any mail-in votes until the day of, unlike other states?
Yes. It is not a function of time. It is a function of the actual results. Optimally, you do it at the end of the vote-counting.
It is not a chronological analysis of voting. It is a "numeric frequency analysis" to see how often 1s, 2s, 3s, etc. show up in the data.
And if your analyzed data set is too small, the result is noisy (meaning, the results will be all over the place). Similar to "normalized data sets", you need sufficient elements in the set to properly analyze them. You can do this at the precinct level because there are hundreds or thousands of precincts. You cannot, and should not, do it at the county level: there are not enough counties to do a proper analysis.
And, remember, the analysis is done on the leading digit of each element in the analyzed data set, not the entire "number."
If you're interested, here's how it works (skip if this is boring crap to you: no, not offended if you skip):
Many large, naturally occurring sets of data follow a numerical law called Benford's Law. So in an accounting ledger that has all transactions (purchases, sales, etc.), you'd see the digit "1" show up the most frequently, around 30% of the time, for LEADING NUMBERS (the first digit that shows up in a value). Don't confuse the symbol "1" with the actual value of 1. We are talking about the literal written base-10 symbol "1" showing up in accounting data.
..................
What's great about Benford's Law is it can apply to lots of other things like votes in large elections. If there is fraud or impropriety, measuring the votes against Benford's Law can show you which areas may have fraud and which may not. HOWEVER!!!!!! (pretend I'm a professor in class screaming at this point to emphasize how important it is...and I'm firmly tapping the chalkboard with chalk while pointing to crazy charts and graphs that illustrate Benford's Law), you need large sets of data to measure how closely a data set follows Benford's Law.
.............
Remember, it is not measuring the data value. Meaning, you don't measure some raw number like "Biden got one thousand two hundred and fifty three more votes than Trump in Fulton County." You're measuring how often the number 1 shows up in the voting data, how often the number 2 shows up in the voting data, etc. But ONLY FOR LEADING NUMBERS: the first number that shows up in the value of each data element in the measured set.
...................
And Benford's Law shows us that in naturally occurring data sets, the number 1 shows up about 30.1% of the time, the number 2 shows up about 17.6% of the time, etc.
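For anyone who wants the rest of those percentages: the standard statement of Benford's Law is that a leading digit d shows up with probability log10(1 + 1/d). Here is a minimal Python sketch of that formula. The function name `benford_expected` is just something I made up for illustration.

```python
import math


def benford_expected(digit: int) -> float:
    """Expected share of values whose leading digit is `digit` (1-9)."""
    if not 1 <= digit <= 9:
        raise ValueError("a leading digit must be between 1 and 9")
    # Benford's Law: P(d) = log10(1 + 1/d)
    return math.log10(1 + 1 / digit)


# Print the expected share for each possible leading digit.
for d in range(1, 10):
    print(f"{d}: {benford_expected(d):.1%}")
```

The nine shares add up to 100%, and they drop off quickly: 1 is the most common leading digit and 9 is the least common.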
So pretend you have the following set of voting data:
County A votes for Biden: 4,898
County B votes for Biden: 9,548
County C votes for Biden: 8,990
County D votes for Biden: 798
..............
Checking it against Benford's Law, you'd see that it doesn't follow the expected pattern at all.
Looking at only the first number, you get:
Zero 1s
Zero 2s
Zero 3s
One 4
Zero 5s
Zero 6s
One 7
One 8
One 9
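If you'd rather let a computer do the tally, here is a small Python sketch that counts the leading digits for the four made-up county numbers above. The helper name `leading_digit` and the `votes` dictionary are my own, just for illustration.

```python
from collections import Counter

# The pretend data from the example above (not real results).
votes = {
    "County A": 4898,
    "County B": 9548,
    "County C": 8990,
    "County D": 798,
}


def leading_digit(n: int) -> int:
    """Return the first written base-10 digit of a nonzero whole number."""
    return int(str(abs(n))[0])


# Tally how often each leading digit appears.
counts = Counter(leading_digit(v) for v in votes.values())

for d in range(1, 10):
    print(f"leading {d}: {counts.get(d, 0)}")
```

Note that it only ever looks at the first digit of each number. The size of the number itself never enters into it.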
................
Does that match Benford's Law? You know from Benford's Law that about 30% of the leading numbers should be "1s." But not a single one of the leading numbers in the data is a "1."
.............
In reality, that data set is far too small. But we have far more than just 4 data points of voting data to check against Benford's Law. Usually for "representative data", or a normalized data set, you need 30-60 data points. Sometimes you could need as many as 3,000.
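To see why the sample size matters, here is a rough Python sketch. Big assumption: this is synthetic data, not voting data. It draws random numbers spread evenly across five orders of magnitude on a log scale, which is a textbook way to get numbers that follow Benford's Law, and then checks what share of them start with a 1 at different sample sizes. The function names and the seeds are arbitrary choices of mine.

```python
import random


def leading_digit(x: float) -> int:
    """Return the first written digit of a number that is at least 1."""
    return int(str(int(x))[0])


def share_of_leading_ones(sample_size: int, seed: int) -> float:
    """Share of synthetic Benford-like values that start with the digit 1."""
    rng = random.Random(seed)  # seeded so each run is repeatable
    # Log-uniform values between 1 and 100,000 (synthetic, for illustration).
    values = [10 ** rng.uniform(0, 5) for _ in range(sample_size)]
    ones = sum(1 for v in values if leading_digit(v) == 1)
    return ones / sample_size


# Repeat the experiment five times at each sample size and compare.
for n in (4, 60, 3000):
    shares = [share_of_leading_ones(n, seed) for seed in range(5)]
    print(n, [f"{s:.0%}" for s in shares])
```

With only 4 values, the share of 1s can only ever come out as 0%, 25%, 50%, 75% or 100%, so it cannot land on 30% no matter what. As the sample gets bigger, the share should settle in closer to the roughly 30% that Benford's Law predicts.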
Let me know if that makes sense.