ui suggestion: use quantiles of the effort distribution to pick the effort's color #10

Closed
opened 2023-09-20 04:24:52 +00:00 by orzo · 6 comments

The distribution of efforts follows a standard exponential and that means we know its quantiles/percentiles explicitly. I left a derivation of that claim together with a figure, data, and statistical tests here https://pastebin.com/EXXKm50D. This has already been shown somewhere before I'm sure but I can't be bothered to look for it.

Once we know that, we look at the 5th, 25th, 75th, and 95th percentiles of the standard exponential and they tell us that

  • 5% of the time the effort of a block or share will be between 0% and 5%,
  • 20% of the time it will be between 5% and 29%,
  • 50% of the time it will be between 29% and 139%. That's the bulk/middle 50% of the effort distribution where "typical" efforts live.
  • 20% of the time it will be between 139% and 300%,
  • the remaining 5% of the time it will be above 300%.

What I suggest is to use those to pick the color of efforts in the observer, something along the lines of light blue, green, neutral, yellow, red.
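
A minimal sketch of that five-bucket idea in Go, assuming the thresholds listed above. The function name `effortBucket` and the color names are hypothetical placeholders, not part of the observer code base.

```go
package main

import "fmt"

// effortBucket is a hypothetical helper (not taken from the observer code).
// It maps an effort in percent to one of five color names, using the
// 5th, 25th, 75th and 95th percentiles of the standard exponential
// (5%, 29%, 139% and 300% effort) as bucket boundaries.
func effortBucket(effortPercent float64) string {
	switch {
	case effortPercent < 5: // luckiest 5% of outcomes
		return "lightblue"
	case effortPercent < 29: // next 20%
		return "green"
	case effortPercent < 139: // middle 50%, the "typical" efforts
		return "neutral"
	case effortPercent < 300: // next 20%
		return "yellow"
	default: // unluckiest 5%
		return "red"
	}
}

func main() {
	for _, e := range []float64{2, 20, 100, 200, 350} {
		fmt.Printf("%6.1f%% -> %s\n", e, effortBucket(e))
	}
}
```

With these buckets a 200% effort lands in yellow rather than red, which matches the point that such efforts are not rare.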

Better yet just straight up plug the effort `f` in the CDF `1 - exp(-f)` and feed that to a diverging colormap. That would in effect be like doing histogram equalization.

The reasoning behind this suggestion is that the average effort (100%) is not an indicator of what's typical or not, and therefore of good and bad luck. Indicators based on quantiles are by construction better suited for that, especially in the case of an asymmetric distribution like the exponential, because they are uniform in probability. Chopping the effort range arbitrarily with green for 0-100%, yellow for 100%-200%, and red for 200% and above skews the expectations of beginners and funnels them to r/monero, where they will start yet another "p2pool unlucky and bad?" thread after they get spooked by their second or third 200% share.

Author

Proof of concept. I'm using `prob = 1 - exp(-effort / 100)`, rescaled from [0, 1] to [0.15, 0.8], and fed into the reversed `RdYlBu` color map. It fits well with the current muted color scheme and is easy on the eye. I have never programmed in Go, but this is very simple and I could always give it a try and add an alternative to `effort_color()` in `cmd/web/views/funcs.go`. Feel free to close the issue, I'm happy to let it go as well.
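
A rough Go sketch of the mapping described above, under stated assumptions: the 11 anchor colors are the ColorBrewer `RdYlBu` palette, and linear interpolation between anchors stands in for whatever colormap implementation was used in the proof of concept. The names `effortRGB` and `effortColor` are hypothetical and are not the existing `effort_color()`.

```go
package main

import (
	"fmt"
	"math"
)

// rdYlBu holds the 11 ColorBrewer RdYlBu anchor colors, ordered red to blue.
// Assumption: linear interpolation between these anchors approximates the
// colormap used in the proof of concept.
var rdYlBu = [11][3]float64{
	{165, 0, 38}, {215, 48, 39}, {244, 109, 67}, {253, 174, 97},
	{254, 224, 144}, {255, 255, 191}, {224, 243, 248}, {171, 217, 233},
	{116, 173, 209}, {69, 117, 180}, {49, 54, 149},
}

// effortRGB is a hypothetical helper: effort in percent -> RGB triple.
func effortRGB(effortPercent float64) [3]int {
	if effortPercent < 0 {
		effortPercent = 0
	}
	prob := 1 - math.Exp(-effortPercent/100) // exponential CDF, in [0, 1)
	t := 0.15 + prob*(0.8-0.15)               // rescale [0, 1] to [0.15, 0.8]
	t = 1 - t                                 // reversed map: low effort is blue

	// Locate the two neighbouring anchors and interpolate between them.
	pos := t * float64(len(rdYlBu)-1)
	i := int(math.Floor(pos))
	if i < 0 {
		i = 0
	}
	if i > len(rdYlBu)-2 {
		i = len(rdYlBu) - 2
	}
	frac := pos - float64(i)

	var rgb [3]int
	for c := 0; c < 3; c++ {
		v := rdYlBu[i][c] + frac*(rdYlBu[i+1][c]-rdYlBu[i][c])
		rgb[c] = int(math.Round(v))
	}
	return rgb
}

// effortColor formats the interpolated color as a CSS hex string.
func effortColor(effortPercent float64) string {
	rgb := effortRGB(effortPercent)
	return fmt.Sprintf("#%02x%02x%02x", rgb[0], rgb[1], rgb[2])
}

func main() {
	for _, e := range []float64{0, 5, 29, 69, 139, 300, 600} {
		fmt.Printf("%5.0f%% -> %s\n", e, effortColor(e))
	}
}
```

Because the colormap input is a probability rather than the raw effort, equal steps in color correspond to equal steps in how often an effort occurs, which is the histogram-equalization effect mentioned earlier.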

Owner

This looks neat! Maybe p2pool.io can also use the same coloration; currently p2pool.observer tries to match their colors/values.

DataHoarder self-assigned this 2023-10-06 06:39:19 +00:00
Owner

For future reference, these are the contents of the linked pastebin for statistical tests:

Mining blocks/shares can be approximated by a Poisson point process. The uniformity property of the hash function tells us that for a fixed difficulty `diff` the probability of finding a block in a single attempt is `p = 1/diff`. Therefore the probability `P(n)` of finding a block after `n` failed attempts is

P(n) = (1 - p)(1 - p)...(1 - p) p = (1-p)^n p.
       \________n times_______/

Let p = 1/diff = (n/diff)/n. Taking the limit n -> inf of a large number of attempts while keeping the ratio n/diff finite, (1 - p)^n = (1 - (n/diff)/n)^n -> exp(-n/diff), and we find

P(n)dn = (1/diff) exp(-n/diff) dn

where the infinitesimal 1/n -> dn. The ratio n/diff is nothing but the effort, i.e. the number of attempted hashes before the next block/share is found in units of the difficulty. Let f = n/diff. With this change of variable we get the effort distribution

P(f)df = exp(-f)df,

i.e. the standard exponential. Figure 1 compares the empirical distribution of the efforts of the last 100 blocks on p2pool and p2pool mini (top row), and the last 49 shares of two randomly selected miners on p2pool and p2pool mini respectively (bottom row). All 4 empirical distributions pass the Kolmogorov-Smirnov test, i.e. fail to reject the null hypothesis that the sample of efforts could have come out of a standard exponential.
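
The derivation can be sanity-checked with a small simulation. The Go sketch below is not the analysis from the pastebin; it draws synthetic efforts by repeating Bernoulli trials with success probability 1/diff and computes the one-sample Kolmogorov-Smirnov statistic against the standard exponential CDF. The difficulty of 1000, the sample size of 2000, the seed, and all function names are arbitrary assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
	"sort"
)

// sampleEffort counts failed attempts until one succeeds with probability
// 1/diff, then expresses that count in units of the difficulty.
func sampleEffort(rng *rand.Rand, diff float64) float64 {
	p := 1 / diff
	n := 0
	for rng.Float64() >= p { // a failed attempt
		n++
	}
	return float64(n) / diff
}

// simulate draws count synthetic efforts with a fixed seed.
func simulate(seed int64, count int, diff float64) []float64 {
	rng := rand.New(rand.NewSource(seed))
	efforts := make([]float64, count)
	for i := range efforts {
		efforts[i] = sampleEffort(rng, diff)
	}
	return efforts
}

// ksStatistic returns the largest gap between the empirical CDF of the
// samples and the standard exponential CDF 1 - exp(-f).
func ksStatistic(samples []float64) float64 {
	s := append([]float64(nil), samples...)
	sort.Float64s(s)
	n := float64(len(s))
	d := 0.0
	for i, f := range s {
		cdf := 1 - math.Exp(-f)
		// Compare against the empirical CDF just before and just after f.
		d = math.Max(d, math.Max(cdf-float64(i)/n, float64(i+1)/n-cdf))
	}
	return d
}

func main() {
	efforts := simulate(1, 2000, 1000)
	d := ksStatistic(efforts)
	// 1.36/sqrt(n) is the usual large-sample 5% critical value.
	crit := 1.36 / math.Sqrt(float64(len(efforts)))
	fmt.Printf("KS statistic D = %.4f, approximate 5%% critical value = %.4f\n", d, crit)
}
```

A statistic below the critical value means the test fails to reject the standard exponential, the same conclusion reported for the empirical data above.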

It follows that the CDF

P(f <= F) = integral exp(-f) over f from 0 to F
          = 1 - exp(-F).

and thus the quantiles

Q(x) = -log(1 - x).

Alternatively, we could focus on the change of variable n = ht involving the product of the hash rate h and t the inter-block time. With this change of variables we find the distribution of inter-block times

P(t)dt = (h/diff) exp(-(h/diff)t) dt,

namely the exponential distribution with rate h/diff.

Compared to the inter-block times distribution, the effort distribution has two favourable properties. It does not depend on the difficulty of the network and is therefore static over time, and it also does not depend on the local hash rate and is thus universal across pools and miners.

Some effort quantiles of interest are

Q(0.05) = 5%
Q(0.25) = 29%
Q(0.50) = 69%
Q(0.63) = 100%
Q(0.75) = 139%
Q(0.86) = 200%
Q(0.95) = 300%

These tell us that the odds below:above of an effort of 100% are roughly 2:1, and the odds below:above 200% are roughly 6:1. In other words mining blocks with efforts of 200% or more is not that atypical.
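
The quantiles and odds above can be reproduced directly from `Q(x) = -log(1 - x)` and the CDF. The following Go sketch does that; the function names `quantile` and `cdf` are chosen here for illustration only.

```go
package main

import (
	"fmt"
	"math"
)

// quantile returns Q(x) = -log(1 - x): the effort (in units of the
// difficulty) below which a fraction x of all outcomes fall.
func quantile(x float64) float64 {
	return -math.Log(1 - x)
}

// cdf returns P(f <= F) = 1 - exp(-F) for the standard exponential.
func cdf(f float64) float64 {
	return 1 - math.Exp(-f)
}

func main() {
	// Effort quantiles, printed as percentages.
	for _, x := range []float64{0.05, 0.25, 0.50, 0.63, 0.75, 0.86, 0.95} {
		fmt.Printf("Q(%.2f) = %.0f%%\n", x, 100*quantile(x))
	}
	// Odds of landing below versus above a given effort.
	for _, f := range []float64{1, 2} {
		below := cdf(f)
		fmt.Printf("odds below:above %.0f%% = %.1f:1\n", 100*f, below/(1-below))
	}
}
```

Since 0.63 and 0.86 are themselves rounded probabilities, the computed quantiles for those two entries come out slightly below the round 100% and 200% in the table. The exact odds are e - 1 (about 1.7) to 1 at 100% and e^2 - 1 (about 6.4) to 1 at 200%.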

Owner

Used the same gradient and rescaling as tested. Will later do some further tests to see what works best!

Owner

Implemented in live observers

Author

Just noticed this was implemented! Nice!

Another thing that may be worth trying is to use the `RdYlGn` colormap instead of `RdYlBu`. It's basically identical and creates the same muted aesthetic, but uses green instead of blue. I'm partial to `RdYlBu`, but `RdYlGn` could align a bit more with p2pool.io if you prefer that.

Reference: P2Pool/consensus#10