# How to find all structures with the most equidistribution for a specific $f$ and $A$?

Suppose $f:A\to\mathbb{R}$ and $A$ is countably infinite subset of the real numbers. We can enumerate $A$ as $\left\{a_n\right\}_{n=1}^{\infty}$ and set $t$ as a natural number but the average $$\lim\limits_{t\to\infty}\frac{f(a_1)+f(a_2)+\cdot\cdot\cdot+f(a_t)}{t}$$ can vary depending on the enumeration. In order, to reduce the number of values the average varies with, I did the following:

Split $A$ into a sequence of finite subsets $F_1,F_2,...$ such that $F_1\subset F_2\subset...$ and $\bigcup\limits_{n=1}^{\infty}F_n=A$. The sequence of subsets will be denoted as a "structure of $A$" and a natural example (such as for $A=\left\{\frac{1}{m}:m\in\mathbb{N}\right\}$) would be $$\left\{F_n\right\}=\left\{\frac{1}{m}:m\in\mathbb{N},m\le n\right\}$$ Following this, I want a structure that has an average and the most equidistribution as defined below.** Section 1: Definition Of Average From Structure **

There exists an average or $\text{avg}\left(\left\{F_n\right\},f\right)$, if for every arbitrarily small positive $\epsilon$ there exists a sufficiently large integer $N$ such for all $n\ge N$. $$\left|\frac{1}{|F_n|}\sum\limits_{x\in F_{n}}f(x)-\text{avg}\left(\left\{F_n\right\},f\right)\right|\le \epsilon$$

Depending on the structure we choose, we may get a different average. However, for an intuitive average, we want a structure with most equidistribution as shown below:**Section 2: Steps To Determining The Structure With The Most Equidistriubtion****1.** Arrange the values in $F_n$ from least to greatest and take the difference between consecutive elements. Call this $\Delta F_n$. (Note $\Delta F_n$ *could* be a multi-set.)**Ex 1.1:** If $A=\left\{\frac{1}{m}:m\in\mathbb{N}\right\}$ and $\left\{F_n\right\}=\left\{\frac{1}{m}:m\in\mathbb{N},m\le n\right\}$ then $F_4=\left\{1,1/2,1/3,1/4\right\}$, and $\Delta F_4=\left\{1-1/2,1/2-1/3,1/3-1/4\right\}=\left\{1/2,1/6,1/12\right\}$. (While this is not a multi-set, the example below is) **Ex 1.2:** If $A=\mathbb{Q}\cap[0,1]$ and $\left\{F_n\right\}=\left\{\frac{j}{k}:j,k\in\mathbb{N},0\le j\le k, k\le n\right\}$ then the elements of $F_4$, arranged from least to greatest is, $F_4=\left\{0,1/4,1/3,1/2,2/3,3/4,1\right\}$ and $\Delta F_4=\left\{1/4-0,1/3-1/4,1/2-1/3,2/3-1/2,3/4-2/3,1-3/4\right\}=$ $\left\{1/4,1/12,1/6,1/6,1/12,1/4\right\}$. Here the elements repeat making this a multiset. **2.** Divide $\Delta F_n$ by the sum of all its elements so we get a distribution where all the elements sum to 1. Call this $\Delta F_n/\sum\limits_{x\in\Delta F_n}x$ or

**"**

*the information probability of the structure*

**"**

**Ex 2.1:**From example 1.1 note $\sum\limits_{x\in\Delta F_3}x=1/2+1/6+1/12=3/4$ and $\Delta F_3/\sum\limits_{x\in\Delta F_3}x=4/3\cdot \Delta F_3=\left\{2/3,2/9,1/9\right\}$. Note the elements in this set sum to $1$ and can hence act as a probability distribution (even though the elements are not acutal probabilties)

**We now want to calculate the**

3.

3.

*deviation*of the information probability from being a discrete uniform distribution to find the structure with the most equidistribution. (The smaller the

*deviation*the greater the structure's equidistribution.). Since elements of the multiset $\Delta F_n/\sum\limits_{x\in\Delta F_n}x$ always sum to $1$ we can calculate the

*deviation*using Entropy (from Information Theory). This can be written as

$$E(F_n)=-\sum\limits_{j\in\Delta F_n/\sum\limits_{x\in \Delta F_n}x}j\log j$$

**Ex 3.1:**From example 2.1, $E(F_3)$ is the same as $-\sum\limits_{j\in\left\{2/3,2/9,1/9\right\}}j\log j=-\left(2/3\log \left(2/3\right)+2/9\log \left(2/9\right)+1/9\log\left( 1/9\right)\right)\approx .369$

The structure whose information probability gives the highest entropy has the greatest equidistribution. However, with this comes problems.

**1.**In order for an average to exist, some structures add more than one element as $n$ increases by one. This means when comparing different structures with a defined average, it is hard to trust which structure has the most equidistribution unless the structures have the same number of elements.

In order to fix this suppose $\mathbb{S}(A)$ represents all structures of $A$ with a defined average from function $f$ (see

**Section 1**).

If $s,n,j\in\mathbb{N}$, I wish to find all $\left\{G_s\right\}\in \mathbb{S}(A)$ where (for each one):

$\forall\left(\left\{F_n\right\}\in\mathbb{S}(A)\right)\exists\left(s\in\mathbb{N}\right):\forall\left(j\ge s\right)\exists\left(n\in\mathbb{N}\right)\left(E(F_n)\le E(G_j)\Rightarrow|G_j|\le|F_n|\right)$.

We will call the statement above this oneNote structures $G_j$ and $F_n$ are equivelant when for $\left(\forall n\right)\left(\forall j\right) \left(\bigg[ \left|G_n\right|\le\left|F_j\right| \Rightarrow G_n\subseteq F_j \bigg] \lor \bigg[\left|F_j\right|\le\left|G_n\right|\Rightarrow F_j\subseteq G_n \bigg]\right)$

*Statement A*(not to be confused with set $A$).

Here is an example below:

Ex: I believe for $A=\left\{\frac{1}{m}:m\in\mathbb{N}\right\}$ and $$f(x)=\begin{cases} 2 & x\in\left\{\frac{1}{2m+1}:m\in\mathbb{N}\right\} \\ 1 & x\in\left\{\frac{1}{2m}:m\in\mathbb{N}\right\} \end{cases}$$ the only $G_s$ that satisfies

*Statement A*is $\left\{1/\left[2^s/m\right]:1\le m\le 2^s\right\}$ (where $\left[\cdot\right]$ is the nearest integer function) but I'm not sure how to prove this? (If I'm not correct, I need to adjust Statement $A$.)

**Questions**

1) For $A=\left\{\frac{1}{m}:m\in\mathbb{N}\right\}$ and the $f$ shown above, is the assumption that the only $G_s$ that satisfies

*Statement A*is $\left\{1/\left[2^s/m\right]:1\le m\le 2^s\right\}$, correct? If not, how do we correct

*Statement A*so there is only one $G_s$?

2) For question 1), is there a simpler way to calculate the structure with the most equidistribution that gives the same result as my assumption for $G_s$ without using Entropy?

3) How do we determine $G_s$ for other types of $A$ and $f$?

- closed
- 415 views
- $100.00

### Related Questions

- Insurance question involving net premium
- Find Mean, Variance, and Distribution of Recursively Defined Sequence of Random Variables
- Describing Bayes Theorem succinctly in words
- Card riffle shuffling
- The derivation of the formula for variance for a Pareto Distribution
- Pdf/cdf Probability
- Need Upper Bound of an Integral
- [Real Analysis] Show that $B$ is countable.

When you say: "In order for an average to exist, some structures add more than one element as t increases by one", I believe you mean "as n increases by one". where n is the index for the sets in the structure.

@Jip333 You’re right. I made the edits.

Correct me if I am misunderstanding, but it appears that G_s is not a structure on A. e.g. G_1 = {1, 2}, and 2 is not an element of A.

@Pxsort I thought G1={1,1/2} since G1={1/[2/1],1/[2/2]} where [] is the nearest integer function.

Oh I see.

You seem to be implicitly assuming a lot of properties about A. Really, A cannot be an arbitrary countable set, since its elements are totally ordered and can be added / subtracted / multiplied / divided. Effectively, this requires that A be a countable subset of the real numbers.

@Pxsort I made edits

Why do you keep asking the same question? What is it that motivates you?

@AlessandroIcari I thought if my question is clearer I would get a different response. Now my question is more mathematical. In most cases such an average would not exist but I do believe for the rationals and nowhere dense countably infinite sets one can find a unique G_s that has the most equidistribution.

I don't think the problem with your question is that it lacks mathematical precision, I think the problem with it is that there is no satisfactory solution. And I think I already showed you enough counterexamples and cases in which your problem can't be solved. Sometimes maths is like this, we don't always get what we hope for.

@AlexandroIsacri Can you apply your counterexample to the mathematics in my post. Show that a unique G_s cannot exist.

@AlexandroIscari I take that back, it’s clear for cases such as n*(-1)^n for integer n that an average cannot exist. But for other functions/their domain (such as the example in my post) if one or more averages exist (depending on the domains structure) does their exist a unique G_s?

I think the answer is "no" unless you add a lot of hypotheses. But we've already had a similar discussion, here the problem is essentially the same, the differences are not relevant enough to change the answer.