We consider the following formula for the mean of a discrete random variable:
$$\mu = \sum x P(x)$$
To understand why this formula makes sense, it helps to understand what the mean tells us about a random variable. Colloquially, we often use the general term "average" to refer to the mean, which is a good starting point.
When we calculate the average of a set of numbers, we add them up and divide by the total count. So if our set of numbers is $[1, 2, 3, 4]$, then the average is:
$$\frac{1 + 2 + 3 + 4}{4} = 2.5$$
In some sense this gives us the "typical" value of our set of numbers. Whenever we have a set of numbers, we can add them up, divide by the count, and get the average. Suppose we have a list of 10 numbers: 3 of them are $2$s and the remaining 7 are $4$s. We can write the average of this list as:
$$\frac{2 + 2 + 2 + 4 + 4 + 4 + 4 + 4 + 4 + 4}{10} = 3.4$$
Another equivalent way to write this is:
$$\frac{3\times 2 + 7\times 4 }{10} = \frac{3}{10}\times 2 + \frac{7}{10}\times 4 = 3.4$$
Instead of adding up each number and then dividing, we can multiply $2$ by $3/10$ and $4$ by $7/10$, then add the results. Each value gets multiplied by the fraction of the list it makes up; in other words, it gets weighted by the relative frequency of its occurrence. This is called a weighted average. The mean of a discrete random variable is very closely related to this concept of a weighted average.
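To check that the two ways of computing the average really agree, here is a minimal Python sketch using the list of $2$s and $4$s from above. The variable names are illustrative only, and exact fractions are used so that no rounding gets in the way.

```python
from collections import Counter
from fractions import Fraction

values = [2, 2, 2, 4, 4, 4, 4, 4, 4, 4]

# Plain average: add everything up, then divide by the count.
plain_average = Fraction(sum(values), len(values))

# Weighted average: each distinct value times its relative frequency.
counts = Counter(values)  # how many times each distinct value appears
weighted_average = sum(
    Fraction(count, len(values)) * value for value, count in counts.items()
)

print(float(plain_average), float(weighted_average))  # 3.4 3.4
```

Both routes give $3.4$, because the weighted form is just the plain sum with equal terms grouped together.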
So let's turn our attention back to a random variable, $X$. One way to think of a random variable is that it generates random numbers. A simple example is a 6-sided die. Each roll of a 6-sided die gives a random number from 1 to 6, each one equally likely. So for example, if we roll that die 10 times, we might get something like $[2, 5, 3, 2, 4, 6, 4, 2, 6, 3]$ or something like $[3, 5, 5, 1, 3, 2, 2, 2, 3, 6]$.
The average of the first set of numbers is 3.7, and the average of the second is 3.2. The mean is closely related to taking averages like this. Let's think about taking averages of longer and longer sequences of random numbers, say 20 rolls of the die: $[5, 6, 2, 3, 3, 4, 5, 6, 1, 4, 1, 4, 6, 3, 4, 2, 2, 1, 4, 3]$, which has an average of 3.45. As the sequence gets longer and longer, the average of our numbers settles toward a fixed value (a fact known as the law of large numbers), and that limiting value is the mean of the discrete random variable.
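This settling-down behaviour can be observed with a small simulation. The following is a sketch only: the helper `average_of_rolls` is a hypothetical name introduced here, and the die is simulated with Python's standard `random` module rather than rolled by hand. The printed averages depend on the seed, so no particular values are claimed.

```python
import random


def average_of_rolls(num_rolls, rng):
    """Roll a fair 6-sided die num_rolls times and return the average."""
    rolls = [rng.randint(1, 6) for _ in range(num_rolls)]  # 1..6 inclusive
    return sum(rolls) / num_rolls


rng = random.Random(0)  # fixed seed so repeated runs give the same rolls
for num_rolls in (10, 100, 10_000, 100_000):
    print(num_rolls, average_of_rolls(num_rolls, rng))
```

With only 10 rolls the average can land fairly far from the long-run value, while with many thousands of rolls it typically sits very close to it.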
Remember when we said that the average can be found by taking a _weighted average_, where we add up each value times the relative frequency with which it occurs in our list? We can do the same thing here. We need to add up the values $1$ through $6$, each multiplied by its relative frequency in the list.
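What is this frequency? In a very long sequence of rolls, the fraction of rolls that show a given face approaches the probability of that face, so the weight is simply the probability. In this problem, each face of the die has a probability of $1/6$, so our average is: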
$$\frac{1}{6} \times 1 + \frac{1}{6} \times 2 + \frac{1}{6} \times 3 + \frac{1}{6} \times 4 + \frac{1}{6} \times 5 + \frac{1}{6} \times 6 = 3.5$$
Close inspection will show that the expression above is exactly our original formula:
$$\mu = \sum x P(x)$$
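The formula translates almost directly into code. Below is a minimal sketch, assuming the distribution is given as a dictionary that maps each value $x$ to its probability $P(x)$. The function name `discrete_mean` and the dictionary layout are choices made for this illustration, not part of any standard library.

```python
from fractions import Fraction


def discrete_mean(pmf):
    """Return the sum of x * P(x) over a dict mapping x -> P(x)."""
    # Exact comparison is safe with Fractions; use a tolerance with floats.
    if sum(pmf.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in pmf.items())


# Fair 6-sided die: every face has probability 1/6.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}
print(discrete_mean(fair_die))  # 7/2

# The earlier list of 2s and 4s, viewed as a distribution.
twos_and_fours = {2: Fraction(3, 10), 4: Fraction(7, 10)}
print(discrete_mean(twos_and_fours))  # 17/5
```

The results $7/2 = 3.5$ and $17/5 = 3.4$ match the hand calculations above.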
**In summary, the mean of a discrete random variable is the same as the weighted average of the values that random variable can take. The weights are given by the probability of each outcome. These facts yield the common sum form of the mean of a discrete random variable.**