misc comments and suggestions #16

Open
wants to merge 1 commit into base: main
14 changes: 8 additions & 6 deletions blog/2023-10-14 Intuition for Shannon Entropy.md
@@ -21,7 +21,7 @@ Definition of [Shannon Entropy](https://en.wikipedia.org/wiki/Entropy_(informati
* **I will also propose alternative *summand functions*** which achieve the same outcome

### Alternative Definitions of the *summand function*:
* We can define any generic function of p, say *f(p)* as long as the function is concave (negative second derivative)
* We can define the summand function as the product of p with any generic function of p, say *f(p)*, as long as the function *f(p)* is concave (negative second derivative)
* We can replace log(p) with any of the following and still get similar results
* p
* p^2 (or even p^n where n is any integer) **
@@ -32,27 +32,29 @@ Definition of [Shannon Entropy](https://en.wikipedia.org/wiki/Entropy_(informati

*\*\* Might also work for looser conditions*
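
The claim about alternative summand functions can be sanity-checked numerically. Below is a minimal Python sketch (the helper name `generalized_entropy` and the example distributions are my own illustration, not from the post) comparing the quantity -Σ p·f(p) for a uniform and a skewed distribution under a few of the listed choices of f:

```python
import math

def generalized_entropy(p, f):
    """-sum_i p_i * f(p_i); this is Shannon entropy when f = log."""
    return -sum(pi * f(pi) for pi in p if pi > 0)

uniform = [0.25] * 4
skewed = [0.7, 0.1, 0.1, 0.1]

# Try f(p) = log(p), f(p) = p, and f(p) = p^2 as suggested above
for name, f in [("log(p)", math.log), ("p", lambda p: p), ("p^2", lambda p: p * p)]:
    hu = generalized_entropy(uniform, f)
    hs = generalized_entropy(skewed, f)
    print(f"f(p) = {name:6s}  uniform: {hu:.4f}  skewed: {hs:.4f}")
    assert hu > hs  # the uniform distribution scores highest in each case
```

In each case the uniform distribution comes out on top, consistent with the point that log(p) is not uniquely special here.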

Side-note: Interestingly, entropy production in physical processes is force times flux, and the flux is often itself proportional to the force (e.g. heat transfer rates are proportional to temperature differences), so you get an expression that looks like deltaT times f(deltaT), where f(deltaT) is monotonically increasing. So you end up with the same mathematical form as p f(p). I'm not sure if that's really related to probabilities of states, but it probably is.

## 2. Behaviour of the summand expression
This is the key part you need to understand.

1. What you want to do is plot the *summand function*
2. Then you'll see that you want a concave *summand function*
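
One way to do steps 1 and 2 without plotting software is to sample the summand function on a grid and check that its discrete second derivative is negative everywhere. A small Python sketch, taking the summand to be Shannon's -p·log(p) (the grid spacing is an arbitrary choice of mine):

```python
import math

def summand(p):
    # Shannon's summand -p * log(p), with the convention 0 * log(0) = 0
    return 0.0 if p == 0 else -p * math.log(p)

h = 0.01
ps = [i * h for i in range(1, 100)]  # p in (0, 1)

# Discrete second derivative: f(p+h) - 2 f(p) + f(p-h)
second_diffs = [summand(p + h) - 2 * summand(p) + summand(p - h) for p in ps[1:-1]]
assert all(d < 0 for d in second_diffs)  # negative everywhere, i.e. concave

print("summand peaks near p =", max(ps, key=summand))  # close to 1/e
```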

Why?
* Want the `p_i == p_j` etc. solution to have the maximum entropy
* That is, the maximum entropy should happen when you assume you know nothing, therefore you cannot reasonably say that any index of p has a greater value than any other
* Want the `p_i == p_j` etc. solution to have the maximum entropy, i.e. all states are equally likely.
* That is, the maximum entropy should happen if there is equilibrium and you have no further information
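
The claim that the equal-probability solution has maximum entropy can be checked by comparing the uniform distribution against randomly drawn ones. A rough Python sketch (the choice of n = 4 states and 1000 random trials is arbitrary on my part):

```python
import math
import random

def entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

random.seed(0)
n = 4
uniform_h = entropy([1 / n] * n)  # equals log(n), the maximum possible

for _ in range(1000):
    w = [random.random() for _ in range(n)]
    p = [wi / sum(w) for wi in w]  # a random distribution over n states
    assert entropy(p) <= uniform_h + 1e-12  # never beats the uniform case

print(f"max entropy = log({n}) = {uniform_h:.4f}")
```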

What next
* You start applying constraints to the values of p you want to accept
* This restricts p and then from here you can look and see what available solution has the max entropy
* You start applying constraints to the values of p you want to accept [suggestion: consider whether these are constraints on the values of p, or, perhaps, aggregate observations (e.g. an average)]
* This restricts p and then from here you can look and see what available solution has the max entropy [side-note: the 2nd law of thermodynamics says that entropy in a closed system can only increase, which is the rationale for why equilibrium eventually reaches max entropy, at which point there is equipartition among states]
* And within the available solutions you want to continue this logic of *"penalising discrimination"*
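
One way to see the constrained picture is a toy maximum-entropy problem where the constraint is an aggregate observation (a fixed average), in the spirit of the bracketed suggestion above. A rough Python sketch; the specific states and target mean are made up for illustration:

```python
import math

def entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

# Constraint: states are 1, 2, 3 and the observed average must equal 2.5.
# With p1 + p2 + p3 = 1 and p1 + 2*p2 + 3*p3 = 2.5, choosing p1 fixes the rest:
#   p3 = p1 + 0.5,  p2 = 0.5 - 2*p1,  valid while all three stay in [0, 1].
best_p1, best_h = 0.0, -1.0
steps = 10_000
for i in range(steps + 1):
    p1 = 0.25 * i / steps  # p1 in [0, 0.25] keeps p2 >= 0
    p = [p1, 0.5 - 2 * p1, p1 + 0.5]
    h = entropy(p)
    if h > best_h:
        best_p1, best_h = p1, h

# The grid-search winner approximates the analytic max-entropy solution,
# which (for this kind of mean constraint) lies in the exponential family
print(f"max-entropy solution: p1={best_p1:.3f}, "
      f"p2={0.5 - 2 * best_p1:.3f}, p3={best_p1 + 0.5:.3f}")
```

Note how the constraint rules out the uniform distribution (its mean is 2, not 2.5), so the max-entropy solution is the *least discriminating* distribution still compatible with the observation.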

Roughwork comment
* I'm going to be hand-wavy here for now

Conclusion
* Basically, if your *summand function* is concave, then the average of two different points on the function is *"lower"* than the value of the function if you *"move"* the p_heads and p_tails values to equal one another
* More generally this concavity uses *Jensen's Inequality* to prioritise solutions which keep p_i and p_j apart from eachother
* More generally this concavity uses *Jensen's Inequality* to penalise solutions which keep p_i and p_j apart from each other
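
The Jensen's-inequality point can be illustrated with the coin example: summing the concave summand at two unequal probabilities gives less total entropy than evaluating it after moving both to their midpoint. A small Python sketch:

```python
import math

def summand(p):
    # the concave summand -p * log(p)
    return 0.0 if p == 0 else -p * math.log(p)

# A biased coin vs. the fair coin obtained by equalising the two probabilities
p_heads, p_tails = 0.9, 0.1
biased = summand(p_heads) + summand(p_tails)
fair = 2 * summand((p_heads + p_tails) / 2)  # both "moved" to 0.5

print(f"biased coin entropy: {biased:.4f}, fair coin entropy: {fair:.4f}")
assert fair > biased  # concavity penalises keeping the two values apart
```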

Application
* So as an application, you add constraints which resemble the problem you want to solve