@@ -8,12 +8,176 @@ layout: post
88
99## 2.1 Entropy
1010
11- long contents .....
11+ > **Surprisal**
12+ > The degree of surprise felt when an event occurs.
13+
14+ $$
15+ S(x) = -\log_2 p_X(x)
16+ $$
17+
18+ - The rarer an event, the smaller $p_X(x)$ is and the larger its surprisal.
19+ - The unit is **bits**, since the logarithm is base 2.
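As a quick numerical illustration of the definition above, here is a minimal Python sketch. The function name `surprisal` and the sample probabilities are illustrative assumptions, not part of the original notes.

```python
import math

def surprisal(p: float) -> float:
    """Surprisal in bits of an event with probability p, for 0 < p <= 1."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return -math.log2(p)

# Rarer events carry more surprisal.
for p in (0.5, 0.25, 0.01):
    print(f"p = {p}: {surprisal(p):.4f} bits")
```

Halving the probability adds exactly one bit of surprisal, which is why base 2 is a convenient choice.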
20+
21+ ---
22+
23+ **Example 2**
24+ $X$: an indicator of whether a large earthquake occurs
25+
26+ - $X = 1$: a large earthquake occurs
27+ - $X = 0$: no large earthquake
28+
29+ If $p_X(1)$ is very small, then $X=1$ is a very surprising event.
30+
31+ ---
32+
33+ > **Entropy**
34+ > The average surprisal of the random variable $X$, i.e. the expected value of $S(X)$.
35+
36+ $$
37+ H(X) = E[-\log_2 p_X(X)]
38+ = \sum_{x \in \mathcal{X}} p_X(x) \log_2 \frac{1}{p_X(x)}
39+ $$
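The sum above translates almost directly into code. The following is a minimal sketch, assuming the distribution is passed as a plain list of probabilities; the name `entropy` and the tolerance in the sanity check are illustrative choices.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution given as probabilities."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Zero-probability outcomes are skipped (convention: 0 * log2(1/0) = 0).
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))    # fair coin
print(entropy([0.25] * 4))    # uniform over four outcomes
```

Each term is a probability times the corresponding surprisal, so the function is just the expectation written out.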
40+
41+ ---
42+
43+ **Example 4**
44+ Binary random variable, $p(1) = 0.01$, $p(0) = 0.99$
45+
46+ $$
47+ H(X) = 0.01\log_2\frac{1}{0.01} + 0.99\log_2\frac{1}{0.99}
48+ \approx 0.080793\ \text{bits}
49+ $$
50+
51+ ---
52+
53+ **Example 5**
54+ Uniform distribution on a fair six-sided die $(p = 1/6)$
55+
56+ $$
57+ H(X) = \sum_{i=1}^{6} \frac{1}{6}\log_2\frac{1}{1/6}
58+ = \log_2 6
59+ \approx 2.584963\ \text{bits}
60+ $$
61+
62+ ---
63+
64+ **Example 6**
65+ $X \sim \text{Bern}(1/2)$, and $Y$ binary with probabilities $(0.99, 0.01)$
66+
67+ $$
68+ H(X) = -\frac12\log_2\frac12 - \frac12\log_2\frac12 = 1\ \text{bit}
69+ $$
70+
71+ $$
72+ H(Y) = -0.99\log_2(0.99) - 0.01\log_2(0.01) \approx 0.080793\ \text{bits}
73+ $$
74+
75+ $\Rightarrow\ H(X) > H(Y)$
76+
77+ ---
78+
79+ **Example 7** (Guessing game)
80+
81+ Q1: $P(\text{Yes}) = 1/2$, Q2: $P(\text{Yes}) = 1/4$
82+
83+ $$
84+ H(\text{Q1}) = -\frac12\log_2\frac12 - \frac12\log_2\frac12 = 1\ \text{bit}
85+ $$
86+
87+ $$
88+ H(\text{Q2}) = -\frac14\log_2\frac14 - \frac34\log_2\frac34
89+ = \frac14 \cdot 2 + \frac34\log_2\frac{4}{3}
90+ \approx 0.811278\ \text{bits}
91+ $$
92+
93+ $\Rightarrow\ \text{Q1 is more informative}$
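Both questions are yes/no, so their entropy depends only on $P(\text{Yes})$. A small sketch of that binary entropy function follows; the name `binary_entropy` is an illustrative assumption.

```python
import math

def binary_entropy(p: float) -> float:
    """Entropy in bits of a yes/no question answered 'yes' with probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if p in (0.0, 1.0):
        return 0.0  # a certain answer carries no information
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

for p in (0.5, 0.25, 0.01):
    print(f"P(Yes) = {p}: {binary_entropy(p):.6f} bits")
```

The function peaks at $p = 1/2$, which is why an evenly split question is the most informative one to ask.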
94+
95+ ---
96+
97+ **Example 8**
98+
99+ $$
100+ D_1: \left(\frac19,\frac19,\frac19,\frac29,\frac29,\frac29\right), \quad
101+ D_2: \left(\frac12,\frac14,\frac18,\frac{1}{16},\frac{1}{32},\frac{1}{32}\right)
102+ $$
103+
104+ $$
105+ H(D_1) = 3\cdot\frac{1}{9}\log_2 9 + 3\cdot\frac{2}{9}\log_2\frac{9}{2}
106+ \approx 2.503258\ \text{bits}
107+ $$
108+
109+ $$
110+ H(D_2) = \frac12\cdot 1 + \frac14\cdot 2 + \frac18\cdot 3 + \frac{1}{16}\cdot 4 + 2\cdot\frac{1}{32}\cdot 5
111+ = 1.9375\ \text{bits}
112+ $$
113+
114+ $\Rightarrow\ H(D_1) > H(D_2)$
115+
116+ ---
117+
118+ **Example 9**
119+
120+ $$
121+ x_1: (0.6,0.1,0.1,0.1,0.1), \quad x_2: (0.4,0.4,0.2,0,0)
122+ $$
123+
124+ $$
125+ H(x_1) = 0.6\log_2\frac{1}{0.6} + 4\cdot 0.1\log_2\frac{1}{0.1}
126+ \approx 1.770951\ \text{bits}
127+ $$
128+
129+ $$
130+ H(x_2) = 0.4\log_2\frac{1}{0.4} + 0.4\log_2\frac{1}{0.4} + 0.2\log_2\frac{1}{0.2}
131+ \approx 1.521928\ \text{bits}
132+ $$
133+
134+ $\Rightarrow\ H(x_1) > H(x_2)$ ($x_1$ has more uncertainty)
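Note that the two zero-probability outcomes of $x_2$ do not appear in the sum: by the usual convention $0 \cdot \log_2 \frac{1}{0} = 0$, they contribute nothing. The sketch below checks both values numerically; `entropy_bits` is a hypothetical helper name.

```python
import math

def entropy_bits(probs):
    # Terms with p == 0 are dropped, following the convention 0 * log2(1/0) = 0.
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

x1 = [0.6, 0.1, 0.1, 0.1, 0.1]
x2 = [0.4, 0.4, 0.2, 0.0, 0.0]

h1, h2 = entropy_bits(x1), entropy_bits(x2)
print(f"H(x1) = {h1:.6f} bits")
print(f"H(x2) = {h2:.6f} bits")

# Dropping the zero entries leaves the entropy unchanged.
print(entropy_bits(x2) == entropy_bits([0.4, 0.4, 0.2]))
```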
135+
136+ ---
137+
138+ **Example 10**
139+
140+ $$
141+ S_1: (0.6,0.1,0.1,0.1,0.1), \quad
142+ S_2: (0.4,0.4,0.2,0,0)
143+ $$
144+
145+ $$
146+ H(S_1) = 0.6\log_2\frac{1}{0.6} + 4\cdot 0.1\log_2\frac{1}{0.1}
147+ \approx 1.770951\ \text{bits}
148+ $$
149+
150+ $$
151+ H(S_2) = 0.4\log_2\frac{1}{0.4} + 0.4\log_2\frac{1}{0.4} + 0.2\log_2\frac{1}{0.2}
152+ \approx 1.521928\ \text{bits}
153+ $$
154+
155+ $\Rightarrow\ H(S_1) > H(S_2)$ ($S_1$ is more explorative)
156+
157+ ---
158+
159+ **Example 11**
160+
161+ Geometric distribution with $p = \frac12$ and support $x \ge 1$, so
162+ $p_X(x) = 2^{-x}$.
163+
164+ $$
165+ H(X) = \sum_{x=1}^{\infty} 2^{-x} \log_2\frac{1}{2^{-x}}
166+ = \sum_{x=1}^{\infty} 2^{-x} \cdot x
167+ = \sum_{x=1}^{\infty} x\left(\frac12\right)^{x}
168+ $$
169+
170+ $$
171+ G(r) = \sum_{x=1}^{\infty} x r^{x} = \frac{r}{(1-r)^2}, \quad |r|<1
172+ $$
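For reference, one standard way to obtain this closed form is to differentiate the geometric series term by term (valid for $|r|<1$) and then multiply by $r$:

$$
\sum_{x=0}^{\infty} r^{x} = \frac{1}{1-r}
\ \Longrightarrow\
\sum_{x=1}^{\infty} x r^{x-1} = \frac{1}{(1-r)^2}
\ \Longrightarrow\
\sum_{x=1}^{\infty} x r^{x} = \frac{r}{(1-r)^2}
$$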
173+
174+ $$
175+ H(X) = G\!\left(\frac12\right)
176+ = \frac{\frac12}{\left(1-\frac12\right)^2}
177+ = 2\ \text{bits}
178+ $$
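The closed form can be sanity-checked by truncating the series. This is a minimal sketch; `partial_entropy` and the chosen truncation points are illustrative assumptions.

```python
def partial_entropy(n_terms: int) -> float:
    """Sum of x * 2**(-x) for x = 1..n_terms, i.e. a truncated H(X) in bits."""
    return sum(x * 2.0 ** (-x) for x in range(1, n_terms + 1))

# The partial sums approach 2 bits from below as more terms are included.
for n in (5, 10, 20, 50):
    print(n, partial_entropy(n))
```

The gap to the limit after $n$ terms is $(n+2)/2^{n}$, so convergence is fast.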
179+
12180
13- 1 . a
14- 2 . b
15- 3 . c
16- 4 . d
17181
18182## 2.2 Properties of Entropy
19183