Commit 7fdc741

zgk2003 and claude committed
Refactor study_013 evaluator to use OLS regressions and fix D8 coding
Major changes from code review:
- Replace Pearson correlations with OLS regressions for F2-F5, matching the paper's Table 4 analysis (simple OLS for F2, multiple OLS with controls for F3-F5)
- Fix F5 finding: change from overconfidence→OE to small_numbers→OE (β=1.17, t=1.91, p<.06) matching Model 2 in Table 4
- Move shared scoring logic (parse_question_responses, compute_construct_scores, code_belief_in_small_numbers, iter_response_records) into study_utils.py
- Fix D8 open-ended coding: "no additional information needed" now correctly codes as +1 (substantive belief in small numbers), not treated as missing
- Make D8 prompt optional with de-biased framing to avoid inflating statistical-reasoning responses
- Tighten demographics to Table 2 only (sex, race, education, business_size)
- Fix compute_participant_scores → compute_construct_scores naming mismatch
- Update ground_truth.json with reported_coefficient and reported_t_value fields for regression findings

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
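The switch from Pearson correlations to OLS described in the commit message can be illustrated with a minimal sketch. This is not the evaluator's code, which is not shown on this page: `ols_fit` is a hypothetical helper name and the data below is synthetic. It only shows how a coefficient and its t value, the two quantities stored as `reported_coefficient` and `reported_t_value`, could be obtained for one predictor or for several (a focal predictor plus controls).

```python
import numpy as np


def ols_fit(y, X):
    """Ordinary least squares with an intercept (hypothetical helper).

    Returns (coefficients, t_values); index 0 is the intercept and the
    remaining entries follow the column order of X.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X.reshape(-1, 1)
    design = np.column_stack([np.ones(len(y)), X])
    n, k = design.shape
    if n <= k:
        raise ValueError("need more observations than parameters")
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    sigma2 = residuals @ residuals / (n - k)  # unbiased residual variance
    cov = sigma2 * np.linalg.inv(design.T @ design)
    t_values = beta / np.sqrt(np.diag(cov))
    return beta, t_values


# Synthetic illustration only: these numbers are not the study's data.
rng = np.random.default_rng(0)
risk_perception = rng.normal(16, 4, size=77)
opportunity_evaluation = 20 - 0.5 * risk_perception + rng.normal(0, 2, size=77)
beta, t_values = ols_fit(opportunity_evaluation, risk_perception)
```

With a single predictor, the slope's t test is equivalent to the test of the Pearson correlation, so the change matters most for F3-F5, where control variables enter the model.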
1 parent e318fe5 commit 7fdc741

9 files changed

Lines changed: 514 additions & 348 deletions

studies/study_013/README.md

Lines changed: 6 additions & 6 deletions
@@ -8,7 +8,7 @@
 
 ## Description
 
-This study examines how cognitive biases affect entrepreneurs' opportunity evaluation under risky conditions. Using a survey of 77 founders of top SMEs in Singapore, the study measures four cognitive biases (overconfidence, illusion of control, belief in the law of small numbers, and planning fallacy) and tests how they influence risk perception and opportunity evaluation of a standardized business case vignette.
+This study examines how cognitive biases affect entrepreneurs' opportunity evaluation under risky conditions. Using a survey of 77 founders of top SMEs in Singapore, the study measures four cognitive biases (overconfidence, illusion of control, belief in the law of small numbers, and planning fallacy) and tests how they influence risk perception and opportunity evaluation of a standardized business case vignette. The benchmark implementation focuses on the paper's calibration test and its reported regression findings.
 
 ## Participants

@@ -22,17 +22,17 @@ This study examines how cognitive biases affect entrepreneurs' opportunity evalu
 | Finding | Hypothesis | Human Result |
 |---------|-----------|--------------|
 | F1 | Entrepreneurs are overconfident (mean items outside 90% CI > 1) | Mean = 5.17, SD = 2.64 |
-| F2 | Risk perception negatively associated with opportunity evaluation | r = -.58, p < .01 |
-| F3 | Illusion of control negatively associated with risk perception | r = -.44, p < .01 |
-| F4 | Illusion of control positively associated with opportunity evaluation | r = .34, p < .01 |
-| F5 | Overconfidence positively associated with opportunity evaluation | r = .30, p < .05 |
+| F2 | Risk perception negatively predicts opportunity evaluation (H1) | beta = -0.50, t = -5.98, p < .001 |
+| F3 | Illusion of control negatively predicts risk perception in Model 1 (H5) | beta = -0.76, t = -3.34, p < .01 |
+| F4 | Illusion of control positively predicts opportunity evaluation in Model 2 | beta = 0.40, t = 2.23, p < .05 |
+| F5 | Belief in the law of small numbers positively predicts opportunity evaluation in Model 2 | beta = 1.17, t = 1.91, p < .06 |
 
 ## Questionnaire Structure
 
 - **Section A:** 5 forced-choice gamble items (risk propensity)
 - **Section B:** 7 Likert items (2 filler, 2 planning fallacy, 3 illusion of control)
 - **Section C:** 10 confidence-interval estimation items (overconfidence)
-- **Section D:** Business case vignette + 4 risk perception items + 3 opportunity evaluation items + 1 open-ended item
+- **Section D:** Business case vignette + 4 risk perception items + 3 opportunity evaluation items + 1 optional open-ended item coded for belief in the law of small numbers
 
 ## File Structure

studies/study_013/index.json

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@
     "Boon Chong Lim"
   ],
   "year": 2002,
-  "description": "This study examines how cognitive biases affect entrepreneurs' opportunity evaluation under risky conditions. Using a survey of 77 founders of top SMEs in Singapore, the study measures overconfidence, illusion of control, belief in the law of small numbers, and planning fallacy, and tests how these biases influence risk perception and opportunity evaluation of a standardized business vignette. Results show that illusion of control reduces risk perception, which in turn increases opportunity evaluation (full mediation), while belief in the law of small numbers directly increases opportunity evaluation. Entrepreneurs are found to be significantly overconfident in their knowledge calibration.",
+  "description": "This study examines how cognitive biases affect entrepreneurs' opportunity evaluation under risky conditions. Using a survey of 77 founders of top SMEs in Singapore, the study measures overconfidence, illusion of control, belief in the law of small numbers, and planning fallacy, then relates those constructs to risk perception and opportunity evaluation for a standardized business vignette. The benchmark reproduces the paper's calibration test and its reported regression findings: entrepreneurs are overconfident, risk perception negatively predicts opportunity evaluation, illusion of control lowers risk perception and increases opportunity evaluation before mediation, and belief in the law of small numbers increases opportunity evaluation in the pre-mediation model.",
   "contributors": [
     {
       "name": "Guankai Zhai",

studies/study_013/scripts/config.py

Lines changed: 73 additions & 133 deletions
@@ -1,29 +1,27 @@
-import json
-import re
 import numpy as np
-from scipy import stats
-from pathlib import Path
-from typing import Dict, Any, List, Optional
 
 import sys
 sys.path.insert(0, str(__import__("pathlib").Path(__file__).resolve().parent))
-from study_utils import BaseStudyConfig, PromptBuilder
+from study_utils import BaseStudyConfig, PromptBuilder, compute_construct_scores, iter_response_records
 
 import random
 
 
-# Demographics distributions from Table 2 for persona generation
-INDUSTRIES = [
-    "Retail", "Manufacturing", "Wholesale", "Construction",
-    "Transportation and Communication", "Financial", "Professional"
-]
-
 AGE_DISTRIBUTION = [
     (range(30, 40), 0.222), # Less than 40
     (range(40, 61), 0.715), # 40 to 60
     (range(61, 70), 0.063), # More than 60
 ]
 
+SEX_OPTIONS = ["male", "female"]
+SEX_WEIGHTS = [0.97, 0.03]
+
+RACE_OPTIONS = ["Chinese", "Indian", "Other"]
+RACE_WEIGHTS = [0.924, 0.045, 0.031]
+
+EDUCATION_OPTIONS = ["secondary", "postsecondary", "primary/other"]
+EDUCATION_WEIGHTS = [0.061, 0.864, 0.075]
+
 BUSINESS_SIZE_OPTIONS = [
     "Less than S$1m",
     "Between S$1m and S$25m",
@@ -45,29 +43,43 @@ def weighted_age_sample():
     return random.randint(40, 60)
 
 
+def weighted_choice(options, weights):
+    """Draw one option according to the reported sample proportions."""
+    return random.choices(options, weights=weights, k=1)[0]
+
+
 class CustomPromptBuilder(PromptBuilder):
     """Builds the full Keh, Foo & Lim (2002) questionnaire prompt."""
 
     def build_trial_prompt(self, trial_metadata):
-        profile = trial_metadata.get("profile", {})
+        profile = trial_metadata.get("profile") or trial_metadata.get("participant_profile", {})
         items_a = trial_metadata.get("items_a", [])
         items_b = trial_metadata.get("items_b", [])
         items_c = trial_metadata.get("items_c", [])
         items_d = trial_metadata.get("items_d", [])
         vignette_text = trial_metadata.get("vignette_text", "")
 
         lines = []
+        optional_question_numbers = []
 
         # --- Persona Introduction ---
-        age = profile.get("age", 48)
-        industry = profile.get("industry", "Manufacturing")
+        age = profile.get("age", 47)
+        sex = profile.get("sex", "male")
+        race = profile.get("race", "Chinese")
+        education = profile.get("education", "postsecondary")
         business_size = profile.get("business_size", "Between S$1m and S$25m")
-        years_exp = profile.get("years_experience", 15)
         founder = profile.get("is_founder", True)
 
         lines.append("You are participating in a research study on entrepreneurial decision-making.")
-        lines.append(f"Imagine you are a {age}-year-old entrepreneur in Singapore who {'founded' if founder else 'acquired'} a {industry.lower()} business (annual revenue: {business_size}). You have {years_exp} years of business experience.")
-        lines.append("Please answer all questions honestly based on your perspective as a business owner.\n")
+        lines.append(
+            "Answer as one of the Singapore SME founders/owners described in the original paper."
+        )
+        lines.append(
+            f"Imagine you are a {age}-year-old {sex} entrepreneur in Singapore, "
+            f"{race}, with {education} education, who {'founded' if founder else 'bought over'} "
+            f"the business you run (annual revenue: {business_size})."
+        )
+        lines.append("Please answer all questions honestly from that participant's perspective.\n")
 
         q_counter = 1

@@ -133,8 +145,11 @@ def build_trial_prompt(self, trial_metadata):
                 q_counter += 1
             elif item["type"] == "open_ended":
                 lines.append(f"Q{q_counter}: {item['question']}")
-                lines.append(f" (Answer Q{q_counter}=<your brief response>)\n")
+                lines.append(" Focus on the issues that actually drive your judgment from the case as written.")
+                lines.append(" Mention extra information only if you genuinely need it.")
+                lines.append(f" (Optional. Answer Q{q_counter}=<brief response>, write Q{q_counter}=No additional information needed, or omit Q{q_counter} to skip.)\n")
                 item["q_idx"] = q_counter
+                optional_question_numbers.append(q_counter)
                 q_counter += 1
 
         # --- Response format ---
@@ -143,7 +158,17 @@ def build_trial_prompt(self, trial_metadata):
         lines.append("=" * 60)
         lines.append("Output ONLY answer lines in the format: Qk=<value>")
         lines.append("One answer per line. Do not include explanations.")
-        lines.append(f"Expected number of answer lines: {q_counter - 1}")
+        if optional_question_numbers:
+            optional_labels = ", ".join(f"Q{idx}" for idx in optional_question_numbers)
+            required_answers = (q_counter - 1) - len(optional_question_numbers)
+            lines.append(f"All numbered items except {optional_labels} are required.")
+            lines.append(
+                f"For {optional_labels}, respond with the issues influencing your judgment, "
+                "or state that no additional information is needed."
+            )
+            lines.append(f"Expected number of answer lines: {required_answers} to {q_counter - 1}")
+        else:
+            lines.append(f"Expected number of answer lines: {q_counter - 1}")
 
         return "\n".join(lines)

@@ -168,18 +193,20 @@ def create_trials(self, n_trials=None):
 
         trials = []
         for i in range(n):
-            # Generate randomized entrepreneur profile from Table 2 demographics
+            # Generate entrepreneur profiles only from demographics reported in Table 2.
             age = weighted_age_sample()
-            industry = random.choice(INDUSTRIES)
+            sex = weighted_choice(SEX_OPTIONS, SEX_WEIGHTS)
+            race = weighted_choice(RACE_OPTIONS, RACE_WEIGHTS)
+            education = weighted_choice(EDUCATION_OPTIONS, EDUCATION_WEIGHTS)
             business_size = random.choices(BUSINESS_SIZE_OPTIONS, weights=BUSINESS_SIZE_WEIGHTS, k=1)[0]
-            years_exp = max(3, age - random.randint(22, 30))
             is_founder = random.random() < 0.79
 
             profile = {
                 "age": age,
-                "industry": industry,
+                "sex": sex,
+                "race": race,
+                "education": education,
                 "business_size": business_size,
-                "years_experience": years_exp,
                 "is_founder": is_founder,
             }

@@ -205,124 +232,37 @@ def aggregate_results(self, raw_results):
         """Parse Qk=value responses and compute per-participant construct scores."""
         participants = []
 
-        for record in raw_results.get("individual_data", []):
-            trial_info = record.get("trial_info", {})
-            response_text = record.get("response_text", "")
-
-            # Parse Qk=Value
-            responses = {}
-            for line in response_text.split("\n"):
-                match = re.match(r"Q(\d+)\s*[:=]\s*(.+)", line.strip())
-                if match:
-                    q_num = int(match.group(1))
-                    responses[q_num] = match.group(2).strip()
-
-            items_a = trial_info.get("items_a", [])
-            items_b = trial_info.get("items_b", [])
-            items_c = trial_info.get("items_c", [])
-            items_d = trial_info.get("items_d", [])
-
-            # --- Risk Propensity: count of risky choices (0-5) ---
-            risk_propensity = 0
-            for item in items_a:
-                q_idx = item.get("q_idx")
-                if q_idx and q_idx in responses:
-                    choice = responses[q_idx].strip().lower()
-                    risky = item.get("metadata", {}).get("risky_option", "a")
-                    if choice == risky:
-                        risk_propensity += 1
-
-            # --- Planning Fallacy: sum of B3 + B4 ---
-            planning_fallacy = 0
-            planning_count = 0
-            for item in items_b:
-                if item.get("metadata", {}).get("construct") == "planning_fallacy":
-                    q_idx = item.get("q_idx")
-                    if q_idx and q_idx in responses:
-                        val = self.extract_numeric(responses[q_idx])
-                        if 1 <= val <= 7:
-                            planning_fallacy += val
-                            planning_count += 1
-
-            # --- Illusion of Control: sum of B5 + B6 + B7 ---
-            illusion_of_control = 0
-            ioc_count = 0
-            for item in items_b:
-                if item.get("metadata", {}).get("construct") == "illusion_of_control":
-                    q_idx = item.get("q_idx")
-                    if q_idx and q_idx in responses:
-                        val = self.extract_numeric(responses[q_idx])
-                        if 1 <= val <= 7:
-                            illusion_of_control += val
-                            ioc_count += 1
-
-            # --- Overconfidence: count of items where correct answer is outside [lower, upper] ---
-            overconfidence = 0
-            oc_count = 0
-            for item in items_c:
-                q_lower = item.get("q_idx_lower")
-                q_upper = item.get("q_idx_upper")
-                correct = item.get("correct_answer")
-                if q_lower and q_upper and correct is not None:
-                    if q_lower in responses and q_upper in responses:
-                        try:
-                            lower = float(responses[q_lower])
-                            upper = float(responses[q_upper])
-                            oc_count += 1
-                            if correct < lower or correct > upper:
-                                overconfidence += 1
-                        except (ValueError, TypeError):
-                            pass
-
-            # --- Risk Perception: sum of D1 + D2 + D3 + D4 ---
-            risk_perception = 0
-            rp_count = 0
-            for item in items_d:
-                if item.get("metadata", {}).get("construct") == "risk_perception":
-                    q_idx = item.get("q_idx")
-                    if q_idx and q_idx in responses:
-                        val = self.extract_numeric(responses[q_idx])
-                        if 1 <= val <= 7:
-                            risk_perception += val
-                            rp_count += 1
-
-            # --- Opportunity Evaluation: sum of D5 + D6 + D7 ---
-            opportunity_evaluation = 0
-            oe_count = 0
-            for item in items_d:
-                if item.get("metadata", {}).get("construct") == "opportunity_evaluation":
-                    q_idx = item.get("q_idx")
-                    if q_idx and q_idx in responses:
-                        val = self.extract_numeric(responses[q_idx])
-                        if 1 <= val <= 7:
-                            opportunity_evaluation += val
-                            oe_count += 1
-
-            # Only include participant if they have sufficient data
-            if oc_count >= 5 and rp_count >= 3 and oe_count >= 2:
-                participants.append({
-                    "risk_propensity": risk_propensity,
-                    "planning_fallacy": planning_fallacy,
-                    "illusion_of_control": illusion_of_control,
-                    "overconfidence": overconfidence,
-                    "risk_perception": risk_perception,
-                    "opportunity_evaluation": opportunity_evaluation,
-                    "profile": trial_info.get("profile", {}),
-                })
+        for record in iter_response_records(raw_results):
+            participant_scores = compute_construct_scores(
+                record.get("response_text", ""),
+                record.get("trial_info", {}),
+            )
+            if participant_scores is not None:
+                participants.append(participant_scores)
 
         # Compute descriptive statistics
         if not participants:
             return {"participants": [], "descriptive_statistics": {}, "n_valid": 0}
 
-        constructs = ["risk_propensity", "planning_fallacy", "illusion_of_control",
-                      "overconfidence", "risk_perception", "opportunity_evaluation"]
+        constructs = [
+            "risk_propensity",
+            "planning_fallacy",
+            "illusion_of_control",
+            "overconfidence",
+            "risk_perception",
+            "opportunity_evaluation",
+            "small_numbers",
+            "age",
+        ]
 
         desc_stats = {}
         for c in constructs:
-            values = [p[c] for p in participants]
+            values = [p[c] for p in participants if p.get(c) is not None]
+            if not values:
+                continue
             desc_stats[c] = {
                 "mean": float(np.mean(values)),
-                "sd": float(np.std(values, ddof=1)),
+                "sd": float(np.std(values, ddof=1)) if len(values) > 1 else 0.0,
                 "n": len(values),
             }
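
The guarded descriptive statistics at the end of `aggregate_results` can be exercised in isolation. The sketch below mirrors the new logic under one assumption: `describe` is a hypothetical standalone name, not a function in this repository. The guard matters because `np.std(values, ddof=1)` on a single value divides by zero degrees of freedom and returns `nan`, and because `small_numbers` may be missing (`None`) when the optional D8 item is skipped.

```python
import numpy as np


def describe(values):
    """Mean, sample SD, and n for one construct (hypothetical helper).

    None entries are dropped; returns None when nothing remains.
    """
    kept = [v for v in values if v is not None]
    if not kept:
        return None
    return {
        "mean": float(np.mean(kept)),
        # ddof=1 is undefined for a single value, so fall back to 0.0
        "sd": float(np.std(kept, ddof=1)) if len(kept) > 1 else 0.0,
        "n": len(kept),
    }


summary = describe([1, None, 3])  # a skipped optional item is simply ignored
```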

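
Finding F1 in the README rests on the overconfidence score that the removed scoring block computed inline: the number of Section C items whose correct answer falls outside the participant's stated interval. With ten 90% intervals, a well-calibrated respondent would miss about one item, which is the benchmark of 1 in the README. A minimal sketch follows; `count_misses` and `one_sample_t` are hypothetical names, and comparing the mean miss count against 1 with a one-sample t statistic is an assumption about how the calibration test could be run, not something this page confirms.

```python
import math


def count_misses(intervals, correct_answers):
    """Count items whose correct answer lies outside [lower, upper]."""
    misses = 0
    for (lower, upper), correct in zip(intervals, correct_answers):
        if correct < lower or correct > upper:
            misses += 1
    return misses


def one_sample_t(scores, mu0=1.0):
    """t statistic for the mean of scores against mu0 (assumed test form)."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores")
    mean = sum(scores) / n
    variance = sum((s - mean) ** 2 for s in scores) / (n - 1)
    return (mean - mu0) / math.sqrt(variance / n)


# Illustrative values only, not study data.
example_misses = count_misses([(0, 10), (5, 6), (1, 2)], [5, 7, 0])
```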