21 commits
b08fba2
Add .dtas frameset signing and tests
rpguiteras May 19, 2026
fba9a28
Extract test helper programs into .ado files
rpguiteras May 20, 2026
bae4361
Add CLAUDE.md and notes/ to .gitignore
rpguiteras May 20, 2026
1fccdd7
Add frames/datasets/ -- example datasets for frameset do-files
rpguiteras May 20, 2026
a3a3575
Add frames/examples/ -- annotated frameset example do-files
rpguiteras May 20, 2026
39d07fb
Add frames/docs/ -- user-authored format and application source indexes
rpguiteras May 20, 2026
6bab84b
Add interactive_roundtrip_test.do -- 5-case round-trip test for .dtas…
rpguiteras May 20, 2026
e736d8c
additional tests of frames
rpguiteras May 21, 2026
a1da39b
improve explanatory comments in test do-files
rpguiteras May 21, 2026
c7f1def
Create README-tests.md
rpguiteras May 21, 2026
fe89471
Migrate legacy .dtas tests into frames/tests
rpguiteras May 21, 2026
b137d54
Update .gitignore
rpguiteras May 21, 2026
e98442b
Update .gitignore
rpguiteras May 22, 2026
3ca8a3f
update version in ado and regenerate help
rpguiteras May 22, 2026
ebf230b
more updating version numbers
rpguiteras May 22, 2026
a6f9a2e
Update CHANGELOG.md
rpguiteras May 22, 2026
7e2dac4
Add fallback for Stata < 18
rpguiteras May 22, 2026
58078c4
clarify assumed working directory and relative paths in test and exam…
rpguiteras May 22, 2026
15da081
Restore active frame after datasignature call
rpguiteras May 26, 2026
683c8f5
update version number to 3.1.0-alpha3
rpguiteras May 26, 2026
633047c
Add new frameset tests
rpguiteras May 27, 2026
4 changes: 4 additions & 0 deletions .gitignore
@@ -8,3 +8,7 @@ local/
.vscode/
docs/.buildinfo
*.stswp
CLAUDE.md
notes/
frames/session-notes.md
statacons.code-workspace
35 changes: 34 additions & 1 deletion CHANGELOG.md
@@ -4,7 +4,40 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## Unreleased
## 3.1.0-alpha2
### Added
- pystatacons: New `frameset_signing` config option (`auto` / `enabled` / `disabled`) in
`config_project.ini` to control `.dtas` signing behaviour on Stata < 18.
Default `auto` falls back to standard MD5 checksums on Stata 16/17 (with a one-time
warning); `disabled` always uses MD5 for `.dtas` without running a Stata batch;
`enabled` raises a hard error if Stata < 18 is detected (useful to enforce Stata 18+
across a team). `.dta` signing is unaffected in all modes.
Note: switching between frameset-aware and MD5 signatures for `.dtas` nodes causes a
one-time full rebuild (`.sconsign.dblite` entries are incompatible between the two
formats).
- complete_datasignature: Version guard in `frameset_file()` branch raises a clear error
(`STATACONS_REQUIRES_STATA18`) on Stata < 18, enabling the Python-side MD5 fallback.
### Fixed
- pystatacons: Stata log content now included in exception message when Stata returns a
non-zero exit code, improving debuggability of batch errors.

## 3.1.0-alpha1
### Added
- complete_datasignature: New `frameset_file("file.dtas")` option to sign Stata frameset
(`.dtas`) files as first-class build artifacts. Generates a concatenated per-frame
signature in the form `frameA=sigA|frameB=sigB|...`.
- complete_datasignature: New `skip_char(globlist)` option to exclude characteristics
matching any pattern in a space-separated globlist (used to drop time-tainted metadata
such as `frlink_*` characteristics).
- pystatacons: `.dtas` framesets registered with SCons as signed nodes via `get_dtas_sign`,
giving them timestamp-independent signatures alongside `.dta` files.
- pystatacons: New `dev_helpers.py` module with `dev_adopath_prefix()` to support
editable-install development workflows via the `STATACONS_DEV_SRC` environment variable.
### Changed
- complete_datasignature: In interactive mode, in-memory frames are preserved across
frameset signing via a temporary `.dtas` round-trip; batch mode skips the round-trip.
- complete_datasignature: Metadata collection now handles empty datasets (frames with no
variables) without error.
### Fixed
- pystatacons: Fixed error with opening log files, #25.

12 changes: 12 additions & 0 deletions frames/datasets/_refresh_datasets.do
@@ -0,0 +1,12 @@
// _refresh_datasets.do
// Downloads the correct webuse example datasets and saves them locally.
// All paths are relative to the assumed working directory: frames/datasets/
// Run from frames/datasets/ with:
// do _refresh_datasets.do (interactive)
// StataMP-64.exe -e do _refresh_datasets.do (batch)

foreach ds in persons txcounty discharge1 discharge2 family hsng {
    webuse `ds', clear
    save "`ds'.dta", replace
    di "Saved `ds'.dta"
}
Binary file added frames/datasets/auto.dta
Binary file not shown.
Binary file added frames/datasets/auto16.dta
Binary file not shown.
Binary file added frames/datasets/auto2.dta
Binary file not shown.
Binary file added frames/datasets/census.dta
Binary file not shown.
Binary file added frames/datasets/discharge1.dta
Binary file not shown.
Binary file added frames/datasets/discharge2.dta
Binary file not shown.
Binary file added frames/datasets/family.dta
Binary file not shown.
Binary file added frames/datasets/hsng.dta
Binary file not shown.
Binary file added frames/datasets/persons.dta
Binary file not shown.
Binary file added frames/datasets/txcounty.dta
Binary file not shown.
222 changes: 222 additions & 0 deletions frames/docs/sources-applications.md
@@ -0,0 +1,222 @@
---
header-includes:
- \usepackage{amsmath}
---

# Sources: Applied Use of Stata Frames

This document catalogues sources gathered on the practical use of Stata frames and `.dtas` framesets. All files are saved in this directory (`documentation/applications/`), organized into subfolders.

---

## Subfolder Structure

```
applications/
help-files/ Stata internal help files (.sthlp) + markdown conversions
datasets/ Example .dta datasets (Stata installation + Stata Press webuse)
official-docs/ PDF manuals for applied frame commands
blog-posts/ Stata blog posts with worked examples
examples/ Self-contained replicable .do files
```

---

## Stata Internal Help Files (`help-files/`)

All `.sthlp` files were copied verbatim from `C:\Program Files\StataNow19\ado\base\f\`. Each was converted to a clean markdown (`.md`) file in the same folder.

### `frames_intro.sthlp` / `frames_intro.md`
**Version:** 1.2.1, 05 Aug 2025

The primary practical guide to using frames. Covers all major use cases with worked
examples: multitasking, working with simultaneous datasets, simulation via `frame post`,
the preserve/restore performance benefit, and a full tutorial on every frames command.
Also covers ado and Mata programming patterns. The most valuable single source for
applied use.

### `frames.sthlp` / `frames.md`
**Version:** 1.2.1, 05 Aug 2025

Quick-reference index listing syntax for every frame-related command and function with
one-line descriptions and cross-references. A good starting point for looking up syntax.

### `frlink.sthlp` / `frlink.md`
**Version:** 1.1.1, 10 Jul 2024

Full syntax and examples for `frlink` (link frames via key variables). Documents
`1:1` and `m:1` linkages, the `frame()` option for different variable names across
frames, `frlink dir`, `frlink describe`, and `frlink rebuild`. Includes three detailed
examples: persons-counties (m:1), generational data with six simultaneous linkages,
and discharge data (1:1).

### `frget.sthlp` / `frget.md`
**Version:** 1.1.0, 06 Mar 2023

Syntax and options for `frget` -- copies variables from a linked frame to the current
frame. Documents `prefix()`, `suffix()`, `exclude()` options and stored results.
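
As a minimal sketch of how `frlink` and `frget` fit together, the do-file fragment below
builds a toy m:1 linkage. The frame name `counties`, the variable names, and all values
are made up for illustration (they only loosely mirror the persons-counties example) and
nothing is downloaded. It assumes Stata 16 or later.

```stata
// Toy m:1 linkage; all names and values here are hypothetical.
frames reset

// County-level frame: one row per county
frame create counties
frame change counties
input countyid median_income
1 50000
2 62000
end
frame change default

// Person-level data in the default frame: many persons per county
input personid countyid income
1 1 40000
2 1 55000
3 2 62000
end

// Link each person to a county; the link variable is named after the frame
frlink m:1 countyid, frame(counties)

// Copy the county-level variable into the current frame via the link
frget median_income, from(counties)

generate rel_income = income / median_income
list personid countyid income median_income rel_income
```

Note that `from()` takes the link variable created by `frlink` (here `counties`), not
the key variable.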

### `fralias.sthlp` / `fralias.md`
**Version:** 1.0.1, 15 Mar 2023

Syntax and examples for `fralias add` (Stata 18+) -- creates memory-efficient alias
variables that reference variables in linked frames without copying. Contrasts with
`frget`. Covers `fralias describe`.

### `frames_save.sthlp` / `frames_save.md`
**Version:** 1.1.0, 20 Mar 2025

Full option set for `frames save`: `frames()`, `replace`, `linked`, `relaxed`,
`complevel()`, `nolabel`, `orphans`, `emptyok`. Notes that `linked` recursively saves
all transitively linked frames.

### `frames_use.sthlp` / `frames_use.md`
**Version:** 1.1.0, 20 Mar 2025

Full option set for `frames use`: `frames()`, `clear`, `replace`. Explains how
`clear` sets the working frame and how `replace` interacts with existing frames.

### `frames_describe.sthlp` / `frames_describe.md`
**Version:** 1.0.0, 21 Feb 2023

Covers two syntaxes (in-memory and `using filename`). Documents the `simple`, `short`,
`fullnames`, and `numbers` options and stored results, including `r(changed)`.
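
The save/use/describe cycle covered by the three help files above might look roughly
like the following sketch. The file name `example_frameset.dtas` and the frame names
`auto` and `census` are placeholders chosen for this illustration. It assumes Stata 18
or later and a writable working directory.

```stata
// Hypothetical frameset round trip; the file and frame names are placeholders.
frames reset

// Load two shipped example datasets into named frames
frame create auto
frame auto: sysuse auto
frame create census
frame census: sysuse census

// Write both frames to a single .dtas file
frames save "example_frameset.dtas", frames(auto census) replace

// Inspect the file on disk without loading it
frames describe using "example_frameset.dtas"

// Start from a clean session and load the frameset back
frames reset
frames use "example_frameset.dtas", clear
frames dir
```

The sketch addresses each frame by name afterwards (for example `frame census: describe`)
rather than assuming which frame is current after `frames use`.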

### `frames_modify.sthlp` / `frames_modify.md`
**Version:** 1.0.1, 05 May 2025

Syntax for adding or dropping frames from a `.dtas` file on disk without loading the
full frameset into memory. Documents `add(framelist [, replace])` and `drop(framelist)`.

### `frame_post.sthlp` / `frame_post.md`
**Version:** 1.0.0, 14 Jun 2019

The `frame create newframename newvarlist` / `frame post framename (exp)...` pattern
for accumulating results from simulations. Notes that `tempname` should be used for
the frame name in programs. Unlike `postfile`, `frame post` allows `strL` variables.
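
A minimal sketch of the accumulate-results pattern is shown below. The frame name
`results`, the variable names, the seed, and the replication count are arbitrary choices
for this illustration. A fixed frame name is used because this is a do-file fragment;
inside a program, a `tempname` would be the safer choice, as the help file notes.

```stata
// Hypothetical simulation loop; names, seed, and sizes are arbitrary.
frames reset
set seed 12345

// Results frame: one row will be posted per replication
frame create results rep double mean_x

forvalues r = 1/100 {
    quietly {
        drop _all                 // clears only the current (default) frame
        set obs 50
        generate x = rnormal()
        summarize x, meanonly     // leaves the sample mean in r(mean)
    }
    // Append one observation to the results frame
    frame post results (`r') (r(mean))
}

frame results: summarize mean_x
```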

### `frame_put.sthlp` / `frame_put.md`
**Version:** 1.0.1, 13 Jan 2020

`frame put varlist [if] [in], into(newframename)` -- copies a subset of variables or
observations from the current frame to a new frame, leaving the current frame unchanged.
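
For instance, a short sketch using the shipped `auto` dataset (the target frame name
`imports` is an arbitrary choice for this illustration):

```stata
// Copy a subset of variables and observations into a new frame.
frames reset
sysuse auto

// Keep three variables for foreign cars only; the current frame is untouched
frame put make price mpg if foreign, into(imports)

frame imports: describe
count                             // the original frame still holds every observation
```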

### Additional `.sthlp` files copied (not converted to `.md`)

The following were copied from the Stata installation for reference but are smaller
command pages fully covered by `frames_intro.md` and `frames.md`:

- `frame_change.sthlp`, `frame_copy.sthlp`, `frame_drop.sthlp`
- `frame_prefix.sthlp`, `frame_putlabel.sthlp`, `frame_rename.sthlp`
- `frames_dir.sthlp`, `frames_reset.sthlp`

---

## PDF Manuals (`official-docs/`)

Downloaded from `https://www.stata.com/manuals/`.

### `stata-frlink.pdf`
**URL:** https://www.stata.com/manuals/dfrlink.pdf

Full [D] frlink manual including Quick start and Remarks and examples sections not
present in the help file. Contains detailed worked examples with the `persons` and
`txcounty` datasets and the generational family linkage example.

### `stata-frget.pdf`
**URL:** https://www.stata.com/manuals/dfrget.pdf

Full [D] frget manual including the explanation of how `frget` handles underscore
variables and match variables.

### `stata-fralias.pdf`
**URL:** https://www.stata.com/manuals/dfralias.pdf

Full [D] fralias manual including Quick start and detailed remarks on how alias
variables differ from copies and their memory implications.

### `stata-frames-modify.pdf`
**URL:** https://www.stata.com/manuals/dframesmodify.pdf

Full [D] frames modify manual including Quick start and Remarks.

*Note: PDFs for `frames intro`, `frames save`, `frames use`, and `frames describe` were
downloaded during the format documentation phase and are in `documentation/format/`.*

---

## Datasets (`datasets/`)

### From Stata installation (`C:\Program Files\StataNow19\ado\base\`)

| File | Description | Used in |
|------|-------------|---------|
| `auto.dta` | 1978 automobile data (74 obs, 12 vars) | General examples; `dtas.sthlp` |
| `auto2.dta` | Automobile data with extra variables | `dtas.sthlp` format example |
| `auto16.dta` | Automobile data (Stata 16 format) | Format testing |
| `census.dta` | 1980 US census by state (50 obs) | `frames_save` and `frames_describe` examples |

### From Stata Press web server (`http://www.stata-press.com/data/r19/`)

| File | Description | Used in |
|------|-------------|---------|
| `persons.dta` | Person-level data with `countyid` | `frlink` m:1 example |
| `txcounty.dta` | Texas county-level data | `frlink` m:1 example |
| `family.dta` | Generational family data with parent IDs | `frlink` self-link example |
| `discharge1.dta` | Hospital discharge data, part 1 | `frlink` 1:1 example |
| `discharge2.dta` | Hospital discharge data, part 2 | `frlink` 1:1 example |
| `hsng.dta` | Housing cost data (50 obs) | `frames_save` and `frames_modify` examples |

---

## Blog Posts (`blog-posts/`)

### `stata-blog-fun-with-frames-2019.md`
**URL:** https://blog.stata.com/2019/09/06/fun-with-frames/
**Author:** Chuck Huber | **Date:** September 6, 2019

The Stata 16 launch blog post on frames. Demonstrates five applied scenarios: (1)
fitting models on multiple datasets and comparing estimates, (2) storing `margins`
output in a separate frame for a contour plot, (3) using `frval()` for inline
cross-frame calculations, (4) using `frget` to pull demographics into a longitudinal
dataset for mixed-effects modeling, and (5) opening 22 chromosome datasets
simultaneously in Stata/MP. Key practical insight: frames eliminate the
clear/load/run/save cycle when coordinating multiple datasets.

### `stata-blog-framesets-alias-2023.md`
**URL:** https://blog.stata.com/2023/09/12/from-datasets-to-framesets-and-alias-variables-data-management-advances-in-stata/
**Author:** Kreshna Gopal | **Date:** September 12, 2023

The Stata 18 blog post introducing framesets (`.dtas`) and alias variables. Covers the
full workflow: creating multiple frames, saving them with `frames save`, describing with
`frames describe using`, reloading with `frames use`, saving with the `linked` option,
and creating alias variables with `fralias add`. Includes a historical timeline of
Stata data management milestones from 1985 to 2023.

*Note: An earlier version was saved in `documentation/format/stata-blog-framesets-alias-2023.md`.*

---

## Example Do-files (`examples/`)

Self-contained replicable scripts demonstrating key workflows. Each script lists
required datasets at the top. Where `webuse` is used, the dataset is also available
in the `datasets/` folder.

### `example-01-basic-frames.do`
Basic frame management: `frame create`, `frame change`, `frame prefix`, `frame copy`,
`frame put`, `frame drop`, `frames reset`. Uses `sysuse auto` and `sysuse census`.

### `example-02-frlink-frget.do`
Linking frames with `frlink`: m:1 (persons-counties), 1:1 (discharge data), self-link
(generational family). Also demonstrates `frget`, `fralias add`, and `frval()`.
Uses `webuse persons`, `webuse txcounty`, `webuse family`, `webuse discharge1/2`.

### `example-03-simulation-frame-post.do`
Monte Carlo simulation using `frame create` / `frame post`. Runs 1,000 OLS replications
and collects results (slope estimate, SE, CI coverage) in a separate frame. Tests that
the 95% CI covers the true slope approximately 95% of the time. Uses no external datasets.

### `example-04-frameset-save-use.do`
Full frameset lifecycle: `frames save`, `frames describe using`, `frames use`,
`frames modify add`, `frames modify drop`. Uses `sysuse census` and `webuse hsng`.