From be5a9604b753a39db23d54da71a13a288fb0d1eb Mon Sep 17 00:00:00 2001
From: Bion Howard
Date: Tue, 9 Jun 2026 15:47:44 -0400
Subject: [PATCH 1/8] ci: path-filter Python workflows; add Rust workflow

Python CI/CD (unix, microsoft) now skips runs when only Rust-port files
change (crates/, Cargo manifests, docs/, Justfile). New rust.yml runs
fmt + clippy -D warnings + the full test suite (including golden parity)
on Linux/macOS/Windows with autocrlf disabled so byte-level goldens hold.

Co-Authored-By: Claude Fable 5
---
 .github/workflows/microsoft.yml | 14 +++++++++++
 .github/workflows/rust.yml      | 44 +++++++++++++++++++++++++++++++++
 .github/workflows/unix.yml      | 14 +++++++++++
 3 files changed, 72 insertions(+)
 create mode 100644 .github/workflows/rust.yml

diff --git a/.github/workflows/microsoft.yml b/.github/workflows/microsoft.yml
index 378e766..731a22a 100644
--- a/.github/workflows/microsoft.yml
+++ b/.github/workflows/microsoft.yml
@@ -3,8 +3,22 @@ name: Microsoft
 on:
   push:
     branches: [ main ]
+    paths-ignore:
+      - 'crates/**'
+      - 'Cargo.toml'
+      - 'Cargo.lock'
+      - 'Justfile'
+      - 'docs/**'
+      - '.github/workflows/rust.yml'
   pull_request:
     branches: [ main ]
+    paths-ignore:
+      - 'crates/**'
+      - 'Cargo.toml'
+      - 'Cargo.lock'
+      - 'Justfile'
+      - 'docs/**'
+      - '.github/workflows/rust.yml'
 
 jobs:
   build:
diff --git a/.github/workflows/rust.yml b/.github/workflows/rust.yml
new file mode 100644
index 0000000..1368f6a
--- /dev/null
+++ b/.github/workflows/rust.yml
@@ -0,0 +1,44 @@
+name: Rust
+
+on:
+  push:
+    branches: [ main, rust-port ]
+    paths:
+      - 'crates/**'
+      - 'Cargo.toml'
+      - 'Cargo.lock'
+      - 'tests/golden/**'
+      - 'tests/more_languages/**'
+      - 'tests/path_to_test/**'
+      - '.github/workflows/rust.yml'
+  pull_request:
+    branches: [ main ]
+    paths:
+      - 'crates/**'
+      - 'Cargo.toml'
+      - 'Cargo.lock'
+      - 'tests/golden/**'
+      - 'tests/more_languages/**'
+      - 'tests/path_to_test/**'
+      - '.github/workflows/rust.yml'
+
+jobs:
+  test:
+    runs-on: ${{ matrix.os }}
+    strategy:
+      fail-fast: false
+      matrix:
+        os: [ubuntu-latest, macos-latest, windows-latest]
+    steps:
+      # golden parity tests compare bytes; CRLF translation would break them
+      - name: Disable git autocrlf
+        run: git config --global core.autocrlf false
+      - uses: actions/checkout@v4
+      - uses: dtolnay/rust-toolchain@stable
+      - uses: Swatinem/rust-cache@v2
+      - name: Format
+        run: cargo fmt --all -- --check
+      - name: Clippy
+        run: cargo clippy --all-targets --all-features -- -D warnings
+      - name: Test
+        run: cargo test --workspace --all-features
diff --git a/.github/workflows/unix.yml b/.github/workflows/unix.yml
index e919220..d831be4 100644
--- a/.github/workflows/unix.yml
+++ b/.github/workflows/unix.yml
@@ -3,8 +3,22 @@ name: Linux & MacOS
 on:
   push:
     branches: [ main ]
+    paths-ignore:
+      - 'crates/**'
+      - 'Cargo.toml'
+      - 'Cargo.lock'
+      - 'Justfile'
+      - 'docs/**'
+      - '.github/workflows/rust.yml'
   pull_request:
     branches: [ main ]
+    paths-ignore:
+      - 'crates/**'
+      - 'Cargo.toml'
+      - 'Cargo.lock'
+      - 'Justfile'
+      - 'docs/**'
+      - '.github/workflows/rust.yml'
 
 jobs:
   # CI
From 3535e28bb554d7a9f690cf99f0e326edb544fdaf Mon Sep 17 00:00:00 2001
From: Bion Howard
Date: Tue, 9 Jun 2026 15:48:02 -0400
Subject: [PATCH 2/8] test: add legacy golden harness and captured goldens

generate_legacy_goldens.py captures the Python implementation's outputs:
per-fixture components, token/line counts, full tree renders, and
v1-scope tree renders (deferred-language extractors stubbed). These are
the behavioral contract for the Rust port's parity suite.
diff_components.py is the dev-loop diff tool.

Co-Authored-By: Claude Fable 5 --- tests/golden/diff_components.py | 59 + tests/golden/generate_legacy_goldens.py | 156 ++ .../tests__dot_dot__my_test_file.py.json | 6 + ...tests__dot_dot__nested_dir__.env.test.json | 6 + ...ests__dot_dot__nested_dir__pytest.ini.json | 4 + ...ot_dot__nested_dir__test_tp_dotdot.py.json | 7 + ...nguages__group1__CUSTOMER-INVOICE.CBL.json | 40 + ...more_languages__group1__JavaTest.java.json | 29 + ..._more_languages__group1__JuliaTest.jl.json | 17 + ...more_languages__group1__KotlinTest.kt.json | 52 + ...__more_languages__group1__LuaTest.lua.json | 8 + ...e_languages__group1__ObjectiveCTest.m.json | 10 + ..._more_languages__group1__OcamlTest.ml.json | 9 + ..._more_languages__group1__addamt.cobol.json | 20 + ...s__more_languages__group1__lesson.cbl.json | 45 + ...ests__more_languages__group1__test.js.json | 40 + ...ests__more_languages__group1__test.ts.json | 41 + ...__more_languages__group2__PerlTest.pl.json | 9 + ...__more_languages__group2__PhpTest.php.json | 10 + ...languages__group2__PowershellTest.ps1.json | 27 + ...re_languages__group2__ScalaTest.scala.json | 16 + ..._more_languages__group2__apl_test.apl.json | 8 + ...sts__more_languages__group2__c_test.c.json | 39 + ...s__more_languages__group2__go_test.go.json | 12 + ...sts__more_languages__group2__test.csv.json | 10 + ..._more_languages__group3__bash_test.sh.json | 11 + ..._more_languages__group3__cpp_test.cpp.json | 74 + ...ore_languages__group3__csharp_test.cs.json | 48 + ..._languages__group3__hallucination.tex.json | 30 + ..._more_languages__group3__ruby_test.rb.json | 14 + ...e_languages__group3__swift_test.swift.json | 23 + ...s__more_languages__group3__test.capnp.json | 23 + ..._more_languages__group3__test.graphql.json | 19 + ...ts__more_languages__group3__test.lean.json | 15 + ...s__more_languages__group3__test.proto.json | 23 + ...__more_languages__group3__test.sqlite.json | 13 + ...re_languages__group3__test_Cargo.toml.json | 12 + 
...uages__group3__test_json_rpc_2_0.json.json | 11 + ..._languages__group3__test_openapi.yaml.json | 16 + ..._languages__group3__test_openrpc.json.json | 14 + ...anguages__group3__test_pyproject.toml.json | 16 + ...ests__more_languages__group4__RTest.R.json | 9 + ..._more_languages__group4__erl_test.erl.json | 20 + ...re_languages__group4__haskell_test.hs.json | 8 + ...anguages__group4__mathematica_test.nb.json | 8 + ...more_languages__group4__matlab_test.m.json | 7 + ..._more_languages__group4__rust_test.rs.json | 50 + ...sts__more_languages__group4__test.zig.json | 11 + ...ore_languages__group4__test_fsharp.fs.json | 11 + ...re_languages__group4__test_tcl_tk.tcl.json | 8 + ...s__more_languages__group4__tf_test.tf.json | 12 + ...sts__more_languages__group5__Makefile.json | 14 + ...e_languages__group5__ansible_test.yml.json | 8 + ...guages__group5__app-routing.module.ts.json | 7 + ...guages__group5__app.component.spec.ts.json | 10 + ...e_languages__group5__app.component.ts.json | 11 + ...more_languages__group5__app.module.ts.json | 7 + ...e_languages__group5__checkbox_test.md.json | 23 + ..._languages__group5__checkbox_test.txt.json | 13 + ...anguages__group5__environment.test.ts.json | 10 + ...re_languages__group5__hello_world.pyi.json | 7 + ...more_languages__group5__k8s_test.yaml.json | 8 + ...guages__group5__requirements_test.txt.json | 14 + ..._languages__group5__rust_todo_test.rs.json | 13 + ..._more_languages__group5__sql_test.sql.json | 27 + ...roup5__standard-app-routing.module.ts.json | 6 + ...sts__more_languages__group5__test.env.json | 25 + ...anguages__group5__testJsonSchema.json.json | 9 + ...e_languages__group5__testPackage.json.json | 13 + ...nguages__group5__tickets.component.ts.json | 54 + ...up6__Microsoft.PowerShell_profile.ps1.json | 35 + ...ore_languages__group6__catastrophic.c.json | 196 ++ ...nguages__group6__cpp_examples_impl.cc.json | 7 + ...nguages__group6__cpp_examples_impl.cu.json | 7 + ...anguages__group6__cpp_examples_impl.h.json | 7 
+ ...more_languages__group6__edge_case.hpp.json | 4 + ...__more_languages__group6__fractal.thy.json | 30 + ...ages__group6__python_complex_class.py.json | 6 + ...guages__group6__ramda__cloneRegExp.js.json | 6 + ...more_languages__group6__ramda_prop.js.json | 9 + ...languages__group6__tensorflow_flags.h.json | 102 + ...tests__more_languages__group6__test.f.json | 11 + ...ts__more_languages__group6__torch.rst.json | 7 + ...ests__more_languages__group6__yc.html.json | 4 + ...anguages__group7__absurdly_huge.jsonl.json | 14 + ...re_languages__group7__angular_crud.ts.json | 18 + ..._more_languages__group7__structure.py.json | 35 + ...s__more_languages__group7__test.metal.json | 11 + ...ts__more_languages__group7__test.wgsl.json | 20 + ..._languages__group_lisp__LispTest.lisp.json | 7 + ...nguages__group_lisp__clojure_test.clj.json | 13 + ...guages__group_lisp__racket_struct.rkt.json | 6 + ...anguages__group_lisp__test_scheme.scm.json | 11 + ...guages__group_todo__AAPLShaders.metal.json | 30 + ...anguages__group_todo__crystal_test.cr.json | 4 + ...languages__group_todo__dart_test.dart.json | 4 + ...anguages__group_todo__elixir_test.exs.json | 4 + ...e_languages__group_todo__forward.frag.json | 4 + ...e_languages__group_todo__forward.vert.json | 4 + ...e_languages__group_todo__nodemon.json.json | 4 + ...e_languages__group_todo__sas_test.sas.json | 4 + ...nguages__group_todo__testTypings.d.ts.json | 4 + ...uages__group_todo__test_setup_py.test.json | 4 + ...e_languages__group_todo__vba_test.bas.json | 4 + ...languages__group_todo__wgsl_test.wgsl.json | 8 + ...s__path_to_test__class_method_type.py.json | 33 + .../tests__path_to_test__empty.py.json | 4 + .../tests__path_to_test__file.md.json | 6 + .../tests__path_to_test__file.py.json | 6 + .../tests__path_to_test__file.txt.json | 4 + .../tests__path_to_test__version.py.json | 6 + .../tests__dot_dot__my_test_file.py.json | 1 + ...tests__dot_dot__nested_dir__.env.test.json | 1 + ...ests__dot_dot__nested_dir__pytest.ini.json | 
1 + ...ot_dot__nested_dir__test_tp_dotdot.py.json | 1 + ...nguages__group1__CUSTOMER-INVOICE.CBL.json | 1 + ...more_languages__group1__JavaTest.java.json | 1 + ..._more_languages__group1__JuliaTest.jl.json | 1 + ...more_languages__group1__KotlinTest.kt.json | 1 + ...__more_languages__group1__LuaTest.lua.json | 1 + ...e_languages__group1__ObjectiveCTest.m.json | 1 + ..._more_languages__group1__OcamlTest.ml.json | 1 + ..._more_languages__group1__addamt.cobol.json | 1 + ...s__more_languages__group1__lesson.cbl.json | 1 + ...ests__more_languages__group1__test.js.json | 1 + ...ests__more_languages__group1__test.ts.json | 1 + ...__more_languages__group2__PerlTest.pl.json | 1 + ...__more_languages__group2__PhpTest.php.json | 1 + ...languages__group2__PowershellTest.ps1.json | 1 + ...re_languages__group2__ScalaTest.scala.json | 1 + ..._more_languages__group2__apl_test.apl.json | 1 + ...sts__more_languages__group2__c_test.c.json | 1 + ...s__more_languages__group2__go_test.go.json | 1 + ...sts__more_languages__group2__test.csv.json | 1 + ..._more_languages__group3__bash_test.sh.json | 1 + ..._more_languages__group3__cpp_test.cpp.json | 1 + ...ore_languages__group3__csharp_test.cs.json | 1 + ..._languages__group3__hallucination.tex.json | 1 + ..._more_languages__group3__ruby_test.rb.json | 1 + ...e_languages__group3__swift_test.swift.json | 1 + ...s__more_languages__group3__test.capnp.json | 1 + ..._more_languages__group3__test.graphql.json | 1 + ...ts__more_languages__group3__test.lean.json | 1 + ...s__more_languages__group3__test.proto.json | 1 + ...__more_languages__group3__test.sqlite.json | 1 + ...re_languages__group3__test_Cargo.toml.json | 1 + ...uages__group3__test_json_rpc_2_0.json.json | 1 + ..._languages__group3__test_openapi.yaml.json | 1 + ..._languages__group3__test_openrpc.json.json | 1 + ...anguages__group3__test_pyproject.toml.json | 1 + ...ests__more_languages__group4__RTest.R.json | 1 + ..._more_languages__group4__erl_test.erl.json | 1 + 
...re_languages__group4__haskell_test.hs.json | 1 + ...anguages__group4__mathematica_test.nb.json | 1 + ...more_languages__group4__matlab_test.m.json | 1 + ..._more_languages__group4__rust_test.rs.json | 1 + ...sts__more_languages__group4__test.zig.json | 1 + ...ore_languages__group4__test_fsharp.fs.json | 1 + ...re_languages__group4__test_tcl_tk.tcl.json | 1 + ...s__more_languages__group4__tf_test.tf.json | 1 + ...sts__more_languages__group5__Makefile.json | 1 + ...e_languages__group5__ansible_test.yml.json | 1 + ...guages__group5__app-routing.module.ts.json | 1 + ...guages__group5__app.component.spec.ts.json | 1 + ...e_languages__group5__app.component.ts.json | 1 + ...more_languages__group5__app.module.ts.json | 1 + ...e_languages__group5__checkbox_test.md.json | 1 + ..._languages__group5__checkbox_test.txt.json | 1 + ...anguages__group5__environment.test.ts.json | 1 + ...re_languages__group5__hello_world.pyi.json | 1 + ...more_languages__group5__k8s_test.yaml.json | 1 + ...guages__group5__requirements_test.txt.json | 1 + ..._languages__group5__rust_todo_test.rs.json | 1 + ..._more_languages__group5__sql_test.sql.json | 1 + ...roup5__standard-app-routing.module.ts.json | 1 + ...sts__more_languages__group5__test.env.json | 1 + ...anguages__group5__testJsonSchema.json.json | 1 + ...e_languages__group5__testPackage.json.json | 1 + ...nguages__group5__tickets.component.ts.json | 1 + ...up6__Microsoft.PowerShell_profile.ps1.json | 1 + ...ore_languages__group6__catastrophic.c.json | 1 + ...nguages__group6__cpp_examples_impl.cc.json | 1 + ...nguages__group6__cpp_examples_impl.cu.json | 1 + ...anguages__group6__cpp_examples_impl.h.json | 1 + ...more_languages__group6__edge_case.hpp.json | 1 + ...__more_languages__group6__fractal.thy.json | 1 + ...ages__group6__python_complex_class.py.json | 1 + ...guages__group6__ramda__cloneRegExp.js.json | 1 + ...more_languages__group6__ramda_prop.js.json | 1 + ...languages__group6__tensorflow_flags.h.json | 1 + 
...tests__more_languages__group6__test.f.json | 1 + ...ts__more_languages__group6__torch.rst.json | 1 + ...ests__more_languages__group6__yc.html.json | 1 + ...anguages__group7__absurdly_huge.jsonl.json | 1 + ...re_languages__group7__angular_crud.ts.json | 1 + ..._more_languages__group7__structure.py.json | 1 + ...s__more_languages__group7__test.metal.json | 1 + ...ts__more_languages__group7__test.wgsl.json | 1 + ..._languages__group_lisp__LispTest.lisp.json | 1 + ...nguages__group_lisp__clojure_test.clj.json | 1 + ...guages__group_lisp__racket_struct.rkt.json | 1 + ...anguages__group_lisp__test_scheme.scm.json | 1 + ...guages__group_todo__AAPLShaders.metal.json | 1 + ...anguages__group_todo__crystal_test.cr.json | 1 + ...languages__group_todo__dart_test.dart.json | 1 + ...anguages__group_todo__elixir_test.exs.json | 1 + ...e_languages__group_todo__forward.frag.json | 1 + ...e_languages__group_todo__forward.vert.json | 1 + ...e_languages__group_todo__nodemon.json.json | 1 + ...e_languages__group_todo__sas_test.sas.json | 1 + ...nguages__group_todo__testTypings.d.ts.json | 1 + ...uages__group_todo__test_setup_py.test.json | 1 + ...e_languages__group_todo__vba_test.bas.json | 1 + ...languages__group_todo__wgsl_test.wgsl.json | 1 + ...s__path_to_test__class_method_type.py.json | 1 + .../counts/tests__path_to_test__empty.py.json | 1 + .../counts/tests__path_to_test__file.md.json | 1 + .../counts/tests__path_to_test__file.py.json | 1 + .../counts/tests__path_to_test__file.txt.json | 1 + .../tests__path_to_test__version.py.json | 1 + tests/golden/legacy/trees/dot_dot.txt | 10 + tests/golden/legacy/trees/more_languages.txt | 2225 +++++++++++++++++ .../legacy/trees/more_languages_group1.txt | 374 +++ .../legacy/trees/more_languages_group2.txt | 135 + .../legacy/trees/more_languages_group3.txt | 346 +++ .../legacy/trees/more_languages_group4.txt | 214 ++ .../legacy/trees/more_languages_group5.txt | 257 ++ .../legacy/trees/more_languages_group6.txt | 615 +++++ 
.../legacy/trees/more_languages_group7.txt | 126 + .../trees/more_languages_group_lisp.txt | 22 + .../trees/more_languages_group_todo.txt | 111 + tests/golden/legacy/trees/multi_seed.txt | 427 ++++ tests/golden/legacy/trees/path_to_test.txt | 51 + tests/golden/legacy/trees/repo_concise.txt | 699 ++++++ tests/golden/legacy/trees_v1/dot_dot.txt | 10 + .../golden/legacy/trees_v1/more_languages.txt | 1122 +++++++++ .../legacy/trees_v1/more_languages_group1.txt | 146 ++ .../legacy/trees_v1/more_languages_group2.txt | 54 + .../legacy/trees_v1/more_languages_group3.txt | 149 ++ .../legacy/trees_v1/more_languages_group4.txt | 123 + .../legacy/trees_v1/more_languages_group5.txt | 235 ++ .../legacy/trees_v1/more_languages_group6.txt | 300 +++ .../legacy/trees_v1/more_languages_group7.txt | 83 + .../trees_v1/more_languages_group_lisp.txt | 5 + .../trees_v1/more_languages_group_todo.txt | 13 + tests/golden/legacy/trees_v1/multi_seed.txt | 199 ++ tests/golden/legacy/trees_v1/path_to_test.txt | 51 + 247 files changed, 10379 insertions(+) create mode 100644 tests/golden/diff_components.py create mode 100644 tests/golden/generate_legacy_goldens.py create mode 100644 tests/golden/legacy/components/tests__dot_dot__my_test_file.py.json create mode 100644 tests/golden/legacy/components/tests__dot_dot__nested_dir__.env.test.json create mode 100644 tests/golden/legacy/components/tests__dot_dot__nested_dir__pytest.ini.json create mode 100644 tests/golden/legacy/components/tests__dot_dot__nested_dir__test_tp_dotdot.py.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group1__CUSTOMER-INVOICE.CBL.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group1__JavaTest.java.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group1__JuliaTest.jl.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group1__KotlinTest.kt.json create mode 100644 
tests/golden/legacy/components/tests__more_languages__group1__LuaTest.lua.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group1__ObjectiveCTest.m.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group1__OcamlTest.ml.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group1__addamt.cobol.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group1__lesson.cbl.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group1__test.js.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group1__test.ts.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group2__PerlTest.pl.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group2__PhpTest.php.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group2__PowershellTest.ps1.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group2__ScalaTest.scala.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group2__apl_test.apl.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group2__c_test.c.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group2__go_test.go.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group2__test.csv.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group3__bash_test.sh.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group3__cpp_test.cpp.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group3__csharp_test.cs.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group3__hallucination.tex.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group3__ruby_test.rb.json create mode 100644 
tests/golden/legacy/components/tests__more_languages__group3__swift_test.swift.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group3__test.capnp.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group3__test.graphql.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group3__test.lean.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group3__test.proto.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group3__test.sqlite.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group3__test_Cargo.toml.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group3__test_json_rpc_2_0.json.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group3__test_openapi.yaml.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group3__test_openrpc.json.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group3__test_pyproject.toml.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group4__RTest.R.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group4__erl_test.erl.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group4__haskell_test.hs.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group4__mathematica_test.nb.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group4__matlab_test.m.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group4__rust_test.rs.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group4__test.zig.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group4__test_fsharp.fs.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group4__test_tcl_tk.tcl.json 
create mode 100644 tests/golden/legacy/components/tests__more_languages__group4__tf_test.tf.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group5__Makefile.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group5__ansible_test.yml.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group5__app-routing.module.ts.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group5__app.component.spec.ts.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group5__app.component.ts.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group5__app.module.ts.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group5__checkbox_test.md.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group5__checkbox_test.txt.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group5__environment.test.ts.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group5__hello_world.pyi.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group5__k8s_test.yaml.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group5__requirements_test.txt.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group5__rust_todo_test.rs.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group5__sql_test.sql.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group5__standard-app-routing.module.ts.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group5__test.env.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group5__testJsonSchema.json.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group5__testPackage.json.json create mode 100644 
tests/golden/legacy/components/tests__more_languages__group5__tickets.component.ts.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group6__Microsoft.PowerShell_profile.ps1.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group6__catastrophic.c.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group6__cpp_examples_impl.cc.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group6__cpp_examples_impl.cu.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group6__cpp_examples_impl.h.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group6__edge_case.hpp.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group6__fractal.thy.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group6__python_complex_class.py.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group6__ramda__cloneRegExp.js.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group6__ramda_prop.js.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group6__tensorflow_flags.h.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group6__test.f.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group6__torch.rst.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group6__yc.html.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group7__absurdly_huge.jsonl.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group7__angular_crud.ts.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group7__structure.py.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group7__test.metal.json create mode 100644 
tests/golden/legacy/components/tests__more_languages__group7__test.wgsl.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group_lisp__LispTest.lisp.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group_lisp__clojure_test.clj.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group_lisp__racket_struct.rkt.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group_lisp__test_scheme.scm.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group_todo__AAPLShaders.metal.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group_todo__crystal_test.cr.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group_todo__dart_test.dart.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group_todo__elixir_test.exs.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group_todo__forward.frag.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group_todo__forward.vert.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group_todo__nodemon.json.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group_todo__sas_test.sas.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group_todo__testTypings.d.ts.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group_todo__test_setup_py.test.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group_todo__vba_test.bas.json create mode 100644 tests/golden/legacy/components/tests__more_languages__group_todo__wgsl_test.wgsl.json create mode 100644 tests/golden/legacy/components/tests__path_to_test__class_method_type.py.json create mode 100644 tests/golden/legacy/components/tests__path_to_test__empty.py.json create mode 100644 
tests/golden/legacy/components/tests__path_to_test__file.md.json create mode 100644 tests/golden/legacy/components/tests__path_to_test__file.py.json create mode 100644 tests/golden/legacy/components/tests__path_to_test__file.txt.json create mode 100644 tests/golden/legacy/components/tests__path_to_test__version.py.json create mode 100644 tests/golden/legacy/counts/tests__dot_dot__my_test_file.py.json create mode 100644 tests/golden/legacy/counts/tests__dot_dot__nested_dir__.env.test.json create mode 100644 tests/golden/legacy/counts/tests__dot_dot__nested_dir__pytest.ini.json create mode 100644 tests/golden/legacy/counts/tests__dot_dot__nested_dir__test_tp_dotdot.py.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group1__CUSTOMER-INVOICE.CBL.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group1__JavaTest.java.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group1__JuliaTest.jl.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group1__KotlinTest.kt.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group1__LuaTest.lua.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group1__ObjectiveCTest.m.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group1__OcamlTest.ml.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group1__addamt.cobol.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group1__lesson.cbl.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group1__test.js.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group1__test.ts.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group2__PerlTest.pl.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group2__PhpTest.php.json create mode 100644 
tests/golden/legacy/counts/tests__more_languages__group2__PowershellTest.ps1.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group2__ScalaTest.scala.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group2__apl_test.apl.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group2__c_test.c.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group2__go_test.go.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group2__test.csv.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group3__bash_test.sh.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group3__cpp_test.cpp.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group3__csharp_test.cs.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group3__hallucination.tex.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group3__ruby_test.rb.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group3__swift_test.swift.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group3__test.capnp.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group3__test.graphql.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group3__test.lean.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group3__test.proto.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group3__test.sqlite.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group3__test_Cargo.toml.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group3__test_json_rpc_2_0.json.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group3__test_openapi.yaml.json create mode 100644 
tests/golden/legacy/counts/tests__more_languages__group3__test_openrpc.json.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group3__test_pyproject.toml.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group4__RTest.R.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group4__erl_test.erl.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group4__haskell_test.hs.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group4__mathematica_test.nb.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group4__matlab_test.m.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group4__rust_test.rs.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group4__test.zig.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group4__test_fsharp.fs.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group4__test_tcl_tk.tcl.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group4__tf_test.tf.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group5__Makefile.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group5__ansible_test.yml.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group5__app-routing.module.ts.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group5__app.component.spec.ts.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group5__app.component.ts.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group5__app.module.ts.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group5__checkbox_test.md.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group5__checkbox_test.txt.json create mode 100644 
tests/golden/legacy/counts/tests__more_languages__group5__environment.test.ts.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group5__hello_world.pyi.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group5__k8s_test.yaml.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group5__requirements_test.txt.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group5__rust_todo_test.rs.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group5__sql_test.sql.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group5__standard-app-routing.module.ts.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group5__test.env.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group5__testJsonSchema.json.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group5__testPackage.json.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group5__tickets.component.ts.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group6__Microsoft.PowerShell_profile.ps1.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group6__catastrophic.c.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group6__cpp_examples_impl.cc.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group6__cpp_examples_impl.cu.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group6__cpp_examples_impl.h.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group6__edge_case.hpp.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group6__fractal.thy.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group6__python_complex_class.py.json create mode 100644 
tests/golden/legacy/counts/tests__more_languages__group6__ramda__cloneRegExp.js.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group6__ramda_prop.js.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group6__tensorflow_flags.h.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group6__test.f.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group6__torch.rst.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group6__yc.html.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group7__absurdly_huge.jsonl.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group7__angular_crud.ts.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group7__structure.py.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group7__test.metal.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group7__test.wgsl.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group_lisp__LispTest.lisp.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group_lisp__clojure_test.clj.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group_lisp__racket_struct.rkt.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group_lisp__test_scheme.scm.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group_todo__AAPLShaders.metal.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group_todo__crystal_test.cr.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group_todo__dart_test.dart.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group_todo__elixir_test.exs.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group_todo__forward.frag.json create mode 100644 
tests/golden/legacy/counts/tests__more_languages__group_todo__forward.vert.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group_todo__nodemon.json.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group_todo__sas_test.sas.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group_todo__testTypings.d.ts.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group_todo__test_setup_py.test.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group_todo__vba_test.bas.json create mode 100644 tests/golden/legacy/counts/tests__more_languages__group_todo__wgsl_test.wgsl.json create mode 100644 tests/golden/legacy/counts/tests__path_to_test__class_method_type.py.json create mode 100644 tests/golden/legacy/counts/tests__path_to_test__empty.py.json create mode 100644 tests/golden/legacy/counts/tests__path_to_test__file.md.json create mode 100644 tests/golden/legacy/counts/tests__path_to_test__file.py.json create mode 100644 tests/golden/legacy/counts/tests__path_to_test__file.txt.json create mode 100644 tests/golden/legacy/counts/tests__path_to_test__version.py.json create mode 100644 tests/golden/legacy/trees/dot_dot.txt create mode 100644 tests/golden/legacy/trees/more_languages.txt create mode 100644 tests/golden/legacy/trees/more_languages_group1.txt create mode 100644 tests/golden/legacy/trees/more_languages_group2.txt create mode 100644 tests/golden/legacy/trees/more_languages_group3.txt create mode 100644 tests/golden/legacy/trees/more_languages_group4.txt create mode 100644 tests/golden/legacy/trees/more_languages_group5.txt create mode 100644 tests/golden/legacy/trees/more_languages_group6.txt create mode 100644 tests/golden/legacy/trees/more_languages_group7.txt create mode 100644 tests/golden/legacy/trees/more_languages_group_lisp.txt create mode 100644 tests/golden/legacy/trees/more_languages_group_todo.txt create mode 100644 
tests/golden/legacy/trees/multi_seed.txt create mode 100644 tests/golden/legacy/trees/path_to_test.txt create mode 100644 tests/golden/legacy/trees/repo_concise.txt create mode 100644 tests/golden/legacy/trees_v1/dot_dot.txt create mode 100644 tests/golden/legacy/trees_v1/more_languages.txt create mode 100644 tests/golden/legacy/trees_v1/more_languages_group1.txt create mode 100644 tests/golden/legacy/trees_v1/more_languages_group2.txt create mode 100644 tests/golden/legacy/trees_v1/more_languages_group3.txt create mode 100644 tests/golden/legacy/trees_v1/more_languages_group4.txt create mode 100644 tests/golden/legacy/trees_v1/more_languages_group5.txt create mode 100644 tests/golden/legacy/trees_v1/more_languages_group6.txt create mode 100644 tests/golden/legacy/trees_v1/more_languages_group7.txt create mode 100644 tests/golden/legacy/trees_v1/more_languages_group_lisp.txt create mode 100644 tests/golden/legacy/trees_v1/more_languages_group_todo.txt create mode 100644 tests/golden/legacy/trees_v1/multi_seed.txt create mode 100644 tests/golden/legacy/trees_v1/path_to_test.txt diff --git a/tests/golden/diff_components.py b/tests/golden/diff_components.py new file mode 100644 index 0000000..ff12ac6 --- /dev/null +++ b/tests/golden/diff_components.py @@ -0,0 +1,59 @@ +# tests/golden/diff_components.py +"""Diff Rust extract output against legacy goldens for given fixture paths. + +Usage: python tests/golden/diff_components.py ... +Runs the Rust example binary and prints a unified diff per fixture. 
+""" +import difflib +import json +import subprocess +import sys +from pathlib import Path + +REPO = Path(__file__).resolve().parents[2] + + +def main() -> None: + paths = sys.argv[1:] + out = subprocess.run( + ["cargo", "run", "-q", "-p", "tree_plus_core", "--example", "extract", "--"] + + paths, + capture_output=True, + text=True, + cwd=REPO, + ) + if out.returncode != 0: + print(out.stderr, file=sys.stderr) + sys.exit(1) + decoder = json.JSONDecoder() + rest = out.stdout.strip() + results = [] + while rest: + obj, idx = decoder.raw_decode(rest) + results.append(obj) + rest = rest[idx:].lstrip() + for result in results: + rel = result["path"] + golden_name = rel.replace("/", "__") + ".json" + golden = json.loads( + (REPO / "tests/golden/legacy/components" / golden_name).read_text(encoding="utf-8") + ) + expected = golden["components"] + actual = result["components"] + if expected == actual: + print(f"OK {rel}") + continue + print(f"DIFF {rel} (expected {len(expected)}, actual {len(actual)})") + diff = difflib.unified_diff( + [json.dumps(c) for c in expected], + [json.dumps(c) for c in actual], + "legacy", + "rust", + lineterm="", + ) + for line in diff: + print(" ", line) + + +if __name__ == "__main__": + main() diff --git a/tests/golden/generate_legacy_goldens.py b/tests/golden/generate_legacy_goldens.py new file mode 100644 index 0000000..d50bd79 --- /dev/null +++ b/tests/golden/generate_legacy_goldens.py @@ -0,0 +1,156 @@ +# tests/golden/generate_legacy_goldens.py +"""Generate golden outputs from the legacy Python tree_plus implementation. 
+ +Run from the repository root: + python tests/golden/generate_legacy_goldens.py + +Outputs land in tests/golden/legacy/: +- components/.json : parse_file() component lists per fixture +- trees/.txt : into_str() renders for directories/files +- counts/.json : count_tokens_lines() per fixture (wc tokenizer) +""" +from pathlib import Path +import json +import sys + +sys.path.insert(0, str(Path(__file__).resolve().parents[2])) + +import tree_plus_src as tree_plus # noqa: E402 +from tree_plus_src import engine # noqa: E402 + +REPO = Path(__file__).resolve().parents[2] +OUT = REPO / "tests" / "golden" / "legacy" + +FIXTURE_DIRS = [ + REPO / "tests" / "path_to_test", + REPO / "tests" / "more_languages", + REPO / "tests" / "dot_dot", +] + +TREE_TARGETS = { + "path_to_test": ("tests/path_to_test",), + "more_languages": ("tests/more_languages",), + "more_languages_group1": ("tests/more_languages/group1",), + "more_languages_group2": ("tests/more_languages/group2",), + "more_languages_group3": ("tests/more_languages/group3",), + "more_languages_group4": ("tests/more_languages/group4",), + "more_languages_group5": ("tests/more_languages/group5",), + "more_languages_group6": ("tests/more_languages/group6",), + "more_languages_group7": ("tests/more_languages/group7",), + "more_languages_group_lisp": ("tests/more_languages/group_lisp",), + "more_languages_group_todo": ("tests/more_languages/group_todo",), + "dot_dot": ("tests/dot_dot",), + "multi_seed": ("tests/path_to_test", "tests/more_languages/group1"), +} + + +def sanitize(p: Path) -> str: + rel = p.relative_to(REPO) + return str(rel).replace("/", "__") + + +DEFERRED_PARSERS = [ + "parse_php", "parse_kt", "parse_swift", "parse_go", "parse_bash", + "parse_ps1", "parse_zig", "parse_rb", "parse_sql", "parse_graphql", + "parse_cs", "parse_jl", "parse_scala", "parse_java", "parse_perl", + "parse_hs", "parse_fsharp", "parse_lisp", "parse_erl", "parse_capnp", + "parse_grpc", "parse_tex", "parse_lean", "parse_fortran", "parse_tf", 
+ "parse_isabelle", "parse_lua", "parse_tcl", "parse_objective_c", + "parse_matlab", "parse_r", "parse_mathematica", "parse_ocaml", + "parse_cbl", "parse_apl", "parse_metal", "parse_wgsl", + "parse_tensorflow_flags", +] + +TS_ARTIFACT = ' return("Standalone function with parameters")' + + +def _patch_for_v1_scope() -> None: + import importlib + + # the package re-exports the function; we need the module itself + pf = importlib.import_module("tree_plus_src.parse_file") + + for name in DEFERRED_PARSERS: + if hasattr(pf, name): + setattr(pf, name, lambda *a, **k: []) + original_parse_ts = pf.parse_ts + + def parse_ts_no_artifact(*a, **k): + return [c for c in original_parse_ts(*a, **k) if c != TS_ARTIFACT] + + pf.parse_ts = parse_ts_no_artifact + # the engine's read_file/parse caches must not leak between runs + pf.read_file.cache_clear() + + +def main() -> None: + (OUT / "components").mkdir(parents=True, exist_ok=True) + (OUT / "trees").mkdir(parents=True, exist_ok=True) + (OUT / "counts").mkdir(parents=True, exist_ok=True) + + n_components = 0 + n_counts = 0 + for fixture_dir in FIXTURE_DIRS: + for path in sorted(fixture_dir.rglob("*")): + if not path.is_file(): + continue + if "__pycache__" in path.parts or path.suffix == ".pyc": + continue + # skip anything the default ignore set would hide (test artifacts + # like .hypothesis, caches, compiled files) + if tree_plus.should_ignore(path, tree_plus.DEFAULT_IGNORE): + continue + rel = str(path.relative_to(REPO)) + try: + components = tree_plus.parse_file(rel) + except Exception as e: + components = [f"__EXCEPTION__: {type(e).__name__}: {e}"] + (OUT / "components" / f"{sanitize(path)}.json").write_text( + json.dumps( + {"path": rel, "components": components}, + indent=1, + ensure_ascii=False, + ) + ) + n_components += 1 + try: + count = tree_plus.count_tokens_lines(rel) + except Exception: + count = None + payload = ( + None + if count is None + else {"n_tokens": count.n_tokens, "n_lines": count.n_lines} + ) + (OUT / 
"counts" / f"{sanitize(path)}.json").write_text( + json.dumps({"path": rel, "count": payload}, ensure_ascii=False) + ) + n_counts += 1 + + for name, seeds in TREE_TARGETS.items(): + root = engine.from_seeds(seeds, tokenizer_name=tree_plus.TokenizerName.WC) + (OUT / "trees" / f"{name}.txt").write_text(root.into_str()) + + # v1-scope tree goldens: legacy output with extractors outside the Rust + # port's version-1 scope stubbed out (deferred languages keep markers but + # lose components), plus documented intentional differences. + _patch_for_v1_scope() + (OUT / "trees_v1").mkdir(parents=True, exist_ok=True) + for name, seeds in TREE_TARGETS.items(): + root = engine.from_seeds(seeds, tokenizer_name=tree_plus.TokenizerName.WC) + (OUT / "trees_v1" / f"{name}.txt").write_text(root.into_str()) + + # concise render of the whole repo (deterministic, no parsing) + root = engine.from_seeds( + (str(REPO),), tokenizer_name=tree_plus.TokenizerName.WC, concise=True + ) + concise = root.into_str() + # the root label embeds the repo dir name only; keep as-is + (OUT / "trees" / "repo_concise.txt").write_text(concise) + + print(f"wrote {n_components} component files, {n_counts} counts," + f" {len(TREE_TARGETS) + 1} trees -> {OUT}") + + +if __name__ == "__main__": + main() diff --git a/tests/golden/legacy/components/tests__dot_dot__my_test_file.py.json b/tests/golden/legacy/components/tests__dot_dot__my_test_file.py.json new file mode 100644 index 0000000..bd87a0a --- /dev/null +++ b/tests/golden/legacy/components/tests__dot_dot__my_test_file.py.json @@ -0,0 +1,6 @@ +{ + "path": "tests/dot_dot/my_test_file.py", + "components": [ + "def dot_dot_dot()" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__dot_dot__nested_dir__.env.test.json b/tests/golden/legacy/components/tests__dot_dot__nested_dir__.env.test.json new file mode 100644 index 0000000..9020f5f --- /dev/null +++ b/tests/golden/legacy/components/tests__dot_dot__nested_dir__.env.test.json @@ 
-0,0 +1,6 @@ +{ + "path": "tests/dot_dot/nested_dir/.env.test", + "components": [ + "DEBUG_TREE_PLUS" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__dot_dot__nested_dir__pytest.ini.json b/tests/golden/legacy/components/tests__dot_dot__nested_dir__pytest.ini.json new file mode 100644 index 0000000..30376bf --- /dev/null +++ b/tests/golden/legacy/components/tests__dot_dot__nested_dir__pytest.ini.json @@ -0,0 +1,4 @@ +{ + "path": "tests/dot_dot/nested_dir/pytest.ini", + "components": [] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__dot_dot__nested_dir__test_tp_dotdot.py.json b/tests/golden/legacy/components/tests__dot_dot__nested_dir__test_tp_dotdot.py.json new file mode 100644 index 0000000..7013c5f --- /dev/null +++ b/tests/golden/legacy/components/tests__dot_dot__nested_dir__test_tp_dotdot.py.json @@ -0,0 +1,7 @@ +{ + "path": "tests/dot_dot/nested_dir/test_tp_dotdot.py", + "components": [ + "def ignore_tokens_lines_test(text: str) -> str", + "def test_tree_plus_dotdot()" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group1__CUSTOMER-INVOICE.CBL.json b/tests/golden/legacy/components/tests__more_languages__group1__CUSTOMER-INVOICE.CBL.json new file mode 100644 index 0000000..c3d6441 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group1__CUSTOMER-INVOICE.CBL.json @@ -0,0 +1,40 @@ +{ + "path": "tests/more_languages/group1/CUSTOMER-INVOICE.CBL", + "components": [ + "IDENTIFICATION DIVISION.", + "PROGRAM-ID. CUSTOMER-INVOICE.", + "AUTHOR. JANE DOE.", + "DATE. 2023-12-30.", + " DATE-COMPILED. 06/30/10.", + " DATE-WRITTEN. 
12/34/56.", + "ENVIRONMENT DIVISION.", + "INPUT-OUTPUT SECTION.", + "FILE-CONTROL.", + " SELECT CUSTOMER-FILE.", + " SELECT INVOICE-FILE.", + " SELECT REPORT-FILE.", + "DATA DIVISION.", + "FILE SECTION.", + "FD CUSTOMER-FILE.", + "01 CUSTOMER-RECORD.", + " 05 CUSTOMER-ID.", + " 05 CUSTOMER-NAME.", + " 05 CUSTOMER-BALANCE.", + "FD INVOICE-FILE.", + "01 INVOICE-RECORD.", + " 05 INVOICE-ID.", + " 05 CUSTOMER-ID.", + " 05 INVOICE-AMOUNT.", + "FD REPORT-FILE.", + "01 REPORT-RECORD.", + "WORKING-STORAGE SECTION.", + "01 WS-CUSTOMER-FOUND.", + "01 WS-END-OF-FILE.", + "01 WS-TOTAL-BALANCE.", + "PROCEDURE DIVISION.", + "0000-MAIN-ROUTINE.", + "1000-PROCESS-RECORDS.", + "1100-UPDATE-CUSTOMER-BALANCE.", + "END PROGRAM CUSTOMER-INVOICE." + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group1__JavaTest.java.json b/tests/golden/legacy/components/tests__more_languages__group1__JavaTest.java.json new file mode 100644 index 0000000..45b740d --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group1__JavaTest.java.json @@ -0,0 +1,29 @@ +{ + "path": "tests/more_languages/group1/JavaTest.java", + "components": [ + "abstract class LivingBeing", + " abstract void breathe()", + "interface Communicator", + " String communicate()", + "@Log", + "@Getter", + "@Setter", + "class Person extends LivingBeing implements Communicator", + " Person(String name, int age)", + " @Override", + " void breathe()", + " @Override", + " public String communicate()", + " void greet()", + " String personalizedGreeting(String greeting, Optional includeAge)", + "@Singleton", + "@RestController", + "@SpringBootApplication", + "public class Example", + " @Inject", + " public Example(Person person)", + " @RequestMapping(\"/greet\")", + " String home(@RequestParam(value = \"name\", defaultValue = \"World\") String name,\n @RequestParam(value = \"age\", defaultValue = \"30\") int age)", + " public static void main(String[] args)" + ] +} \ No 
newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group1__JuliaTest.jl.json b/tests/golden/legacy/components/tests__more_languages__group1__JuliaTest.jl.json new file mode 100644 index 0000000..65e67ee --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group1__JuliaTest.jl.json @@ -0,0 +1,17 @@ +{ + "path": "tests/more_languages/group1/JuliaTest.jl", + "components": [ + "module JuliaTest_EdgeCase", + "struct Location\n name::String \n lat::Float32\n lon::Float32\nend", + "mutable struct mPerson\n name::String\n age::Int\nend", + "Base.@kwdef mutable struct Param\n Δt::Float64 = 0.1\n n::Int64\n m::Int64\nend", + " sic(x,y)", + "welcome(l::Location)", + "∑(α, Ω)", + "function noob()\nend", + "function ye_olde(hello::String, world::Location)\nend", + "function multiline_greet(\n p::mPerson, \n greeting::String\n )\nend", + "function julia_is_awesome(prob::DiffEqBase.AbstractDAEProblem{uType, duType, tType,\n isinplace};\n kwargs...) 
where {uType, duType, tType, isinplace}\nend", + "end" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group1__KotlinTest.kt.json b/tests/golden/legacy/components/tests__more_languages__group1__KotlinTest.kt.json new file mode 100644 index 0000000..bedeb34 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group1__KotlinTest.kt.json @@ -0,0 +1,52 @@ +{ + "path": "tests/more_languages/group1/KotlinTest.kt", + "components": [ + "data class Person(val name: String)", + "fun greet(person: Person)", + "fun processItems(items: List, processor: (T) -> Unit)", + "interface Source", + " fun nextT(): T", + "fun MutableList.swap(index1: Int, index2: Int)", + "fun Any?.toString(): String", + "tailrec fun findFixPoint(x: Double = 1.0): Double", + "class GenericRepository", + " fun getItem(id: Int): T?", + "sealed interface Error", + "sealed class IOError(): Error", + "object Runner", + " inline fun , T> run() : T", + "infix fun Int.shl(x: Int): Int", + "class MyStringCollection", + " infix fun add(s: String)", + " fun build()", + "open class Base(p: Int)", + "class Derived(p: Int) : Base(p)", + "open class Shape", + " open fun draw()", + " fun fill()", + " open fun edge(case: Int)", + "interface Thingy", + " fun edge()", + "class Circle() : Shape(), Thingy", + " override fun draw()", + " final override fun edge(case: Int)", + "interface Base", + " fun print()", + "class BaseImpl(val x: Int) : Base", + " override fun print()", + "internal class Derived(b: Base) : Base by b", + "class Person constructor(firstName: String)", + "class People(\n firstNames: Array,\n ages: Array(42),\n)", + " fun edgeCases(): Boolean", + "class Alien public @Inject constructor(\n val firstName: String,\n val lastName: String,\n var age: Int,\n val pets: MutableList = mutableListOf(),\n)", + " fun objectOriented(): String", + " enum class IntArithmetics : BinaryOperator, IntBinaryOperator", + " PLUS {\n override fun apply(t: 
Int, u: Int): Int", + " TIMES {\n override fun apply(t: Int, u: Int): Int", + " override fun applyAsInt(t: Int, u: Int)", + "fun reformat(\n str: String,\n normalizeCase: Boolean = true,\n upperCaseFirstLetter: Boolean = true,\n divideByCamelHumps: Boolean = false,\n wordSeparator: Char = ' ',\n)", + "operator fun Point.unaryMinus()", + "abstract class Polygon", + " abstract fun draw()" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group1__LuaTest.lua.json b/tests/golden/legacy/components/tests__more_languages__group1__LuaTest.lua.json new file mode 100644 index 0000000..c9da7e1 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group1__LuaTest.lua.json @@ -0,0 +1,8 @@ +{ + "path": "tests/more_languages/group1/LuaTest.lua", + "components": [ + "function HelloWorld.new", + "function HelloWorld.greet", + "function say_hello" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group1__ObjectiveCTest.m.json b/tests/golden/legacy/components/tests__more_languages__group1__ObjectiveCTest.m.json new file mode 100644 index 0000000..a6d7343 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group1__ObjectiveCTest.m.json @@ -0,0 +1,10 @@ +{ + "path": "tests/more_languages/group1/ObjectiveCTest.m", + "components": [ + "@interface HelloWorld", + "@interface HelloWorld -> (void) sayHello", + "@implementation HelloWorld", + "@implementation HelloWorld -> (void) sayHello", + "void sayHelloWorld()" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group1__OcamlTest.ml.json b/tests/golden/legacy/components/tests__more_languages__group1__OcamlTest.ml.json new file mode 100644 index 0000000..e4d37de --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group1__OcamlTest.ml.json @@ -0,0 +1,9 @@ +{ + "path": "tests/more_languages/group1/OcamlTest.ml", + "components": 
[ + "type color", + "class hello", + "class hello -> method say_hello", + "let main ()" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group1__addamt.cobol.json b/tests/golden/legacy/components/tests__more_languages__group1__addamt.cobol.json new file mode 100644 index 0000000..d906d17 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group1__addamt.cobol.json @@ -0,0 +1,20 @@ +{ + "path": "tests/more_languages/group1/addamt.cobol", + "components": [ + "IDENTIFICATION DIVISION.", + "PROGRAM-ID.\n ADDAMT.", + "DATA DIVISION.", + "WORKING-STORAGE SECTION.", + "01 KEYED-INPUT.", + " 05 CUST-NO-IN.", + " 05 AMT1-IN.", + " 05 AMT2-IN.", + " 05 AMT3-IN.", + "01 DISPLAYED-OUTPUT.", + " 05 CUST-NO-OUT.", + " 05 TOTAL-OUT.", + "01 MORE-DATA.", + "PROCEDURE DIVISION.", + "100-MAIN." + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group1__lesson.cbl.json b/tests/golden/legacy/components/tests__more_languages__group1__lesson.cbl.json new file mode 100644 index 0000000..45182c6 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group1__lesson.cbl.json @@ -0,0 +1,45 @@ +{ + "path": "tests/more_languages/group1/lesson.cbl", + "components": [ + "IDENTIFICATION DIVISION.", + "PROGRAM-ID. CBL0002.", + "AUTHOR. Otto B. 
Fun.", + "ENVIRONMENT DIVISION.", + "INPUT-OUTPUT SECTION.", + "FILE-CONTROL.", + " SELECT PRINT-LINE.", + " SELECT ACCT-REC.", + "DATA DIVISION.", + "FILE SECTION.", + "FD PRINT-LINE.", + "01 PRINT-REC.", + " 05 ACCT-NO-O.", + " 05 ACCT-LIMIT-O.", + " 05 ACCT-BALANCE-O.", + " 05 LAST-NAME-O.", + " 05 FIRST-NAME-O.", + " 05 COMMENTS-O.", + "FD ACCT-REC.", + "01 ACCT-FIELDS.", + " 05 ACCT-NO.", + " 05 ACCT-LIMIT.", + " 05 ACCT-BALANCE.", + " 05 LAST-NAME.", + " 05 FIRST-NAME.", + " 05 CLIENT-ADDR.", + " 10 STREET-ADDR.", + " 10 CITY-COUNTY.", + " 10 USA-STATE.", + " 05 RESERVED.", + " 05 COMMENTS.", + "WORKING-STORAGE SECTION.", + "01 FLAGS.", + " 05 LASTREC.", + "PROCEDURE DIVISION.", + "OPEN-FILES.", + "READ-NEXT-RECORD.", + "CLOSE-STOP.", + "READ-RECORD.", + "WRITE-RECORD." + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group1__test.js.json b/tests/golden/legacy/components/tests__more_languages__group1__test.js.json new file mode 100644 index 0000000..acd47be --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group1__test.js.json @@ -0,0 +1,40 @@ +{ + "path": "tests/more_languages/group1/test.js", + "components": [ + "class MyClass", + " myMethod()", + " async asyncMethod(a, b)", + " methodWithDefaultParameters(a = 5, b = 10)", + " multilineMethod(\n c,\n d\n )", + " multilineMethodWithDefaults(\n t = \"tree\",\n p = \"plus\"\n )", + "function myFunction(param1, param2)", + "function multilineFunction(\n param1,\n param2\n)", + "const arrowFunction = () =>", + "const parametricArrow = (a, b) =>", + "function ()", + "function outerFunction(outerParam)", + " function innerFunction(innerParam)", + " innerFunction(\"inner\")", + "const myObject = {", + " myMethod: function (stuff)", + "let myArrowObject = {", + " myArrow: ({\n a,\n b,\n c,\n }) =>", + "const myAsyncArrowFunction = async () =>", + "function functionWithRestParameters(...args)", + "const namedFunctionExpression = function 
myNamedFunction()", + "const multilineArrowFunction = (\n a,\n b\n) =>", + "function functionReturningFunction()", + " return function ()", + "function destructuringOnMultipleLines({\n a,\n b,\n})", + "const arrowFunctionWithDestructuring = ({ a, b }) =>", + "const multilineDestructuringArrow = ({\n a,\n b,\n}) =>", + "async function asyncFunctionWithErrorHandling()", + "class Car", + " constructor(brand)", + " present()", + "class Model extends Car", + " constructor(brand, mod)", + " super(brand)", + " show()" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group1__test.ts.json b/tests/golden/legacy/components/tests__more_languages__group1__test.ts.json new file mode 100644 index 0000000..9ae0272 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group1__test.ts.json @@ -0,0 +1,41 @@ +{ + "path": "tests/more_languages/group1/test.ts", + "components": [ + "type MyType", + "interface MyInterface", + "class TsClass", + " myMethod()", + " myMethodWithArgs(param1: string, param2: number): void", + " static myStaticMethod(param: T): T", + " multilineMethod(\n c: number,\n d: number\n ): number", + " multilineMethodWithDefaults(\n t: string = \"tree\",\n p: string = \"plus\"\n ): string", + "export class AdvancedComponent implements MyInterface", + " async myAsyncMethod(\n a: string,\n b: number,\n c: string\n ): Promise", + " genericMethod(\n arg1: T,\n arg2: U\n ): [T, U]", + "export class TicketsComponent implements MyInterface", + " async myAsyncMethod({ a, b, c }: { a: String; b: Number; c: String })", + "function tsFunction()", + "function tsFunctionSigned(\n param1: number,\n param2: number\n): void", + "export default async function tsFunctionComplicated({\n a = 1 | 2,\n b = \"bob\",\n c = async () => \"charlie\",\n}: {\n a: number;\n b: string;\n c: () => Promise;\n}): Promise", + " return(\"Standalone function with parameters\")", + "const tsArrowFunctionSigned = ({\n a,\n b,\n}: {\n 
a: number;\n b: string;\n}) =>", + "export const tsComplicatedArrow = async ({\n a = 1 | 2,\n b = \"bob\",\n c = async () => \"charlie\",\n}: {\n a: number;\n b: string;\n c: () => Promise;\n}): Promise =>", + "const arrowFunction = () =>", + "const arrow = (a: String, b: Number) =>", + "const asyncArrowFunction = async () =>", + "const asyncArrow = async (a: String, b: Number) =>", + "let weirdArrow = () =>", + "const asyncPromiseArrow = async (): Promise =>", + "let myWeirdArrowSigned = (x: number): number =>", + "class Person", + " constructor(private firstName: string, private lastName: string)", + " getFullName(): string", + " describe(): string", + "class Employee extends Person", + " constructor(\n firstName: string,\n lastName: string,\n private jobTitle: string\n )", + " super(firstName, lastName)", + " describe(): string", + "interface Shape", + "interface Square extends Shape" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group2__PerlTest.pl.json b/tests/golden/legacy/components/tests__more_languages__group2__PerlTest.pl.json new file mode 100644 index 0000000..052b564 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group2__PerlTest.pl.json @@ -0,0 +1,9 @@ +{ + "path": "tests/more_languages/group2/PerlTest.pl", + "components": [ + "package PerlTest", + "package PerlTest -> sub new", + "package PerlTest -> sub hello", + "package PerlTest -> sub say_hello" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group2__PhpTest.php.json b/tests/golden/legacy/components/tests__more_languages__group2__PhpTest.php.json new file mode 100644 index 0000000..42ef00f --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group2__PhpTest.php.json @@ -0,0 +1,10 @@ +{ + "path": "tests/more_languages/group2/PhpTest.php", + "components": [ + "class HelloWorld", + "class HelloWorld -> function sayHello", + "function greet", + 
"class Person", + "class Person -> function __construct" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group2__PowershellTest.ps1.json b/tests/golden/legacy/components/tests__more_languages__group2__PowershellTest.ps1.json new file mode 100644 index 0000000..db3bdfc --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group2__PowershellTest.ps1.json @@ -0,0 +1,27 @@ +{ + "path": "tests/more_languages/group2/PowershellTest.ps1", + "components": [ + "function Say-Nothing()", + "class Person", + " Person([string]$name)", + " [string]Greet()", + " [string]GreetMany([int]$times)", + " [string]GreetWithDetails([string]$greeting, [int]$times)", + " [string]GreetMultiline(\n [string]$greeting,\n [int]$times\n )", + " NoReturn([int]$times)", + " NoReturnNoArgs()", + "function Say-Hello([Person]$person)", + "function Multi-Hello([Person]$personA, [Person]$personB)", + "function Switch-Item", + " param ([switch]$on)", + "function Get-SmallFiles", + " param (\n [PSDefaultValue(Help = '100')]\n $Size = 100)", + "function Get-User", + " [CmdletBinding(DefaultParameterSetName=\"ID\")]", + " [OutputType(\"System.Int32\", ParameterSetName=\"ID\")]", + " [OutputType([String], ParameterSetName=\"Name\")]", + " Param (\n [parameter(Mandatory=$true, ParameterSetName=\"ID\")]\n [Int[]]\n $UserID,\n [parameter(Mandatory=$true, ParameterSetName=\"Name\")]\n [String[]]\n $UserName)", + "filter Get-ErrorLog ([switch]$Message)", + "function global:MultilineSignature(\n [string]$param1,\n [int]$param2,\n [Parameter(Mandatory=$true)]\n [string]$param3\n)" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group2__ScalaTest.scala.json b/tests/golden/legacy/components/tests__more_languages__group2__ScalaTest.scala.json new file mode 100644 index 0000000..f21daf1 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group2__ScalaTest.scala.json @@ 
-0,0 +1,16 @@ +{ + "path": "tests/more_languages/group2/ScalaTest.scala", + "components": [ + "def sumOfSquares(x: Int, y: Int): Int", + "trait Bark", + " def bark: String", + "case class Person(name: String)", + "class GenericClass[T](\n val data: T,\n val count: Int\n)", + " def getData: T", + "object HelloWorld", + " def greet(person: Person): Unit", + " def main(args: Array[String]): Unit", + "def complexFunction(\n a: Int,\n b: String,\n c: Float\n): (Int, String) Option", + "def sumOfSquaresShort(x: Int, y: Int): Int" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group2__apl_test.apl.json b/tests/golden/legacy/components/tests__more_languages__group2__apl_test.apl.json new file mode 100644 index 0000000..b4d2954 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group2__apl_test.apl.json @@ -0,0 +1,8 @@ +{ + "path": "tests/more_languages/group2/apl_test.apl", + "components": [ + ":Namespace HelloWorld", + ":Namespace HelloWorld -> hello ← 'Hello, World!'", + ":Namespace HelloWorld -> plus ← {⍺+⍵}" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group2__c_test.c.json b/tests/golden/legacy/components/tests__more_languages__group2__c_test.c.json new file mode 100644 index 0000000..cc70367 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group2__c_test.c.json @@ -0,0 +1,39 @@ +{ + "path": "tests/more_languages/group2/c_test.c", + "components": [ + "struct Point", + " int x;", + " int y;", + "struct Point getOrigin()", + "float mul_two_floats(float x1, float x2)", + "enum days", + " SUN,", + " MON,", + " TUE,", + " WED,", + " THU,", + " FRI,", + " SAT", + "long add_two_longs(long x1, long x2)", + "double multiplyByTwo(double num)", + "char getFirstCharacter(char *str)", + "void greet(Person p)", + "typedef struct", + " char name[50];", + "} Person;", + "int main()", + "int* getArrayStart(int arr[], int 
size)", + "long complexFunctionWithMultipleArguments(\n int param1,\n double param2,\n char *param3,\n struct Point point\n)", + "keyPattern *ACLKeyPatternCreate(sds pattern, int flags)", + "sds sdsCatPatternString(sds base, keyPattern *pat)", + "static int ACLCheckChannelAgainstList(list *reference, const char *channel, int channellen, int is_pattern)", + " while((ln = listNext(&li)))", + "static struct config", + " aeEventLoop *el;", + " cliConnInfo conn_info;", + " const char *hostsocket;", + " int tls;", + " struct cliSSLconfig sslconfig;", + "} config;" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group2__go_test.go.json b/tests/golden/legacy/components/tests__more_languages__group2__go_test.go.json new file mode 100644 index 0000000..c2d22da --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group2__go_test.go.json @@ -0,0 +1,12 @@ +{ + "path": "tests/more_languages/group2/go_test.go", + "components": [ + "type Greeting struct", + "func (g Greeting) sayHello()", + "func createGreeting(m string) Greeting", + "type SomethingLong struct", + "func (s *SomethingLong) WithAReasonableName(\n\tctx context.Context,\n\tparam1 string,\n\tparam2 int,\n\tparam3 map[string]interface{},\n\tcallback func(int) error,\n) (resultType, error)", + "type resultType struct", + "func main()" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group2__test.csv.json b/tests/golden/legacy/components/tests__more_languages__group2__test.csv.json new file mode 100644 index 0000000..00c8f4e --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group2__test.csv.json @@ -0,0 +1,10 @@ +{ + "path": "tests/more_languages/group2/test.csv", + "components": [ + "Name", + "Age", + "Country", + "City", + "Email" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group3__bash_test.sh.json 
b/tests/golden/legacy/components/tests__more_languages__group3__bash_test.sh.json new file mode 100644 index 0000000..345abd9 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group3__bash_test.sh.json @@ -0,0 +1,11 @@ +{ + "path": "tests/more_languages/group3/bash_test.sh", + "components": [ + "echo_hello_world()", + "function fun_echo_hello_world()", + "export SECRET", + "alias md='make debug'", + "add_alias()", + "create_conda_env()" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group3__cpp_test.cpp.json b/tests/golden/legacy/components/tests__more_languages__group3__cpp_test.cpp.json new file mode 100644 index 0000000..d307291 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group3__cpp_test.cpp.json @@ -0,0 +1,74 @@ +{ + "path": "tests/more_languages/group3/cpp_test.cpp", + "components": [ + "class Person", + " std::string name;", + "public:", + " Person(std::string n) : name(n)", + " void greet()", + "void globalGreet()", + "int main()", + "void printMessage(const std::string &message)", + "template\nvoid printVector(const std::vector& vec)", + "struct Point", + " int x, y;", + " Point(int x, int y) : x(x), y(y)", + "class Animal", + "public:", + " Animal(const std::string &name) : name(name)", + " virtual void speak() const", + " virtual ~Animal()", + "protected:", + " std::string name;", + "class Dog : public Animal", + "public:", + " Dog(const std::string &name) : Animal(name)", + " void speak() const override", + "class Cat : public Animal", + "public:", + " Cat(const std::string &name) : Animal(name)", + " void speak() const override", + "nb::bytes BuildRnnDescriptor(int input_size, int hidden_size, int num_layers,\n int batch_size, int max_seq_length, float dropout,\n bool bidirectional, bool cudnn_allow_tf32,\n\t\t\t int workspace_size, int reserve_space_size)", + "int main()", + "enum ECarTypes", + " Sedan,", + " Hatchback,", + " SUV,", + " Wagon", 
+ "ECarTypes GetPreferredCarType()", + "enum ECarTypes : uint8_t", + " Sedan,", + " Hatchback,", + " SUV = 254,", + " Hybrid", + "enum class ECarTypes : uint8_t", + " Sedan,", + " Hatchback,", + " SUV = 254,", + " Hybrid", + "void myFunction(string fname, int age)", + "template T cos(T)", + "template T sin(T)", + "template T sqrt(T)", + "template struct VLEN", + "template class arr", + " private:", + " static T *ralloc(size_t num)", + " static void dealloc(T *ptr)", + " static T *ralloc(size_t num)", + " static void dealloc(T *ptr)", + " public:", + " arr() : p(0), sz(0)", + " arr(size_t n) : p(ralloc(n)), sz(n)", + " arr(arr &&other)\n : p(other.p), sz(other.sz)", + " ~arr()", + " void resize(size_t n)", + " T &operator[](size_t idx)", + " T *data()", + " size_t size() const", + "class Buffer", + " private:", + " void* ptr_;", + "std::tuple quantize(\n const array& w,\n int group_size,\n int bits,\n StreamOrDevice s)" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group3__csharp_test.cs.json b/tests/golden/legacy/components/tests__more_languages__group3__csharp_test.cs.json new file mode 100644 index 0000000..9dd7b8c --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group3__csharp_test.cs.json @@ -0,0 +1,48 @@ +{ + "path": "tests/more_languages/group3/csharp_test.cs", + "components": [ + "public interface IExcelTemplate", + " void LoadTemplate(string templateFilePath)", + " void LoadData(Dictionary data)", + " void ModifyCell(string cellName, string value)", + " void SaveToFile(string filePath)", + "public interface IGreet", + " void Greet()", + "public enum WeekDays", + "public delegate void DisplayMessage(string message)", + "public struct Address", + "public static class HelperFunctions", + " public static void PrintMessage(string message)", + " public static int AddNumbers(int a, int b)", + "namespace HelloWorldApp", + " class Person : IGreet", + " public Person(string name, int 
age)", + " public void Greet()", + " class HelloWorld", + " static void Main(string[] args)", + "namespace TemplateToExcelServer.Template", + " public interface ITemplateObject", + " string[,] GetContent()", + " string[] GetContentArray()", + " string[] GetFormat()", + " int? GetFormatLength()", + " TemplateObject SetContent(string[,] Content)", + " TemplateObject SetContentArray(string[] value)", + " TemplateObject SetFormat(string[] Header)", + " TemplateObject SetNameOfReport(\n ReadOnlyMemory ReportName,\n int[] EdgeCase)", + " TemplateObject SetSheetName(ReadOnlyMemory SheetName)", + "public class BankAccount(string accountID, string owner)", + " public override string ToString() =>", + "var IncrementBy = (int source, int increment = 1) =>", + "Func add = (x, y) =>", + "button.Click += (sender, args) =>", + "public Func GetMultiplier(int factor)", + "public void Method(\n int param1,\n int param2,\n int param3,\n int param4,\n int param5,\n int param6,\n )", + "System.Net.ServicePointManager.ServerCertificateValidationCallback +=\n (se, cert, chain, sslerror) =>", + "class ServerCertificateValidation", + " public bool OnRemoteCertificateValidation(\n object se,\n X509Certificate cert,\n X509Chain chain,\n SslPolicyErrors sslerror\n )", + "s_downloadButton.Clicked += async (o, e) =>", + "[HttpGet, Route(\"DotNetCount\")]", + "static public async Task GetDotNetCount(string URL)" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group3__hallucination.tex.json b/tests/golden/legacy/components/tests__more_languages__group3__hallucination.tex.json new file mode 100644 index 0000000..555c5f8 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group3__hallucination.tex.json @@ -0,0 +1,30 @@ +{ + "path": "tests/more_languages/group3/hallucination.tex", + "components": [ + "Harnessing the Master Algorithm: Strategies for AI LLMs to Mitigate Hallucinations", + "Hallucinated Pedro Domingos et al.", 
+ "Christmas Eve 2023", + "1 Introduction", + "2 Representation in LLMs", + " 2.1 Current Representational Models", + " 2.2 Incorporating Cognitive Structures", + " 2.3 Conceptual Diagrams of Advanced Representational Models", + "3 Evaluation Strategies", + " 3.1 Existing Evaluation Metrics for LLMs", + " 3.2 Integrating Contextual and Ethical Considerations", + " 3.3 Case Studies: Evaluation in Practice", + "4 Optimization Techniques", + " 4.1 Continuous Learning Models", + " 4.2 Adaptive Algorithms for Real-time Adjustments", + " 4.3 Performance Metrics Pre- and Post-Optimization", + "5 Interdisciplinary Insights", + " 5.1 Cognitive Science and AI: A Symbiotic Relationship", + " 5.2 Learning from Human Cognitive Processes", + "6 Challenges and Future Directions", + " 6.1 Addressing Current Limitations", + " 6.2 The Road Ahead: Ethical and Practical Considerations", + "7 Conclusion", + " 7.1 Summarizing Key Findings", + " 7.2 The Next Steps in AI Development" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group3__ruby_test.rb.json b/tests/golden/legacy/components/tests__more_languages__group3__ruby_test.rb.json new file mode 100644 index 0000000..4552c04 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group3__ruby_test.rb.json @@ -0,0 +1,14 @@ +{ + "path": "tests/more_languages/group3/ruby_test.rb", + "components": [ + "module Greeter", + " def self.say_hello", + "class HelloWorld", + " def say_hello", + "class Human", + " def self.bar", + " def self.bar=(value)", + "class Doctor < Human", + " def brachial_plexus(\n roots,\n trunks,\n divisions: true,\n cords: [],\n branches: Time.now\n )" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group3__swift_test.swift.json b/tests/golden/legacy/components/tests__more_languages__group3__swift_test.swift.json new file mode 100644 index 0000000..41f385f --- /dev/null +++ 
b/tests/golden/legacy/components/tests__more_languages__group3__swift_test.swift.json @@ -0,0 +1,23 @@ +{ + "path": "tests/more_languages/group3/swift_test.swift", + "components": [ + "class Person", + " init(name: String)", + " func greet()", + " func yEdgeCase(\n fname: String, \n lname: String, \n age: Int,\n address: String, \n phoneNumber: String\n )", + "func globalGreet()", + "struct Point", + "protocol Animal", + " func speak()", + "struct Dog: Animal", + "class Cat: Animal", + " init(name: String)", + " func speak()", + "enum CarType", + "func getPreferredCarType() -> CarType", + "enum CarType: UInt8", + "enum class CarType: UInt8", + "func myFunction(fname: String, age: Int)", + "func myFunctionWithMultipleParameters(\n fname: String, \n lname: String, \n age: Int, \n address: String, \n phoneNumber: String\n)" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group3__test.capnp.json b/tests/golden/legacy/components/tests__more_languages__group3__test.capnp.json new file mode 100644 index 0000000..c42af21 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group3__test.capnp.json @@ -0,0 +1,23 @@ +{ + "path": "tests/more_languages/group3/test.capnp", + "components": [ + "struct Employee", + " id @0 :Int32", + " name @1 :Text", + " role @2 :Text", + " skills @3 :List(Skill)", + " struct Skill", + " name @0 :Text", + " level @1 :Level", + " enum Level", + " beginner @0", + " intermediate @1", + " expert @2", + " status :union", + " active @4 :Void", + " onLeave @5 :Void", + " retired @6 :Void", + "struct Company", + " employees @0 :List(Employee)" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group3__test.graphql.json b/tests/golden/legacy/components/tests__more_languages__group3__test.graphql.json new file mode 100644 index 0000000..e59dd5f --- /dev/null +++ 
b/tests/golden/legacy/components/tests__more_languages__group3__test.graphql.json @@ -0,0 +1,19 @@ +{ + "path": "tests/more_languages/group3/test.graphql", + "components": [ + "type Query", + " getBooks: [Book]", + " getAuthors: [Author]", + "type Mutation", + " addBook(title: String, author: String): Book", + " removeBook(id: ID): Book", + "type Book", + " id: ID", + " title: String", + " author: Author", + "type Author", + " id: ID", + " name: String", + " books: [Book]" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group3__test.lean.json b/tests/golden/legacy/components/tests__more_languages__group3__test.lean.json new file mode 100644 index 0000000..b6734bb --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group3__test.lean.json @@ -0,0 +1,15 @@ +{ + "path": "tests/more_languages/group3/test.lean", + "components": [ + "# Advanced Topics in Group Theory", + "section GroupDynamics", + "lemma group_stability (G : Type*) [Group G] (H : Subgroup G)", + "theorem subgroup_closure {G : Type*} [Group G] (S : Set G)", + "axiom group_homomorphism_preservation {G H : Type*} [Group G] [Group H] (f : G → H)", + "end GroupDynamics", + "section ConstructiveApproach", + "lemma finite_group_order (G : Type*) [Group G] [Fintype G]", + "lemma complex_lemma {X Y : Type*} [SomeClass X] [AnotherClass Y]\n (f : X → Y) (g : Y → X)", + "end ConstructiveApproach" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group3__test.proto.json b/tests/golden/legacy/components/tests__more_languages__group3__test.proto.json new file mode 100644 index 0000000..39d378a --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group3__test.proto.json @@ -0,0 +1,23 @@ +{ + "path": "tests/more_languages/group3/test.proto", + "components": [ + "syntax = \"proto3\"", + "service EmployeeService", + " rpc GetEmployee(EmployeeId) returns (EmployeeInfo)", + " rpc 
AddEmployee(EmployeeData) returns (EmployeeInfo)", + " rpc UpdateEmployee(EmployeeUpdate) returns (EmployeeInfo)", + "message EmployeeId", + " int32 id = 1", + "message EmployeeInfo", + " int32 id = 1", + " string name = 2", + " string role = 3", + "message EmployeeData", + " string name = 1", + " string role = 2", + "message EmployeeUpdate", + " int32 id = 1", + " string name = 2", + " string role = 3" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group3__test.sqlite.json b/tests/golden/legacy/components/tests__more_languages__group3__test.sqlite.json new file mode 100644 index 0000000..c604a8b --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group3__test.sqlite.json @@ -0,0 +1,13 @@ +{ + "path": "tests/more_languages/group3/test.sqlite", + "components": [ + "students table:", + " id integer primary key", + " name text not null", + " age integer not null", + "courses table:", + " id integer primary key", + " title text not null", + " credits integer not null" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group3__test_Cargo.toml.json b/tests/golden/legacy/components/tests__more_languages__group3__test_Cargo.toml.json new file mode 100644 index 0000000..a50605a --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group3__test_Cargo.toml.json @@ -0,0 +1,12 @@ +{ + "path": "tests/more_languages/group3/test_Cargo.toml", + "components": [ + "name: test_cargo", + "version: 0.1.0", + "description: A test Cargo.toml", + "license: MIT OR Apache-2.0", + "dependencies:", + " clap 4.4", + " sqlx 0.7 (features: runtime-tokio, tls-rustls)" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group3__test_json_rpc_2_0.json.json b/tests/golden/legacy/components/tests__more_languages__group3__test_json_rpc_2_0.json.json new file mode 100644 index 0000000..764830e --- /dev/null 
+++ b/tests/golden/legacy/components/tests__more_languages__group3__test_json_rpc_2_0.json.json @@ -0,0 +1,11 @@ +{ + "path": "tests/more_languages/group3/test_json_rpc_2_0.json", + "components": [ + "jsonrpc: 2.0", + "method: subtract", + "params:", + " minuend: 42", + " subtrahend: 23", + "id: 1" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group3__test_openapi.yaml.json b/tests/golden/legacy/components/tests__more_languages__group3__test_openapi.yaml.json new file mode 100644 index 0000000..3a9f9d1 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group3__test_openapi.yaml.json @@ -0,0 +1,16 @@ +{ + "path": "tests/more_languages/group3/test_openapi.yaml", + "components": [ + "openapi: 3.0.1", + " title: TODO Plugin", + " description: A plugin to create and manage TODO lists using ChatGPT.", + " version: v1", + "servers:", + " - url: PLUGIN_HOSTNAME", + "paths:", + " '/todos/{username}':", + " GET (getTodos): Get the list of todos", + " POST (addTodo): Add a todo to the list", + " DELETE (deleteTodo): Delete a todo from the list" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group3__test_openrpc.json.json b/tests/golden/legacy/components/tests__more_languages__group3__test_openrpc.json.json new file mode 100644 index 0000000..0833942 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group3__test_openrpc.json.json @@ -0,0 +1,14 @@ +{ + "path": "tests/more_languages/group3/test_openrpc.json", + "components": [ + "openrpc: 1.2.1", + "info:", + " title: Demo Petstore", + " version: 1.0.0", + "methods:", + " listPets: List all pets", + " params:", + " - limit: integer", + " result: pets = An array of pets" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group3__test_pyproject.toml.json 
b/tests/golden/legacy/components/tests__more_languages__group3__test_pyproject.toml.json new file mode 100644 index 0000000..dd96d4f --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group3__test_pyproject.toml.json @@ -0,0 +1,16 @@ +{ + "path": "tests/more_languages/group3/test_pyproject.toml", + "components": [ + "name: tree_plus", + "version: 1.0.8", + "description: A `tree` util enhanced with tokens, lines, and components.", + "License :: OSI Approved :: Apache Software License", + "License :: OSI Approved :: MIT License", + "dependencies:", + " tiktoken", + " PyYAML", + " click", + " rich", + " tomli" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group4__RTest.R.json b/tests/golden/legacy/components/tests__more_languages__group4__RTest.R.json new file mode 100644 index 0000000..70ff2a6 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group4__RTest.R.json @@ -0,0 +1,9 @@ +{ + "path": "tests/more_languages/group4/RTest.R", + "components": [ + "class(person)", + "greet.Person <- function", + "ensure_between = function", + "run_intermediate_annealing_process = function" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group4__erl_test.erl.json b/tests/golden/legacy/components/tests__more_languages__group4__erl_test.erl.json new file mode 100644 index 0000000..c7aeb6f --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group4__erl_test.erl.json @@ -0,0 +1,20 @@ +{ + "path": "tests/more_languages/group4/erl_test.erl", + "components": [ + "-module(erl_test).", + "-record(person).", + "-type ra_peer_status().", + "-type ra_membership().", + "-opaque my_opaq_type().", + "-type orddict(Key, Val).", + "-type edge(\n Cases,\n Pwn,\n ).", + "-spec guarded(X) -> X when X :: tuple().", + "-spec edge_case(\n {integer(), any()} | [any()]\n ) -> processed, integer(), any()} | [{item, any()}].", + 
"-spec complex_function({integer(), any()} | [any()]) -> \n {processed, integer(), any()} | [{item, any()}].", + "-spec list_manipulation([integer()]) -> [integer()].", + "-spec overload(T1, T2) -> T3\n ; (T4, T5) -> T6.", + "-spec multiguard({X, integer()}) -> X when X :: atom()\n ; ([Y]) -> Y when Y :: number().", + "-record(multiline).", + "-record(maybe_undefined)." + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group4__haskell_test.hs.json b/tests/golden/legacy/components/tests__more_languages__group4__haskell_test.hs.json new file mode 100644 index 0000000..668197a --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group4__haskell_test.hs.json @@ -0,0 +1,8 @@ +{ + "path": "tests/more_languages/group4/haskell_test.hs", + "components": [ + "data Person", + "greet :: Person -> String", + "resolveVariables ::\n forall m fragments.\n (MonadError QErr m, Traversable fragments) =>\n Options.BackwardsCompatibleNullInNonNullableVariables ->\n [G.VariableDefinition] ->\n GH.VariableValues ->\n [G.Directive G.Name] ->\n G.SelectionSet fragments G.Name ->\n m\n ( [G.Directive Variable],\n G.SelectionSet fragments Variable\n )" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group4__mathematica_test.nb.json b/tests/golden/legacy/components/tests__more_languages__group4__mathematica_test.nb.json new file mode 100644 index 0000000..2527c2d --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group4__mathematica_test.nb.json @@ -0,0 +1,8 @@ +{ + "path": "tests/more_languages/group4/mathematica_test.nb", + "components": [ + "person[name_]", + "sayHello[]", + "sumList[list_List]" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group4__matlab_test.m.json b/tests/golden/legacy/components/tests__more_languages__group4__matlab_test.m.json new file mode 100644 index 
0000000..4fae43d --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group4__matlab_test.m.json @@ -0,0 +1,7 @@ +{ + "path": "tests/more_languages/group4/matlab_test.m", + "components": [ + "classdef HelloWorld -> function greet", + "function loneFun" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group4__rust_test.rs.json b/tests/golden/legacy/components/tests__more_languages__group4__rust_test.rs.json new file mode 100644 index 0000000..1cc0ca6 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group4__rust_test.rs.json @@ -0,0 +1,50 @@ +{ + "path": "tests/more_languages/group4/rust_test.rs", + "components": [ + "fn at_beginning<'a>(&'a str)", + "pub enum Days {\n #\\[default]\n Sun,\n Mon,\n #\\[error(\"edge case {idx}, expected at least {} and at most {}\", .limits.lo, .limits.hi)]\n Tue,\n Wed,\n Thu(i16, bool),\n Fri { day: u8 },\n Sat {\n urday: String,\n edge_case: E,\n },\n}", + "struct Point", + "impl Point", + " fn get_origin() -> Point", + "struct Person", + "impl Person", + " fn greet(&self)", + "fn add_two_longs(x1: i64, x2: i64) -> i64", + "fn add_two_longs_longer(\n x1: i64,\n x2: i64,\n) -> i64", + "const fn multiply_by_two(num: f64) -> f64", + "fn get_first_character(s: &str) -> Option", + "trait Drawable", + " fn draw(&self)", + "impl Drawable for Point", + " fn draw(&self)", + "fn with_generic(d: D)", + "fn with_generic(d: D)\nwhere \n D: Drawable", + "fn main()", + "pub struct VisibleStruct", + "mod my_module", + " pub struct AlsoVisibleStruct(T, T)", + "macro_rules! say_hello", + "#[macro_export]\nmacro_rules! 
hello_tree_plus", + "pub mod lib", + " pub mod interfaces", + " mod engine", + "pub fn flow(\n source: S1,\n extractor: E,\n inbox: S2,\n transformer: T,\n outbox: S3,\n loader: L,\n sink: &mut S4,\n) -> Result<(), Box>\nwhere\n S1: Extractable,\n S2: Extractable + Loadable,\n S3: Extractable + Loadable,\n S4: Loadable,\n E: Extractor,\n T: Transformer,\n L: Loader", + "trait Container", + " fn items(&self) -> impl Iterator", + "trait HttpService", + " async fn fetch(&self, url: Url) -> HtmlBody", + "struct Pair", + "trait Transformer", + " fn transform(&self, input: T) -> T", + "impl + Copy> Transformer for Pair", + " fn transform(&self, input: T) -> T", + "fn main()", + "async fn handle_get(State(pool): State) -> Result, (StatusCode, String)> \nwhere\n Bion: Cool", + "#[macro_export]\nmacro_rules! unit", + " fn insert(\n &mut self,\n key: (),\n value: $unit_dtype,\n ) -> Result, ETLError>", + "pub async fn handle_get_axum_route(\n Session { maybe_claims }: Session,\n Path(RouteParams {\n alpha,\n bravo,\n charlie,\n edge_case\n }): Path,\n) -> ServerResult", + "fn encode_pipeline(cmds: &[Cmd], atomic: bool) -> Vec", + "pub async fn handle_post_yeet(\n State(auth_backend): State,\n Session { maybe_claims }: Session,\n Form(yeet_form): Form,\n) -> Result", + "pub async fn handle_get_thingy(\n session: Session,\n State(ApiBackend {\n page_cache,\n auth_backend,\n library_sql,\n some_data_cache,\n metadata_cache,\n thingy_client,\n ..\n }): State,\n) -> ServerResult" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group4__test.zig.json b/tests/golden/legacy/components/tests__more_languages__group4__test.zig.json new file mode 100644 index 0000000..cf9f233 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group4__test.zig.json @@ -0,0 +1,11 @@ +{ + "path": "tests/more_languages/group4/test.zig", + "components": [ + "pub fn add(a: i32, b: i32) i32", + "test \"add function\"", + "const 
BunBuildOptions = struct", + " pub fn updateRuntime(this: *BunBuildOptions) anyerror!void", + " pub fn step(this: BunBuildOptions, b: anytype) *std.build.OptionsStep", + "pub fn sgemv(\n order: Order,\n trans: Trans,\n m: usize,\n n: usize,\n alpha: f32,\n a: []const f32,\n lda: usize,\n x: []const f32,\n x_add: usize,\n beta: f32,\n y: []f32,\n y_add: usize,\n) void" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group4__test_fsharp.fs.json b/tests/golden/legacy/components/tests__more_languages__group4__test_fsharp.fs.json new file mode 100644 index 0000000..38162e6 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group4__test_fsharp.fs.json @@ -0,0 +1,11 @@ +{ + "path": "tests/more_languages/group4/test_fsharp.fs", + "components": [ + "module TestFSharp", + "type Person = {", + "let add x y =", + "let multiply \n (x: int) \n (y: int): int =", + "let complexFunction\n (a: int)\n (b: string)\n (c: float)\n : (int * string) option =", + "type Result<'T> =" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group4__test_tcl_tk.tcl.json b/tests/golden/legacy/components/tests__more_languages__group4__test_tcl_tk.tcl.json new file mode 100644 index 0000000..a28d142 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group4__test_tcl_tk.tcl.json @@ -0,0 +1,8 @@ +{ + "path": "tests/more_languages/group4/test_tcl_tk.tcl", + "components": [ + "proc sayHello {}", + "proc arrg { input }", + "proc multiLine {\n x,\n y\n}" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group4__tf_test.tf.json b/tests/golden/legacy/components/tests__more_languages__group4__tf_test.tf.json new file mode 100644 index 0000000..b94f569 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group4__tf_test.tf.json @@ -0,0 +1,12 @@ +{ + "path": 
"tests/more_languages/group4/tf_test.tf", + "components": [ + "provider \"aws\"", + "resource \"aws_instance\" \"example\"", + "data \"aws_ami\" \"ubuntu\"", + "variable \"instance_type\"", + "output \"instance_public_ip\"", + "locals", + "module \"vpc\"" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group5__Makefile.json b/tests/golden/legacy/components/tests__more_languages__group5__Makefile.json new file mode 100644 index 0000000..37c94b5 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group5__Makefile.json @@ -0,0 +1,14 @@ +{ + "path": "tests/more_languages/group5/Makefile", + "components": [ + "include dotenv/dev.env", + ".PHONY: dev", + "dev", + "services-down", + "services-stop: services-down", + "define CHECK_POSTGRES", + "damage-report", + "tail-logs", + "cloud" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group5__ansible_test.yml.json b/tests/golden/legacy/components/tests__more_languages__group5__ansible_test.yml.json new file mode 100644 index 0000000..128fb5d --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group5__ansible_test.yml.json @@ -0,0 +1,8 @@ +{ + "path": "tests/more_languages/group5/ansible_test.yml", + "components": [ + "Install package", + "Start service", + "Create user" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group5__app-routing.module.ts.json b/tests/golden/legacy/components/tests__more_languages__group5__app-routing.module.ts.json new file mode 100644 index 0000000..474f900 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group5__app-routing.module.ts.json @@ -0,0 +1,7 @@ +{ + "path": "tests/more_languages/group5/app-routing.module.ts", + "components": [ + "const routes: Routes = [\n { path: '', redirectTo: 'login', pathMatch: 'full' },\n { path: '*', redirectTo: 'login' },\n { path: 
'home', component: HomeComponent },\n { path: 'login', component: LoginComponent },\n { path: 'register', component: RegisterComponent },\n { path: 'events', component: EventsComponent },\n { path: 'invites', component: InvitesComponent },\n { path: 'rewards', component: RewardsComponent },\n { path: 'profile', component: ProfileComponent },\n];", + "export class AppRoutingModule" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group5__app.component.spec.ts.json b/tests/golden/legacy/components/tests__more_languages__group5__app.component.spec.ts.json new file mode 100644 index 0000000..06bade1 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group5__app.component.spec.ts.json @@ -0,0 +1,10 @@ +{ + "path": "tests/more_languages/group5/app.component.spec.ts", + "components": [ + "describe 'AppComponent'", + " it should create the app", + " it should welcome the user", + " it should welcome 'Jimbo'", + " it should request login if not logged in" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group5__app.component.ts.json b/tests/golden/legacy/components/tests__more_languages__group5__app.component.ts.json new file mode 100644 index 0000000..59d0318 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group5__app.component.ts.json @@ -0,0 +1,11 @@ +{ + "path": "tests/more_languages/group5/app.component.ts", + "components": [ + "export class AppComponent", + " constructor(\n private http: HttpClient,\n private loginService: LoginService,\n private stripeService: StripeService\n )", + " constructor(private loginService: LoginService)", + " checkSession()", + " async goToEvent(event_id: string)", + " valInvitedBy(event: any, event_id: string)" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group5__app.module.ts.json 
b/tests/golden/legacy/components/tests__more_languages__group5__app.module.ts.json new file mode 100644 index 0000000..1395c5b --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group5__app.module.ts.json @@ -0,0 +1,7 @@ +{ + "path": "tests/more_languages/group5/app.module.ts", + "components": [ + "@NgModule({\n declarations: [\n AppComponent,\n HomeComponent,\n LoginComponent,\n RegisterComponent,\n EventsComponent,\n InvitesComponent,\n RewardsComponent,\n ProfileComponent", + "export class AppModule" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group5__checkbox_test.md.json b/tests/golden/legacy/components/tests__more_languages__group5__checkbox_test.md.json new file mode 100644 index 0000000..b8bd60f --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group5__checkbox_test.md.json @@ -0,0 +1,23 @@ +{ + "path": "tests/more_languages/group5/checkbox_test.md", + "components": [ + "# My Checkbox Test", + "## My No Parens Test", + "## My Empty href Test", + "## My other url Test [Q&A]", + "## My other other url Test [Q&A]", + "## My 2nd other url Test [Q&A]", + "## My 3rd other url Test [Q&A]", + "- [ ] Task 1", + " - [ ] No Space Task 1.1", + " - [ ] Two Spaces Task 1.2", + " - [ ] Subtask 1.2.1", + "- [ ] Task 2", + "- [x] Task 3", + " - [ ] Subtask 3.1", + "- [x] Task 6", + " - [x] Subtask 6.1", + " - [ ] Handle edge cases", + "# My Codeblock Test" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group5__checkbox_test.txt.json b/tests/golden/legacy/components/tests__more_languages__group5__checkbox_test.txt.json new file mode 100644 index 0000000..6ac8e8f --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group5__checkbox_test.txt.json @@ -0,0 +1,13 @@ +{ + "path": "tests/more_languages/group5/checkbox_test.txt", + "components": [ + "- [ ] fix phone number format +1", + "- [ ] add forgot 
password", + "- [ ] ? add email verification", + "- [ ] store token the right way", + "- [ ] test nesting of checkboxes", + "- [ ] user can use option to buy ticket at 2-referred price", + "- [ ] CTA refer 2 people to get instant lower price", + "- [ ] form to send referrals" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group5__environment.test.ts.json b/tests/golden/legacy/components/tests__more_languages__group5__environment.test.ts.json new file mode 100644 index 0000000..aee223d --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group5__environment.test.ts.json @@ -0,0 +1,10 @@ +{ + "path": "tests/more_languages/group5/environment.test.ts", + "components": [ + "environment:", + " production", + " cognitoUserPoolId", + " cognitoAppClientId", + " apiurl" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group5__hello_world.pyi.json b/tests/golden/legacy/components/tests__more_languages__group5__hello_world.pyi.json new file mode 100644 index 0000000..95f66fa --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group5__hello_world.pyi.json @@ -0,0 +1,7 @@ +{ + "path": "tests/more_languages/group5/hello_world.pyi", + "components": [ + "@final\nclass dtype(Generic[_DTypeScalar_co])", + " names: None | tuple[builtins.str, ...]" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group5__k8s_test.yaml.json b/tests/golden/legacy/components/tests__more_languages__group5__k8s_test.yaml.json new file mode 100644 index 0000000..e915f13 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group5__k8s_test.yaml.json @@ -0,0 +1,8 @@ +{ + "path": "tests/more_languages/group5/k8s_test.yaml", + "components": [ + "apps/v1.Deployment -> my-app", + "v1.Service -> my-service", + "v1.ConfigMap -> my-config" + ] +} \ No newline at end of file diff --git 
a/tests/golden/legacy/components/tests__more_languages__group5__requirements_test.txt.json b/tests/golden/legacy/components/tests__more_languages__group5__requirements_test.txt.json new file mode 100644 index 0000000..4074c24 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group5__requirements_test.txt.json @@ -0,0 +1,14 @@ +{ + "path": "tests/more_languages/group5/requirements_test.txt", + "components": [ + "psycopg2-binary", + "pytest", + "coverage", + "flask[async]", + "flask_cors", + "stripe", + "pyjwt[crypto]", + "cognitojwt[async]", + "flask-lambda" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group5__rust_todo_test.rs.json b/tests/golden/legacy/components/tests__more_languages__group5__rust_todo_test.rs.json new file mode 100644 index 0000000..c31d09e --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group5__rust_todo_test.rs.json @@ -0,0 +1,13 @@ +{ + "path": "tests/more_languages/group5/rust_todo_test.rs", + "components": [ + "TODO: This todo tests parse_todo", + "enum Color {\n Red,\n Blue,\n Green,\n}", + "struct Point", + "trait Drawable", + " fn draw(&self)", + "impl Drawable for Point", + " fn draw(&self)", + "fn main()" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group5__sql_test.sql.json b/tests/golden/legacy/components/tests__more_languages__group5__sql_test.sql.json new file mode 100644 index 0000000..20c37cd --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group5__sql_test.sql.json @@ -0,0 +1,27 @@ +{ + "path": "tests/more_languages/group5/sql_test.sql", + "components": [ + "CREATE TABLE promoters", + " user_id serial PRIMARY KEY,", + " type varchar(20) NOT NULL,", + " username varchar(20) NOT NULL,", + " password varchar(20) NOT NULL,", + " email varchar(30) NOT NULL,", + " phone varchar(20) NOT NULL,", + " promocode varchar(20),", + " info json,", + " going 
text[],", + " invites text[],", + " balance integer NOT NULL,", + " rewards text[],", + " created timestamp", + "CREATE TABLE events", + " event_id serial PRIMARY KEY,", + " name varchar(64) NOT NULL,", + " date varchar(64) NOT NULL,", + " location varchar(64) NOT NULL,", + " performer varchar(64) NOT NULL,", + " rewards json,", + " created timestamp" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group5__standard-app-routing.module.ts.json b/tests/golden/legacy/components/tests__more_languages__group5__standard-app-routing.module.ts.json new file mode 100644 index 0000000..9787404 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group5__standard-app-routing.module.ts.json @@ -0,0 +1,6 @@ +{ + "path": "tests/more_languages/group5/standard-app-routing.module.ts", + "components": [ + "const routes: Routes = [\n { path: '', component: HomeComponent },\n {\n path: 'heroes',\n component: HeroesListComponent,\n children: [\n { path: ':id', component: HeroDetailComponent },\n { path: 'new', component: HeroFormComponent },\n ],\n },\n { path: '**', component: PageNotFoundComponent },\n];" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group5__test.env.json b/tests/golden/legacy/components/tests__more_languages__group5__test.env.json new file mode 100644 index 0000000..76e2d01 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group5__test.env.json @@ -0,0 +1,25 @@ +{ + "path": "tests/more_languages/group5/test.env", + "components": [ + "PROMO_PATH", + "PRODUCTION", + "SQL_SCHEMA_PATH", + "DB_LOGS", + "DB_LOG", + "PGPASSWORD", + "PGDATABASE", + "PGHOST", + "PGPORT", + "PGUSER", + "SERVER_LOG", + "SERVER_LOGS", + "API_URL", + "APP_LOGS", + "APP_LOG", + "APP_URL", + "COGNITO_USER_POOL_ID", + "COGNITO_APP_CLIENT_ID", + "AWS_REGION", + "STRIPE_SECRET_KEY" + ] +} \ No newline at end of file diff --git 
a/tests/golden/legacy/components/tests__more_languages__group5__testJsonSchema.json.json b/tests/golden/legacy/components/tests__more_languages__group5__testJsonSchema.json.json new file mode 100644 index 0000000..caa85e4 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group5__testJsonSchema.json.json @@ -0,0 +1,9 @@ +{ + "path": "tests/more_languages/group5/testJsonSchema.json", + "components": [ + "$schema: http://json-schema.org/draft-07/schema#", + "type: object", + "title: random_test", + "description: A promoter's activites related to events" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group5__testPackage.json.json b/tests/golden/legacy/components/tests__more_languages__group5__testPackage.json.json new file mode 100644 index 0000000..7e17045 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group5__testPackage.json.json @@ -0,0 +1,13 @@ +{ + "path": "tests/more_languages/group5/testPackage.json", + "components": [ + "name: 'promo-app'", + "version: 0.0.0", + "scripts:", + " ng: 'ng'", + " start: 'ng serve'", + " build: 'ng build'", + " watch: 'ng build --watch --configuration development'", + " test: 'ng test'" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group5__tickets.component.ts.json b/tests/golden/legacy/components/tests__more_languages__group5__tickets.component.ts.json new file mode 100644 index 0000000..cda6526 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group5__tickets.component.ts.json @@ -0,0 +1,54 @@ +{ + "path": "tests/more_languages/group5/tickets.component.ts", + "components": [ + "interface EnrichedTicket extends Ticket", + "interface SpinConfig", + "interface RotationState", + "interface SpeakInput", + "const formatSpeakInput = (input: SpeakInput): string =>", + "function hourToSpeech(hour: number, minute: number, period: string): string", + "export 
class TicketsComponent implements AfterViewInit", + " speak(input: SpeakInput)", + " speakEvent(ticket: EnrichedTicket): void", + " formatEvent(ticket: EnrichedTicket): string", + " speakVenue(ticket: EnrichedTicket): void", + " formatDate(date: Date, oneLiner: boolean = false): string", + " formatDateForSpeech(date: Date): string", + " async spinQRCode(\n event: PointerEvent,\n config: SpinConfig = DEFAULT_SPIN_CONFIG\n )", + " private animateRotation(\n imgElement: HTMLElement,\n targetRotation: number,\n config: SpinConfig,\n cleanup: () => void\n )", + " const animate = (currentTime: number) =>", + " requestAnimationFrame(animate)", + " cleanup()", + " requestAnimationFrame(animate)", + " private getNext90Degree(currentRotation: number): number", + " private getCurrentRotation(matrix: string): number", + " ngAfterViewInit()", + " const mouseEnterListener = () =>", + " const mouseLeaveListener = () =>", + " ngOnDestroy()", + " toggleColumn(event: MatOptionSelectionChange, column: string)", + " adjustColumns(event?: Event)", + " onResize(event: Event)", + " async ngOnInit()", + " async loadTickets(): Promise", + " onDateRangeChange(\n type: \"start\" | \"end\",\n event: MatDatepickerInputEvent\n )", + " applyFilter(column: string): void", + " formatDateForComparison(date: Date): string", + " constructor(private renderer: Renderer2)", + " onFilterChange(event: Event, column: string)", + " onLatitudeChange(event: Event)", + " onLongitudeChange(event: Event)", + " onRadiusChange(event: Event)", + " sortData(sort: Sort): void", + " onRowClick(event: Event, row: any)", + "function isDate(value: Date | undefined | null): value is Date", + "function isNonNullNumber(value: number | null): value is number", + "function hasLocation(\n ticket: any\n): ticket is", + "const create_faker_ticket = async () =>", + "function compare(a: number | string, b: number | string, isAsc: boolean)", + "function compare_dates(a: Date, b: Date, isAsc: boolean)", + "async function 
mockMoreTickets(): Promise", + "const mockTickets = async () =>", + "const renderQRCode = async (text: String): Promise =>" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group6__Microsoft.PowerShell_profile.ps1.json b/tests/golden/legacy/components/tests__more_languages__group6__Microsoft.PowerShell_profile.ps1.json new file mode 100644 index 0000000..995b401 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group6__Microsoft.PowerShell_profile.ps1.json @@ -0,0 +1,35 @@ +{ + "path": "tests/more_languages/group6/Microsoft.PowerShell_profile.ps1", + "components": [ + "function Log($message)", + "function Remove-ChocolateyFromPath", + "function Show-Profiles", + "function Show-Path", + "function Show-Error($err)", + "function Get-ScoopPackagePath", + "\tparam(\n\t\t[Parameter(Mandatory = $true)]\n\t\t[string]$PackageName)", + "function Check-Command", + "\tparam(\n\t\t[Parameter(Mandatory = $true)]\n\t\t[string]$Name)", + "function Add-ToPath", + "\tparam(\n\t\t[Parameter(Mandatory = $true)]\n\t\t[string]$PathToAdd)", + "function Install-Scoop", + "function Scoop-Install", + "\tparam(\n\t\t[Parameter(Mandatory = $true)]\n\t\t[string]$Name,\n\t\t[string]$PathToAdd)", + "function Start-CondaEnv", + "function Install-PipPackage", + "\tparam(\n [Parameter(Mandatory = $true)]\n\t\t[string]$PackageName)", + "function Install-VSBuildTools", + "function Install-Crate", + "\tparam(\n [Parameter(Mandatory = $true)]\n\t\t[string]$CrateName)", + "function Get-ScoopVersion", + "function Get-Version", + " param(\n [Parameter(Mandatory = $true)]\n [string]$ExecutablePath,\n [string]$ExecutableName)", + "function Show-Requirements", + "\tfunction Measure-Status", + "\t\tparam(\n\t\t\t[Parameter(Mandatory = $true)]\n\t\t\t[string]$Name)", + "function Find-Profile", + "function Edit-Profile", + "function Set-Profile", + "function Show-Profile" + ] +} \ No newline at end of file diff --git 
a/tests/golden/legacy/components/tests__more_languages__group6__catastrophic.c.json b/tests/golden/legacy/components/tests__more_languages__group6__catastrophic.c.json new file mode 100644 index 0000000..a9305a5 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group6__catastrophic.c.json @@ -0,0 +1,196 @@ +{ + "path": "tests/more_languages/group6/catastrophic.c", + "components": [ + "TODO: technically we should use a proper parser", + "struct Point", + " int x;", + " int y;", + "struct Point getOrigin()", + "float mul_two_floats(float x1, float x2)", + "enum days", + " SUN,", + " MON,", + " TUE,", + " WED,", + " THU,", + " FRI,", + " SAT", + "enum worker_pool_flags", + " POOL_BH = 1 << 0,", + " POOL_MANAGER_ACTIVE = 1 << 1,", + " POOL_DISASSOCIATED = 1 << 2,", + " POOL_BH_DRAINING = 1 << 3,", + "enum worker_flags", + " WORKER_DIE = 1 << 1,", + " WORKER_IDLE = 1 << 2,", + " WORKER_PREP = 1 << 3,", + " WORKER_CPU_INTENSIVE = 1 << 6,", + " WORKER_UNBOUND = 1 << 7,", + " WORKER_REBOUND = 1 << 8,", + " WORKER_NOT_RUNNING = WORKER_PREP | WORKER_CPU_INTENSIVE |\n WORKER_UNBOUND | WORKER_REBOUND,", + "struct worker_pool", + " raw_spinlock_t lock;", + " int cpu;", + " int node;", + " int id;", + " unsigned int flags;", + " unsigned long watchdog_ts;", + " bool cpu_stall;", + " int nr_running;", + " struct list_head worklist;", + " int nr_workers;", + " int nr_idle;", + " struct list_head idle_list;", + " struct timer_list idle_timer;", + " struct work_struct idle_cull_work;", + " struct timer_list mayday_timer;", + " struct worker *manager;", + " struct list_head workers;", + " struct ida worker_ida;", + " struct workqueue_attrs *attrs;", + " struct hlist_node hash_node;", + " int refcnt;", + " struct rcu_head rcu;", + "long add_two_longs(long x1, long x2)", + "double multiplyByTwo(double num)", + "char getFirstCharacter(char *str)", + "void greet(Person p)", + "typedef struct", + " char name[50];", + "} Person;", + "typedef struct PersonA", + " 
char name[50];", + "} PersonB;", + "int main()", + "int* getArrayStart(int arr[], int size)", + "long complexFunctionWithMultipleArguments(\n int param1,\n double param2,\n char *param3,\n struct Point point\n)", + "keyPattern *ACLKeyPatternCreate(sds pattern, int flags)", + "sds sdsCatPatternString(sds base, keyPattern *pat)", + "static int ACLCheckChannelAgainstList(list *reference, const char *channel, int channellen, int is_pattern)", + " while((ln = listNext(&li)))", + "static struct config", + " aeEventLoop *el;", + " cliConnInfo conn_info;", + " const char *hostsocket;", + " int tls;", + " struct cliSSLconfig sslconfig;", + "} config;", + "class Person", + " std::string name;", + "public:", + " Person(std::string n) : name(n)", + " void greet()", + "void globalGreet()", + "int main()", + "void printMessage(const std::string &message)", + "template\nvoid printVector(const std::vector& vec)", + "struct foo", + " char x;", + " struct foo_in", + " char* y;", + " short z;", + " } inner;", + "struct Point", + " int x, y;", + " Point(int x, int y) : x(x), y(y)", + "class Animal", + " public:", + " Animal(const std::string &name) : name(name)", + " virtual void speak() const", + " virtual ~Animal()", + "protected:", + " std::string name;", + "class Dog : public Animal", + " public:", + " Dog(const std::string &name) : Animal(name)", + " void speak() const override", + "class Cat : public Animal", + " public:", + " Cat(const std::string &name) : Animal(name)", + " void speak() const override", + "class CatDog: public Animal, public Cat, public Dog", + " public:", + " CatDog(const std::string &name) : Animal(name)", + " int meow_bark()", + "nb::bytes BuildRnnDescriptor(int input_size, int hidden_size, int num_layers,\n int batch_size, int max_seq_length, float dropout,\n bool bidirectional, bool cudnn_allow_tf32,\n\t\t\t int workspace_size, int reserve_space_size)", + "int main()", + "enum ECarTypes", + " Sedan,", + " Hatchback,", + " SUV,", + " Wagon", + "ECarTypes 
GetPreferredCarType()", + "enum ECarTypes : uint8_t", + " Sedan,", + " Hatchback,", + " SUV = 254,", + " Hybrid", + "enum class ECarTypes : uint8_t", + " Sedan,", + " Hatchback,", + " SUV = 254,", + " Hybrid", + "void myFunction(string fname, int age)", + "template T cos(T)", + "template T sin(T)", + "template T sqrt(T)", + "template struct VLEN", + "template class arr", + " private:", + " static T *ralloc(size_t num)", + " static void dealloc(T *ptr)", + " static T *ralloc(size_t num)", + " static void dealloc(T *ptr)", + " public:", + " arr() : p(0), sz(0)", + " arr(size_t n) : p(ralloc(n)), sz(n)", + " arr(arr &&other)\n : p(other.p), sz(other.sz)", + " ~arr()", + " void resize(size_t n)", + " T &operator[](size_t idx)", + " T *data()", + " size_t size() const", + "class Buffer", + " private:", + " void* ptr_;", + "std::tuple quantize(\n const array& w,\n int group_size,\n int bits,\n StreamOrDevice s)", + "#define PY_SSIZE_T_CLEAN", + "#define PLATFORM_IS_X86", + "#define PLATFORM_WINDOWS", + "#define GETCPUID(a, b, c, d, a_inp, c_inp)", + "static int GetXCR0EAX()", + "#define GETCPUID(a, b, c, d, a_inp, c_inp)", + "static int GetXCR0EAX()", + " asm(\"XGETBV\" : \"=a\"(eax), \"=d\"(edx) : \"c\"(0))", + "static void ReportMissingCpuFeature(const char* name)", + "static PyObject *CheckCpuFeatures(PyObject *self, PyObject *args)", + "static PyObject *CheckCpuFeatures(PyObject *self, PyObject *args)", + "static PyMethodDef cpu_feature_guard_methods[]", + "static struct PyModuleDef cpu_feature_guard_module", + "#define EXPORT_SYMBOL __declspec(dllexport)", + "#define EXPORT_SYMBOL __attribute__ ((visibility(\"default\")))", + "EXPORT_SYMBOL PyMODINIT_FUNC PyInit_cpu_feature_guard(void)", + "typedef struct", + " GPT2Config config;", + " ParameterTensors params;", + " size_t param_sizes[NUM_PARAMETER_TENSORS];", + " float* params_memory;", + " size_t num_parameters;", + " ParameterTensors grads;", + " float* grads_memory;", + " float* m_memory;", + " float* 
v_memory;", + " ActivationTensors acts;", + " size_t act_sizes[NUM_ACTIVATION_TENSORS];", + " float* acts_memory;", + " size_t num_activations;", + " ActivationTensors grads_acts;", + " float* grads_acts_memory;", + " int batch_size;", + " int seq_len;", + " int* inputs;", + " int* targets;", + " float mean_loss;", + "} GPT2;" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group6__cpp_examples_impl.cc.json b/tests/golden/legacy/components/tests__more_languages__group6__cpp_examples_impl.cc.json new file mode 100644 index 0000000..e1c7a7f --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group6__cpp_examples_impl.cc.json @@ -0,0 +1,7 @@ +{ + "path": "tests/more_languages/group6/cpp_examples_impl.cc", + "components": [ + "PYBIND11_MODULE(cpp_examples, m)", + " m.def(\"add\", &add, \"An example function to add two numbers.\")" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group6__cpp_examples_impl.cu.json b/tests/golden/legacy/components/tests__more_languages__group6__cpp_examples_impl.cu.json new file mode 100644 index 0000000..9c58fdb --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group6__cpp_examples_impl.cu.json @@ -0,0 +1,7 @@ +{ + "path": "tests/more_languages/group6/cpp_examples_impl.cu", + "components": [ + "template \nT add(T a, T b)", + "template <>\nint add(int a, int b)" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group6__cpp_examples_impl.h.json b/tests/golden/legacy/components/tests__more_languages__group6__cpp_examples_impl.h.json new file mode 100644 index 0000000..49c3962 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group6__cpp_examples_impl.h.json @@ -0,0 +1,7 @@ +{ + "path": "tests/more_languages/group6/cpp_examples_impl.h", + "components": [ + "template \nT add(T a, T b)", + "template <>\nint add(int, 
int)" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group6__edge_case.hpp.json b/tests/golden/legacy/components/tests__more_languages__group6__edge_case.hpp.json new file mode 100644 index 0000000..077ea00 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group6__edge_case.hpp.json @@ -0,0 +1,4 @@ +{ + "path": "tests/more_languages/group6/edge_case.hpp", + "components": [] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group6__fractal.thy.json b/tests/golden/legacy/components/tests__more_languages__group6__fractal.thy.json new file mode 100644 index 0000000..9b9a4e9 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group6__fractal.thy.json @@ -0,0 +1,30 @@ +{ + "path": "tests/more_languages/group6/fractal.thy", + "components": [ + "Title: fractal.thy", + "Author: Isabelle/HOL Contributors!", + "Author: edge cases r us", + "theory Simplified_Ring", + "section ‹Basic Algebraic Structures›", + "class everything = nothing + itself", + "subsection ‹Monoids›", + "definition ring_hom :: \"[('a, 'm) ring_scheme, ('b, 'n) ring_scheme] => ('a => 'b) set\"", + "fun example_fun :: \"nat ⇒ nat\"", + "locale monoid =\n fixes G (structure)\n assumes m_closed: \"⟦x ∈ carrier G; y ∈ carrier G⟧ ⟹ x ⊗ y ∈ carrier G\"\n and m_assoc: \"⟦x ∈ carrier G; y ∈ carrier G; z ∈ carrier G⟧ ⟹ (x ⊗ y) ⊗ z = x ⊗ (y ⊗ z)\"\n and one_closed: \"𝟭 ∈ carrier G\"\n and l_one: \"x ∈ carrier G ⟹ 𝟭 ⊗ x = x\"\n and r_one: \"x ∈ carrier G ⟹ x ⊗ 𝟭 = x\"", + "subsection ‹Groups›", + "locale group = monoid +\n assumes Units_closed: \"x ∈ Units G ⟹ x ∈ carrier G\"\n and l_inv_ex: \"x ∈ carrier G ⟹ ∃ y ∈ carrier G. y ⊗ x = 𝟭\"\n and r_inv_ex: \"x ∈ carrier G ⟹ ∃ y ∈ carrier G. 
x ⊗ y = 𝟭\"", + "subsection ‹Rings›", + "locale ring = abelian_group R + monoid R +\n assumes l_distr: \"⟦x ∈ carrier R; y ∈ carrier R; z ∈ carrier R⟧ ⟹ (x ⊕ y) ⊗ z = x ⊗ z ⊕ y ⊗ z\"\n and r_distr: \"⟦x ∈ carrier R; y ∈ carrier R; z ∈ carrier R⟧ ⟹ z ⊗ (x ⊕ y) = z ⊗ x ⊕ z ⊗ y\"", + "locale commutative_ring = ring +\n assumes m_commutative: \"⟦x ∈ carrier R; y ∈ carrier R⟧ ⟹ x ⊗ y = y ⊗ x\"", + "locale domain = commutative_ring +\n assumes no_zero_divisors: \"⟦a ⊗ b = 𝟬; a ∈ carrier R; b ∈ carrier R⟧ ⟹ a = 𝟬 ∨ b = 𝟬\"", + "locale field = domain +\n assumes inv_ex: \"x ∈ carrier R - {𝟬} ⟹ inv x ∈ carrier R\"", + "subsection ‹Morphisms›", + "lemma example_lemma: \"example_fun n = n\"", + "qualified lemma gcd_0:\n \"gcd a 0 = normalize a\"", + "lemma abelian_monoidI:\n fixes R (structure)\n and f :: \"'edge::{} ⇒ 'case::{}\"\n assumes \"⋀x y. ⟦ x ∈ carrier R; y ∈ carrier R ⟧ ⟹ x ⊕ y ∈ carrier R\"\n and \"𝟬 ∈ carrier R\"\n and \"⋀x y z. ⟦ x ∈ carrier R; y ∈ carrier R; z ∈ carrier R ⟧ ⟹ (x ⊕ y) ⊕ z = x ⊕ (y ⊕ z)\"\n shows \"abelian_monoid R\"", + "lemma euclidean_size_gcd_le1 [simp]:\n assumes \"a ≠ 0\"\n shows \"euclidean_size (gcd a b) ≤ euclidean_size a\"", + "theorem Residue_theorem:\n fixes S pts::\"complex set\" and f::\"complex ⇒ complex\"\n and g::\"real ⇒ complex\"\n assumes \"open S\" \"connected S\" \"finite pts\" and\n holo:\"f holomorphic_on S-pts\" and\n \"valid_path g\" and\n loop:\"pathfinish g = pathstart g\" and\n \"path_image g ⊆ S-pts\" and\n homo:\"∀z. (z ∉ S) ⟶ winding_number g z = 0\"\n shows \"contour_integral g f = 2 * pi * 𝗂 *(∑p ∈ pts. winding_number g p * residue f p)\"", + "corollary fps_coeff_residues_bigo':\n fixes f :: \"complex ⇒ complex\" and r :: real\n assumes exp: \"f has_fps_expansion F\"\n assumes \"open A\" \"connected A\" \"cball 0 r ⊆ A\" \"r > 0\" \n assumes \"f holomorphic_on A - S\" \"S ⊆ ball 0 r\" \"finite S\" \"0 ∉ S\"\n assumes \"eventually (λn. g n = -(∑z ∈ S. residue (λz. 
f z / z ^ Suc n) z)) sequentially\"\n (is \"eventually (λn. _ = -?g' n) _\")\n shows \"(λn. fps_nth F n - g n) ∈ O(λn. 1 / r ^ n)\" (is \"(λn. ?c n - _) ∈ O(_)\")", + "end" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group6__python_complex_class.py.json b/tests/golden/legacy/components/tests__more_languages__group6__python_complex_class.py.json new file mode 100644 index 0000000..50da2b1 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group6__python_complex_class.py.json @@ -0,0 +1,6 @@ +{ + "path": "tests/more_languages/group6/python_complex_class.py", + "components": [ + "class Box(Space[NDArray[Any]])" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group6__ramda__cloneRegExp.js.json b/tests/golden/legacy/components/tests__more_languages__group6__ramda__cloneRegExp.js.json new file mode 100644 index 0000000..b5ece94 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group6__ramda__cloneRegExp.js.json @@ -0,0 +1,6 @@ +{ + "path": "tests/more_languages/group6/ramda__cloneRegExp.js", + "components": [ + "export default function _cloneRegExp(pattern)" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group6__ramda_prop.js.json b/tests/golden/legacy/components/tests__more_languages__group6__ramda_prop.js.json new file mode 100644 index 0000000..3dffe56 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group6__ramda_prop.js.json @@ -0,0 +1,9 @@ +{ + "path": "tests/more_languages/group6/ramda_prop.js", + "components": [ + "/**\n * Returns a function that when supplied an object returns the indicated\n * property of that object, if it exists.\n * @category Object\n * @typedefn Idx = String | Int | Symbol\n * @sig Idx -> {s: a} -> a | Undefined\n * @param {String|Number} p The property name or array index\n * @param {Object} obj The 
object to query\n * @return {*} The value at `obj.p`.\n */\nvar prop = _curry2(function prop(p, obj)", + "/**\n * Solves equations of the form a * x = b\n * @param {{\n * z: number\n * }} x\n */\nfunction foo(x)", + "/**\n * Deconstructs an array field from the input documents to output a document for each element.\n * Each output document is the input document with the value of the array field replaced by the element.\n * @category Object\n * @sig String -> {k: [v]} -> [{k: v}]\n * @param {String} key The key to determine which property of the object should be unwound.\n * @param {Object} object The object containing the list to unwind at the property named by the key.\n * @return {List} A list of new objects, each having the given key associated to an item from the unwound list.\n */\nvar unwind = _curry2(function(key, object)", + " return _map(function(item)" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group6__tensorflow_flags.h.json b/tests/golden/legacy/components/tests__more_languages__group6__tensorflow_flags.h.json new file mode 100644 index 0000000..3e56f70 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group6__tensorflow_flags.h.json @@ -0,0 +1,102 @@ +{ + "path": "tests/more_languages/group6/tensorflow_flags.h", + "components": [ + "TF_DECLARE_FLAG('test_only_experiment_1')", + "TF_DECLARE_FLAG('test_only_experiment_2')", + "TF_DECLARE_FLAG('enable_nested_function_shape_inference'):\n\tAllow ops such as tf.cond to invoke the ShapeRefiner on their nested functions.", + "TF_DECLARE_FLAG('enable_quantized_dtypes_training'):\n\tSet quantized dtypes, like tf.qint8, to be trainable.", + "TF_DECLARE_FLAG('graph_building_optimization'):\n\tOptimize graph building for faster tf.function tracing.", + "TF_DECLARE_FLAG('saved_model_fingerprinting'):\n\tAdd fingerprint to SavedModels.", + "TF_DECLARE_FLAG('more_stack_traces'):\n\tEnable experimental code that preserves and propagates 
graph node stack traces in C++.", + "TF_DECLARE_FLAG('publish_function_graphs'):\n\tEnables the publication of partitioned function graphs via StatsPublisherInterface. Disabling this flag can reduce memory consumption.", + "TF_DECLARE_FLAG('enable_aggressive_constant_replication'):\n\tReplicate constants across CPU devices and even for local CPUs within the same task if available.", + "TF_DECLARE_FLAG('enable_colocation_key_propagation_in_while_op_lowering'):\n\tIf true, colocation key attributes for the ops will be propagated during while op lowering to switch/merge ops.", + "Flag('tf_xla_auto_jit'):\n\tControl compilation of operators into XLA computations on CPU and GPU devices. 0 = use ConfigProto setting; -1 = off; 1 = on for things very likely to be improved; 2 = on for everything; (experimental) fusible = only for Tensorflow operations that XLA knows how to fuse. If set to single-gpu() then this resolves to for single-GPU graphs (graphs that have at least one node placed on a GPU and no more than one GPU is in use through the entire graph) and 0 otherwise. Experimental.", + "Flag('tf_xla_min_cluster_size'):\n\tMinimum number of operators in an XLA compilation. Ignored for operators placed on an XLA device or operators explicitly marked for compilation.", + "Flag('tf_xla_max_cluster_size'):\n\tMaximum number of operators in an XLA compilation.", + "Flag('tf_xla_cluster_exclude_ops'):\n\t(experimental) Exclude the operations from auto-clustering. If multiple, separate them with commas. 
Where, Some_other_ops.", + "Flag('tf_xla_clustering_debug'):\n\tDump graphs during XLA compilation.", + "Flag('tf_xla_cpu_global_jit'):\n\tEnables global JIT compilation for CPU via SessionOptions.", + "Flag('tf_xla_clustering_fuel'):\n\tPlaces an artificial limit on the number of ops marked as eligible for clustering.", + "Flag('tf_xla_disable_deadness_safety_checks_for_debugging'):\n\tDisable deadness related safety checks when clustering (this is unsound).", + "Flag('tf_xla_disable_resource_variable_safety_checks_for_debugging'):\n\tDisable resource variables related safety checks when clustering (this is unsound).", + "Flag('tf_xla_deterministic_cluster_names'):\n\tCauses the function names assigned by auto clustering to be deterministic from run to run.", + "Flag('tf_xla_persistent_cache_directory'):\n\tIf non-empty, JIT-compiled executables are saved to and loaded from the specified file system directory path. Empty by default.", + "Flag('tf_xla_persistent_cache_device_types'):\n\tIf non-empty, the persistent cache will only be used for the specified devices (comma separated). Each device type should be able to be converted to.", + "Flag('tf_xla_persistent_cache_read_only'):\n\tIf true, the persistent cache will be read-only.", + "Flag('tf_xla_disable_strict_signature_checks'):\n\tIf true, entires loaded into the XLA compile cache will not have their signatures checked strictly. Defaults to false.", + "Flag('tf_xla_persistent_cache_prefix'):\n\tSpecifies the persistance cache prefix. 
Default is.", + "Flag('tf_xla_sparse_core_disable_table_stacking'):\n\tDisable table stacking for all the tables passed to the SparseCore mid level API.", + "Flag('tf_xla_sparse_core_minibatch_max_division_level'):\n\tMax level of division to split input data into minibatches.", + "Flag('tf_xla_sparse_core_stacking_mem_limit_bytes'):\n\tIf non-zero, limits the size of the activations for a given table to be below these many bytes.", + "Flag('tf_xla_sparse_core_stacking_table_shard_limit_bytes'):\n\tIf non-zero, limits the size of any table shard to be below these many bytes.", + "Flag('always_specialize')", + "Flag('cost_driven_async_parallel_for')", + "Flag('enable_crash_reproducer')", + "Flag('log_query_of_death')", + "Flag('vectorize')", + "Flag('tf_xla_enable_lazy_compilation')", + "Flag('tf_xla_print_cluster_outputs'):\n\tIf true then insert Print nodes to print out values produced by XLA clusters.", + "Flag('tf_xla_check_cluster_input_numerics'):\n\tIf true then insert CheckNumerics nodes to check all cluster inputs.", + "Flag('tf_xla_check_cluster_output_numerics'):\n\tIf true then insert CheckNumerics nodes to check all cluster outputs.", + "Flag('tf_xla_disable_constant_folding'):\n\tIf true then disables constant folding on TF graph before XLA compilation.", + "Flag('tf_xla_disable_full_embedding_pipelining'):\n\tIf true then disables full embedding pipelining and instead use strict SparseCore / TensorCore sequencing.", + "Flag('tf_xla_embedding_parallel_iterations'):\n\tIf >0 then use this many parallel iterations in embedding_pipelining and embedding_sequency. By default, use the parallel_iterations on the original model WhileOp.", + "Flag('tf_xla_compile_on_demand'):\n\tSwitch a device into 'on-demand' mode, where instead of autoclustering ops are compiled one by one just-in-time.", + "Flag('tf_xla_enable_xla_devices'):\n\tGenerate XLA_* devices, where placing a computation on such a device forces compilation by XLA. 
Deprecated.", + "Flag('tf_xla_always_defer_compilation')", + "Flag('tf_xla_async_compilation'):\n\tWhen lazy compilation is enabled, asynchronous compilation starts the cluster compilation in the background, and the fallback path is executed until the compilation has finished.", + "Flag('tf_xla_use_device_api_for_xla_launch'):\n\tIf true, uses Device API (PjRt) for single device compilation and execution of functions marked for JIT compilation i.e. jit_compile=True. Defaults to false.", + "Flag('tf_xla_use_device_api_for_compile_on_demand'):\n\tIf true, uses Device API (PjRt) for compiling and executing ops one by one in 'on-demand' mode. Defaults to false.", + "Flag('tf_xla_use_device_api_for_auto_jit'):\n\tIf true, uses Device API (PjRt) for compilation and execution when auto-clustering is enabled. Defaults to false.", + "Flag('tf_xla_use_device_api'):\n\tIf true, uses Device API (PjRt) for compilation and execution of ops one-by-one in 'on-demand' mode, for functions marked for JIT compilation, or when auto-clustering is enabled. Defaults to false.", + "Flag('tf_xla_enable_device_api_for_gpu'):\n\tIf true, uses Device API (PjRt) for TF GPU device. This is a helper flag so that individual tests can turn on PjRt for GPU specifically.", + "Flag('tf_xla_call_module_disabled_checks'):\n\tA comma-sepated list of directives specifying the safety checks to be skipped when compiling XlaCallModuleOp. 
See the op documentation for the recognized values.", + "Flag('tf_mlir_enable_mlir_bridge'):\n\tEnables experimental MLIR-Based TensorFlow Compiler Bridge.", + "Flag('tf_mlir_enable_merge_control_flow_pass'):\n\tEnables MergeControlFlow pass for MLIR-Based TensorFlow Compiler Bridge.", + "Flag('tf_mlir_enable_convert_control_to_data_outputs_pass'):\n\tEnables MLIR-Based TensorFlow Compiler Bridge.", + "Flag('tf_mlir_enable_strict_clusters'):\n\tDo not allow clusters that have cyclic control dependencies.", + "Flag('tf_mlir_enable_multiple_local_cpu_devices'):\n\tEnable multiple local CPU devices. CPU ops which are outside compiled inside the tpu cluster will also be replicated across multiple cpu devices.", + "Flag('tf_dump_graphs_in_tfg'):\n\tWhen tf_dump_graphs_in_tfg is true, graphs after transformations are dumped in MLIR TFG dialect and not in GraphDef.", + "Flag('tf_mlir_enable_generic_outside_compilation'):\n\tEnables OutsideCompilation passes for MLIR-Based TensorFlow Generic Compiler Bridge.", + "Flag('tf_mlir_enable_tpu_variable_runtime_reformatting_pass'):\n\tEnables TPUVariableRuntimeReformatting pass for MLIR-Based TensorFlow Compiler Bridge. 
This enables weight update sharding and creates TPUReshardVariables ops.", + "TF_PY_DECLARE_FLAG('test_only_experiment_1')", + "TF_PY_DECLARE_FLAG('test_only_experiment_2')", + "TF_PY_DECLARE_FLAG('enable_nested_function_shape_inference')", + "TF_PY_DECLARE_FLAG('enable_quantized_dtypes_training')", + "TF_PY_DECLARE_FLAG('graph_building_optimization')", + "TF_PY_DECLARE_FLAG('op_building_optimization')", + "TF_PY_DECLARE_FLAG('saved_model_fingerprinting')", + "TF_PY_DECLARE_FLAG('tf_shape_default_int64')", + "TF_PY_DECLARE_FLAG('more_stack_traces')", + "TF_PY_DECLARE_FLAG('publish_function_graphs')", + "TF_PY_DECLARE_FLAG('enable_aggressive_constant_replication')", + "TF_PY_DECLARE_FLAG('enable_colocation_key_propagation_in_while_op_lowering')", + "#define TENSORFLOW_CORE_CONFIG_FLAG_DEFS_H_", + "class Flags", + " public:", + "bool SetterForXlaAutoJitFlag(const string& value)", + "bool SetterForXlaCallModuleDisabledChecks(const string& value)", + "void AppendMarkForCompilationPassFlagsInternal(std::vector* flag_list)", + "void AllocateAndParseJitRtFlags()", + "void AllocateAndParseFlags()", + "void ResetFlags()", + "bool SetXlaAutoJitFlagFromFlagString(const string& value)", + "BuildXlaOpsPassFlags* GetBuildXlaOpsPassFlags()", + "MarkForCompilationPassFlags* GetMarkForCompilationPassFlags()", + "XlaSparseCoreFlags* GetXlaSparseCoreFlags()", + "XlaDeviceFlags* GetXlaDeviceFlags()", + "XlaOpsCommonFlags* GetXlaOpsCommonFlags()", + "XlaCallModuleFlags* GetXlaCallModuleFlags()", + "MlirCommonFlags* GetMlirCommonFlags()", + "void ResetJitCompilerFlags()", + "const JitRtFlags& GetJitRtFlags()", + "ConfigProto::Experimental::MlirBridgeRollout GetMlirBridgeRolloutState(\n std::optional config_proto)", + "void AppendMarkForCompilationPassFlags(std::vector* flag_list)", + "void DisableXlaCompilation()", + "void EnableXlaCompilation()", + "bool FailOnXlaCompilation()", + "#define TF_PY_DECLARE_FLAG(flag_name)", + "PYBIND11_MODULE(flags_pybind, m)" + ] +} \ No newline at end 
of file diff --git a/tests/golden/legacy/components/tests__more_languages__group6__test.f.json b/tests/golden/legacy/components/tests__more_languages__group6__test.f.json new file mode 100644 index 0000000..620d4d7 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group6__test.f.json @@ -0,0 +1,11 @@ +{ + "path": "tests/more_languages/group6/test.f", + "components": [ + "MODULE basic_mod", + " TYPE :: person\n CHARACTER(LEN=50) :: name\n INTEGER :: age\n END TYPE person", + " SUBROUTINE short_hello(happy, path)\n END SUBROUTINE short_hello", + " SUBROUTINE long_hello(\n p,\n message\n )\n END SUBROUTINE long_hello", + "END MODULE basic_mod", + "PROGRAM HelloFortran\nEND PROGRAM HelloFortran" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group6__torch.rst.json b/tests/golden/legacy/components/tests__more_languages__group6__torch.rst.json new file mode 100644 index 0000000..e1ddc0d --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group6__torch.rst.json @@ -0,0 +1,7 @@ +{ + "path": "tests/more_languages/group6/torch.rst", + "components": [ + "# libtorch (C++-only)", + "- Building libtorch using Python" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group6__yc.html.json b/tests/golden/legacy/components/tests__more_languages__group6__yc.html.json new file mode 100644 index 0000000..db98f04 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group6__yc.html.json @@ -0,0 +1,4 @@ +{ + "path": "tests/more_languages/group6/yc.html", + "components": [] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group7__absurdly_huge.jsonl.json b/tests/golden/legacy/components/tests__more_languages__group7__absurdly_huge.jsonl.json new file mode 100644 index 0000000..72267e5 --- /dev/null +++ 
b/tests/golden/legacy/components/tests__more_languages__group7__absurdly_huge.jsonl.json @@ -0,0 +1,14 @@ +{ + "path": "tests/more_languages/group7/absurdly_huge.jsonl", + "components": [ + "SMILES: str", + "Yield: float", + "Temperature: int", + "Pressure: float", + "Solvent: str", + "Success: bool", + "Reaction_Conditions: dict", + "Products: list", + "EdgeCasesMissed: None" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group7__angular_crud.ts.json b/tests/golden/legacy/components/tests__more_languages__group7__angular_crud.ts.json new file mode 100644 index 0000000..f0009a9 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group7__angular_crud.ts.json @@ -0,0 +1,18 @@ +{ + "path": "tests/more_languages/group7/angular_crud.ts", + "components": [ + "interface DBCommand", + "export class IndexedDbService", + " constructor()", + " async create_connection({ db_name = 'client_db', table_name }: DBCommand)", + " upgrade(db)", + " async create_model({ db_name, table_name, model }: DBCommand)", + " verify_matching({ table_name, model })", + " async read_key({ db_name, table_name, key }: DBCommand)", + " async update_model({ db_name, table_name, model }: DBCommand)", + " verify_matching({ table_name, model })", + " async delete_key({ db_name, table_name, key }: DBCommand)", + " async list_table({\n db_name,\n table_name,\n where,\n }: DBCommand & { where?: { [key: string]: string | number } })", + " async search_table(criteria: SearchCriteria)" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group7__structure.py.json b/tests/golden/legacy/components/tests__more_languages__group7__structure.py.json new file mode 100644 index 0000000..b5acd6c --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group7__structure.py.json @@ -0,0 +1,35 @@ +{ + "path": "tests/more_languages/group7/structure.py", + "components": [ + 
"@runtime_checkable\nclass DataClass(Protocol)", + " __dataclass_fields__: dict", + "class MyInteger(Enum)", + " ONE = 1", + " TWO = 2", + " THREE = 42", + "class MyString(Enum)", + " AAA1 = \"aaa\"", + " BB_B = \"\"\"edge\ncase\"\"\"", + "@dataclass(frozen=True, slots=True, kw_only=True)\nclass Tool", + " name: str", + " description: str", + " input_model: DataClass", + " output_model: DataClass", + " def execute(self, *args, **kwargs)", + " @property\n def edge_case(self) -> str", + " def should_still_see_me(self, x: bool = True) -> \"Tool\"", + "@dataclass\nclass MyInput[T]", + " name: str", + " rank: MyInteger", + " serial_n: int", + "@dataclass\nclass Thingy", + " is_edge_case: bool", + "@dataclass\nclass MyOutput", + " orders: str", + "class MyTools(Enum)", + " TOOL_A = Tool(\n name=\"complicated\",\n description=\"edge case!\",\n input_model=MyInput[Thingy],\n output_model=MyOutput,\n )", + " TOOL_B = Tool(\n name=\"\"\"super\ncomplicated\n\"\"\",\n description=\"edge case!\",\n input_model=MyInput,\n output_model=MyOutput,\n )", + "@final\nclass dtype(Generic[_DTypeScalar_co])", + " names: None | tuple[builtins.str, ...]" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group7__test.metal.json b/tests/golden/legacy/components/tests__more_languages__group7__test.metal.json new file mode 100644 index 0000000..a4b724d --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group7__test.metal.json @@ -0,0 +1,11 @@ +{ + "path": "tests/more_languages/group7/test.metal", + "components": [ + "struct MyData", + "kernel void myKernel(device MyData* data [[buffer(0)]],\n uint id [[thread_position_in_grid]])", + "float myHelperFunction(float x, float y)", + "vertex float4 vertexShader(const device packed_float3* vertex_array [[buffer(0)]],\n unsigned int vid [[vertex_id]])", + "fragment half4 fragmentShader(float4 P [[position]])", + "float3 computeNormalMap(ColorInOut in, texture2d 
normalMapTexture)" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group7__test.wgsl.json b/tests/golden/legacy/components/tests__more_languages__group7__test.wgsl.json new file mode 100644 index 0000000..3e5ed6e --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group7__test.wgsl.json @@ -0,0 +1,20 @@ +{ + "path": "tests/more_languages/group7/test.wgsl", + "components": [ + "alias MyVec = vec4", + "alias AnotherVec = vec2", + "struct VertexInput", + "struct VertexOutput", + "struct MyUniforms", + "@group(0) @binding(0) var u_mvp: mat4x4", + "@group(0) @binding(1) var u_color: MyVec", + "@group(1) @binding(0) var my_texture: texture_2d", + "@group(1) @binding(1) var my_sampler: sampler", + "@vertex\nfn vs_main(in: VertexInput) -> VertexOutput", + "@fragment\nfn fs_main(in: VertexOutput) -> @location(0) vec4", + "@compute @workgroup_size(8, 8, 1)\nfn cs_main(@builtin(global_invocation_id) global_id: vec3)", + "fn helper_function(val: f32) -> f32", + "struct AnotherStruct", + "@compute\n@workgroup_size(8, 8, 1)\nfn multi_line_edge_case(\n @builtin(global_invocation_id)\n globalId : vec3,\n @group(1)\n @binding(0)\n srcTexture : texture_2d,\n @group(1)\n @binding(1)\n srcSampler : sampler,\n @group(0)\n @binding(0)\n uniformsPtr : ptr,\n storageBuffer : ptr, 64>, read_write>,\n)" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group_lisp__LispTest.lisp.json b/tests/golden/legacy/components/tests__more_languages__group_lisp__LispTest.lisp.json new file mode 100644 index 0000000..d0644e9 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group_lisp__LispTest.lisp.json @@ -0,0 +1,7 @@ +{ + "path": "tests/more_languages/group_lisp/LispTest.lisp", + "components": [ + "defstruct person", + "defun greet" + ] +} \ No newline at end of file diff --git 
a/tests/golden/legacy/components/tests__more_languages__group_lisp__clojure_test.clj.json b/tests/golden/legacy/components/tests__more_languages__group_lisp__clojure_test.clj.json new file mode 100644 index 0000000..f804808 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group_lisp__clojure_test.clj.json @@ -0,0 +1,13 @@ +{ + "path": "tests/more_languages/group_lisp/clojure_test.clj", + "components": [ + "defprotocol P", + "defrecord Person", + "defn -main", + "ns bion.likes_trees", + "def repo-url", + "defn config", + "defmacro with-os", + "defrecord SetFullElement" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group_lisp__racket_struct.rkt.json b/tests/golden/legacy/components/tests__more_languages__group_lisp__racket_struct.rkt.json new file mode 100644 index 0000000..3b3ef8d --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group_lisp__racket_struct.rkt.json @@ -0,0 +1,6 @@ +{ + "path": "tests/more_languages/group_lisp/racket_struct.rkt", + "components": [ + "struct point" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group_lisp__test_scheme.scm.json b/tests/golden/legacy/components/tests__more_languages__group_lisp__test_scheme.scm.json new file mode 100644 index 0000000..444dfd3 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group_lisp__test_scheme.scm.json @@ -0,0 +1,11 @@ +{ + "path": "tests/more_languages/group_lisp/test_scheme.scm", + "components": [ + "define topological-sort", + " define table", + " define queue", + " define result", + " define set-up", + " define traverse" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group_todo__AAPLShaders.metal.json b/tests/golden/legacy/components/tests__more_languages__group_todo__AAPLShaders.metal.json new file mode 100644 index 0000000..bfe72ca --- /dev/null +++ 
b/tests/golden/legacy/components/tests__more_languages__group_todo__AAPLShaders.metal.json @@ -0,0 +1,30 @@ +{ + "path": "tests/more_languages/group_todo/AAPLShaders.metal", + "components": [ + "struct LightingParameters", + "float Geometry(float Ndotv, float alphaG)", + "float3 computeNormalMap(ColorInOut in, texture2d normalMapTexture)", + "float3 computeDiffuse(LightingParameters parameters)", + "float Distribution(float NdotH, float roughness)", + "float3 computeSpecular(LightingParameters parameters)", + "float4 equirectangularSample(float3 direction, sampler s, texture2d image)", + "LightingParameters calculateParameters(ColorInOut in,\n AAPLCameraData cameraData,\n constant AAPLLightData& lightData,\n texture2d baseColorMap,\n texture2d normalMap,\n texture2d metallicMap,\n texture2d roughnessMap,\n texture2d ambientOcclusionMap,\n texture2d skydomeMap)", + "struct SkyboxVertex", + "struct SkyboxV2F", + "vertex SkyboxV2F skyboxVertex(SkyboxVertex in [[stage_in]],\n constant AAPLCameraData& cameraData [[buffer(BufferIndexCameraData)]])", + "fragment float4 skyboxFragment(SkyboxV2F v [[stage_in]], texture2d skytexture [[texture(0)]])", + "vertex ColorInOut vertexShader(Vertex in [[stage_in]],\n constant AAPLInstanceTransform& instanceTransform [[ buffer(BufferIndexInstanceTransforms) ]],\n constant AAPLCameraData& cameraData [[ buffer(BufferIndexCameraData) ]])", + "float2 calculateScreenCoord( float3 ndcpos )", + "fragment float4 fragmentShader(\n ColorInOut in [[stage_in]],\n constant AAPLCameraData& cameraData [[ buffer(BufferIndexCameraData) ]],\n constant AAPLLightData& lightData [[ buffer(BufferIndexLightData) ]],\n constant AAPLSubmeshKeypath&submeshKeypath [[ buffer(BufferIndexSubmeshKeypath)]],\n constant Scene* pScene [[ buffer(SceneIndex)]],\n texture2d skydomeMap [[ texture(AAPLSkyDomeTexture) ]],\n texture2d rtReflections [[ texture(AAPLTextureIndexReflections), function_constant(is_raytracing_enabled)]])", + "fragment float4 
reflectionShader(ColorInOut in [[stage_in]],\n texture2d rtReflections [[texture(AAPLTextureIndexReflections)]])", + "struct ThinGBufferOut", + "fragment ThinGBufferOut gBufferFragmentShader(ColorInOut in [[stage_in]])", + "kernel void rtReflection(\n texture2d< float, access::write > outImage [[texture(OutImageIndex)]],\n texture2d< float > positions [[texture(ThinGBufferPositionIndex)]],\n texture2d< float > directions [[texture(ThinGBufferDirectionIndex)]],\n texture2d< float > skydomeMap [[texture(AAPLSkyDomeTexture)]],\n constant AAPLInstanceTransform* instanceTransforms [[buffer(BufferIndexInstanceTransforms)]],\n constant AAPLCameraData& cameraData [[buffer(BufferIndexCameraData)]],\n constant AAPLLightData& lightData [[buffer(BufferIndexLightData)]],\n constant Scene* pScene [[buffer(SceneIndex)]],\n instance_acceleration_structure accelerationStructure [[buffer(AccelerationStructureIndex)]],\n uint2 tid [[thread_position_in_grid]])", + "else if ( intersection.type == raytracing::intersection_type::none )", + "struct VertexInOut", + "vertex VertexInOut vertexPassthrough( uint vid [[vertex_id]] )", + "fragment float4 fragmentPassthrough( VertexInOut in [[stage_in]], texture2d< float > tin )", + "fragment float4 fragmentBloomThreshold( VertexInOut in [[stage_in]],\n texture2d< float > tin [[texture(0)]],\n constant float* threshold [[buffer(0)]] )", + "fragment float4 fragmentPostprocessMerge( VertexInOut in [[stage_in]],\n constant float& exposure [[buffer(0)]],\n texture2d< float > texture0 [[texture(0)]],\n texture2d< float > texture1 [[texture(1)]])" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group_todo__crystal_test.cr.json b/tests/golden/legacy/components/tests__more_languages__group_todo__crystal_test.cr.json new file mode 100644 index 0000000..8a5ab21 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group_todo__crystal_test.cr.json @@ -0,0 +1,4 @@ +{ + "path": 
"tests/more_languages/group_todo/crystal_test.cr", + "components": [] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group_todo__dart_test.dart.json b/tests/golden/legacy/components/tests__more_languages__group_todo__dart_test.dart.json new file mode 100644 index 0000000..6fdb845 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group_todo__dart_test.dart.json @@ -0,0 +1,4 @@ +{ + "path": "tests/more_languages/group_todo/dart_test.dart", + "components": [] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group_todo__elixir_test.exs.json b/tests/golden/legacy/components/tests__more_languages__group_todo__elixir_test.exs.json new file mode 100644 index 0000000..6de993c --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group_todo__elixir_test.exs.json @@ -0,0 +1,4 @@ +{ + "path": "tests/more_languages/group_todo/elixir_test.exs", + "components": [] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group_todo__forward.frag.json b/tests/golden/legacy/components/tests__more_languages__group_todo__forward.frag.json new file mode 100644 index 0000000..8494545 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group_todo__forward.frag.json @@ -0,0 +1,4 @@ +{ + "path": "tests/more_languages/group_todo/forward.frag", + "components": [] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group_todo__forward.vert.json b/tests/golden/legacy/components/tests__more_languages__group_todo__forward.vert.json new file mode 100644 index 0000000..7d485fa --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group_todo__forward.vert.json @@ -0,0 +1,4 @@ +{ + "path": "tests/more_languages/group_todo/forward.vert", + "components": [] +} \ No newline at end of file diff --git 
a/tests/golden/legacy/components/tests__more_languages__group_todo__nodemon.json.json b/tests/golden/legacy/components/tests__more_languages__group_todo__nodemon.json.json new file mode 100644 index 0000000..a566978 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group_todo__nodemon.json.json @@ -0,0 +1,4 @@ +{ + "path": "tests/more_languages/group_todo/nodemon.json", + "components": [] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group_todo__sas_test.sas.json b/tests/golden/legacy/components/tests__more_languages__group_todo__sas_test.sas.json new file mode 100644 index 0000000..68f236a --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group_todo__sas_test.sas.json @@ -0,0 +1,4 @@ +{ + "path": "tests/more_languages/group_todo/sas_test.sas", + "components": [] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group_todo__testTypings.d.ts.json b/tests/golden/legacy/components/tests__more_languages__group_todo__testTypings.d.ts.json new file mode 100644 index 0000000..bf27acc --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group_todo__testTypings.d.ts.json @@ -0,0 +1,4 @@ +{ + "path": "tests/more_languages/group_todo/testTypings.d.ts", + "components": [] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group_todo__test_setup_py.test.json b/tests/golden/legacy/components/tests__more_languages__group_todo__test_setup_py.test.json new file mode 100644 index 0000000..2ece290 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group_todo__test_setup_py.test.json @@ -0,0 +1,4 @@ +{ + "path": "tests/more_languages/group_todo/test_setup_py.test", + "components": [] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group_todo__vba_test.bas.json 
b/tests/golden/legacy/components/tests__more_languages__group_todo__vba_test.bas.json new file mode 100644 index 0000000..758031d --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group_todo__vba_test.bas.json @@ -0,0 +1,4 @@ +{ + "path": "tests/more_languages/group_todo/vba_test.bas", + "components": [] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__more_languages__group_todo__wgsl_test.wgsl.json b/tests/golden/legacy/components/tests__more_languages__group_todo__wgsl_test.wgsl.json new file mode 100644 index 0000000..91e72f5 --- /dev/null +++ b/tests/golden/legacy/components/tests__more_languages__group_todo__wgsl_test.wgsl.json @@ -0,0 +1,8 @@ +{ + "path": "tests/more_languages/group_todo/wgsl_test.wgsl", + "components": [ + "@binding(0) @group(0) var frame : u32", + "@vertex\nfn vtx_main(@builtin(vertex_index) vertex_index : u32) -> @builtin(position) vec4f", + "@fragment\nfn frag_main() -> @location(0) vec4f" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__path_to_test__class_method_type.py.json b/tests/golden/legacy/components/tests__path_to_test__class_method_type.py.json new file mode 100644 index 0000000..6faaf76 --- /dev/null +++ b/tests/golden/legacy/components/tests__path_to_test__class_method_type.py.json @@ -0,0 +1,33 @@ +{ + "path": "tests/path_to_test/class_method_type.py", + "components": [ + "T = TypeVar(\"T\")", + "def parse_py(contents: str) -> List[str]", + "class MyClass", + " @staticmethod\n def physical_element_aval(dtype) -> core.ShapedArray", + " def my_method(self)", + " @staticmethod\n def my_typed_method(obj: dict) -> int", + " def my_multiline_signature_method(\n self,\n alice: str = None,\n bob: int = None,\n ) -> tuple", + "@lru_cache(maxsize=None)\ndef my_multiline_signature_function(\n tree: tuple = (),\n plus: str = \"+\",\n) -> tuple", + "class LogLevelEnum(str, Enum)", + " CRITICAL = \"CRITICAL\"", + " GREETING = \"GREETING\"", + " 
WARNING = \"WARNING\"", + " ERROR = \"ERROR\"", + " DEBUG = \"DEBUG\"", + " INFO = \"INFO\"", + " OFF = \"OFF\"", + "class Thingy(BaseModel)", + " metric: float", + "@dataclass\nclass TestDataclass", + " tree: str", + "A = TypeVar(\"A\", str, bytes)", + "def omega_yikes(file: str, expected: List[str]) -> bool", + "def ice[T](args: Iterable[T] = ())", + "class list[T]", + " def __getitem__(self, index: int, /) -> T", + " @classmethod\n def from_code(cls, toolbox, code: bytes, score=None) -> \"Thingy\"", + " @classmethod\n def from_str(cls, toolbox, string: str, score=None) -> \"Thingy\"", + "class Router(hk.Module)" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__path_to_test__empty.py.json b/tests/golden/legacy/components/tests__path_to_test__empty.py.json new file mode 100644 index 0000000..c297b5f --- /dev/null +++ b/tests/golden/legacy/components/tests__path_to_test__empty.py.json @@ -0,0 +1,4 @@ +{ + "path": "tests/path_to_test/empty.py", + "components": [] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__path_to_test__file.md.json b/tests/golden/legacy/components/tests__path_to_test__file.md.json new file mode 100644 index 0000000..8da3d5f --- /dev/null +++ b/tests/golden/legacy/components/tests__path_to_test__file.md.json @@ -0,0 +1,6 @@ +{ + "path": "tests/path_to_test/file.md", + "components": [ + "# Hello, world!" 
+ ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__path_to_test__file.py.json b/tests/golden/legacy/components/tests__path_to_test__file.py.json new file mode 100644 index 0000000..479f3b5 --- /dev/null +++ b/tests/golden/legacy/components/tests__path_to_test__file.py.json @@ -0,0 +1,6 @@ +{ + "path": "tests/path_to_test/file.py", + "components": [ + "def hello_world()" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__path_to_test__file.txt.json b/tests/golden/legacy/components/tests__path_to_test__file.txt.json new file mode 100644 index 0000000..40c85e2 --- /dev/null +++ b/tests/golden/legacy/components/tests__path_to_test__file.txt.json @@ -0,0 +1,4 @@ +{ + "path": "tests/path_to_test/file.txt", + "components": [] +} \ No newline at end of file diff --git a/tests/golden/legacy/components/tests__path_to_test__version.py.json b/tests/golden/legacy/components/tests__path_to_test__version.py.json new file mode 100644 index 0000000..0383bdb --- /dev/null +++ b/tests/golden/legacy/components/tests__path_to_test__version.py.json @@ -0,0 +1,6 @@ +{ + "path": "tests/path_to_test/version.py", + "components": [ + "__version__ = \"1.2.3\"" + ] +} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__dot_dot__my_test_file.py.json b/tests/golden/legacy/counts/tests__dot_dot__my_test_file.py.json new file mode 100644 index 0000000..420bcb8 --- /dev/null +++ b/tests/golden/legacy/counts/tests__dot_dot__my_test_file.py.json @@ -0,0 +1 @@ +{"path": "tests/dot_dot/my_test_file.py", "count": {"n_tokens": 7, "n_lines": 2}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__dot_dot__nested_dir__.env.test.json b/tests/golden/legacy/counts/tests__dot_dot__nested_dir__.env.test.json new file mode 100644 index 0000000..e978f74 --- /dev/null +++ b/tests/golden/legacy/counts/tests__dot_dot__nested_dir__.env.test.json @@ -0,0 +1 @@ +{"path": 
"tests/dot_dot/nested_dir/.env.test", "count": {"n_tokens": 4, "n_lines": 0}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__dot_dot__nested_dir__pytest.ini.json b/tests/golden/legacy/counts/tests__dot_dot__nested_dir__pytest.ini.json new file mode 100644 index 0000000..9014079 --- /dev/null +++ b/tests/golden/legacy/counts/tests__dot_dot__nested_dir__pytest.ini.json @@ -0,0 +1 @@ +{"path": "tests/dot_dot/nested_dir/pytest.ini", "count": {"n_tokens": 20, "n_lines": 4}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__dot_dot__nested_dir__test_tp_dotdot.py.json b/tests/golden/legacy/counts/tests__dot_dot__nested_dir__test_tp_dotdot.py.json new file mode 100644 index 0000000..a7a8ceb --- /dev/null +++ b/tests/golden/legacy/counts/tests__dot_dot__nested_dir__test_tp_dotdot.py.json @@ -0,0 +1 @@ +{"path": "tests/dot_dot/nested_dir/test_tp_dotdot.py", "count": {"n_tokens": 362, "n_lines": 52}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group1__CUSTOMER-INVOICE.CBL.json b/tests/golden/legacy/counts/tests__more_languages__group1__CUSTOMER-INVOICE.CBL.json new file mode 100644 index 0000000..94c34f9 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group1__CUSTOMER-INVOICE.CBL.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group1/CUSTOMER-INVOICE.CBL", "count": {"n_tokens": 412, "n_lines": 60}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group1__JavaTest.java.json b/tests/golden/legacy/counts/tests__more_languages__group1__JavaTest.java.json new file mode 100644 index 0000000..d609ed8 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group1__JavaTest.java.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group1/JavaTest.java", "count": {"n_tokens": 578, "n_lines": 86}} \ No newline at end of file diff --git 
a/tests/golden/legacy/counts/tests__more_languages__group1__JuliaTest.jl.json b/tests/golden/legacy/counts/tests__more_languages__group1__JuliaTest.jl.json new file mode 100644 index 0000000..d80c234 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group1__JuliaTest.jl.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group1/JuliaTest.jl", "count": {"n_tokens": 381, "n_lines": 63}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group1__KotlinTest.kt.json b/tests/golden/legacy/counts/tests__more_languages__group1__KotlinTest.kt.json new file mode 100644 index 0000000..a3ecb30 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group1__KotlinTest.kt.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group1/KotlinTest.kt", "count": {"n_tokens": 974, "n_lines": 171}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group1__LuaTest.lua.json b/tests/golden/legacy/counts/tests__more_languages__group1__LuaTest.lua.json new file mode 100644 index 0000000..3cfe8ec --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group1__LuaTest.lua.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group1/LuaTest.lua", "count": {"n_tokens": 83, "n_lines": 16}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group1__ObjectiveCTest.m.json b/tests/golden/legacy/counts/tests__more_languages__group1__ObjectiveCTest.m.json new file mode 100644 index 0000000..aeeab27 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group1__ObjectiveCTest.m.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group1/ObjectiveCTest.m", "count": {"n_tokens": 62, "n_lines": 16}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group1__OcamlTest.ml.json b/tests/golden/legacy/counts/tests__more_languages__group1__OcamlTest.ml.json new file mode 100644 index 0000000..9746ea6 
--- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group1__OcamlTest.ml.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group1/OcamlTest.ml", "count": {"n_tokens": 49, "n_lines": 12}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group1__addamt.cobol.json b/tests/golden/legacy/counts/tests__more_languages__group1__addamt.cobol.json new file mode 100644 index 0000000..1771956 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group1__addamt.cobol.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group1/addamt.cobol", "count": {"n_tokens": 441, "n_lines": 40}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group1__lesson.cbl.json b/tests/golden/legacy/counts/tests__more_languages__group1__lesson.cbl.json new file mode 100644 index 0000000..6d5f22c --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group1__lesson.cbl.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group1/lesson.cbl", "count": {"n_tokens": 635, "n_lines": 78}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group1__test.js.json b/tests/golden/legacy/counts/tests__more_languages__group1__test.js.json new file mode 100644 index 0000000..f36977c --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group1__test.js.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group1/test.js", "count": {"n_tokens": 757, "n_lines": 154}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group1__test.ts.json b/tests/golden/legacy/counts/tests__more_languages__group1__test.ts.json new file mode 100644 index 0000000..1e61cdc --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group1__test.ts.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group1/test.ts", "count": {"n_tokens": 832, "n_lines": 165}} \ No newline at end of file diff --git 
a/tests/golden/legacy/counts/tests__more_languages__group2__PerlTest.pl.json b/tests/golden/legacy/counts/tests__more_languages__group2__PerlTest.pl.json new file mode 100644 index 0000000..ffbf1c6 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group2__PerlTest.pl.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group2/PerlTest.pl", "count": {"n_tokens": 63, "n_lines": 20}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group2__PhpTest.php.json b/tests/golden/legacy/counts/tests__more_languages__group2__PhpTest.php.json new file mode 100644 index 0000000..04772fc --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group2__PhpTest.php.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group2/PhpTest.php", "count": {"n_tokens": 70, "n_lines": 19}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group2__PowershellTest.ps1.json b/tests/golden/legacy/counts/tests__more_languages__group2__PowershellTest.ps1.json new file mode 100644 index 0000000..2b51259 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group2__PowershellTest.ps1.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group2/PowershellTest.ps1", "count": {"n_tokens": 459, "n_lines": 89}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group2__ScalaTest.scala.json b/tests/golden/legacy/counts/tests__more_languages__group2__ScalaTest.scala.json new file mode 100644 index 0000000..e56d238 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group2__ScalaTest.scala.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group2/ScalaTest.scala", "count": {"n_tokens": 171, "n_lines": 40}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group2__apl_test.apl.json b/tests/golden/legacy/counts/tests__more_languages__group2__apl_test.apl.json new file mode 100644 index 
0000000..bffd8b0 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group2__apl_test.apl.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group2/apl_test.apl", "count": {"n_tokens": 28, "n_lines": 5}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group2__c_test.c.json b/tests/golden/legacy/counts/tests__more_languages__group2__c_test.c.json new file mode 100644 index 0000000..4a7a6b3 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group2__c_test.c.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group2/c_test.c", "count": {"n_tokens": 837, "n_lines": 142}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group2__go_test.go.json b/tests/golden/legacy/counts/tests__more_languages__group2__go_test.go.json new file mode 100644 index 0000000..5dbdc0c --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group2__go_test.go.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group2/go_test.go", "count": {"n_tokens": 179, "n_lines": 46}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group2__test.csv.json b/tests/golden/legacy/counts/tests__more_languages__group2__test.csv.json new file mode 100644 index 0000000..cde6a1f --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group2__test.csv.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group2/test.csv", "count": null} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group3__bash_test.sh.json b/tests/golden/legacy/counts/tests__more_languages__group3__bash_test.sh.json new file mode 100644 index 0000000..dde9049 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group3__bash_test.sh.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group3/bash_test.sh", "count": {"n_tokens": 127, "n_lines": 22}} \ No newline at end of file diff --git 
a/tests/golden/legacy/counts/tests__more_languages__group3__cpp_test.cpp.json b/tests/golden/legacy/counts/tests__more_languages__group3__cpp_test.cpp.json new file mode 100644 index 0000000..aa377d1 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group3__cpp_test.cpp.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group3/cpp_test.cpp", "count": {"n_tokens": 1670, "n_lines": 259}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group3__csharp_test.cs.json b/tests/golden/legacy/counts/tests__more_languages__group3__csharp_test.cs.json new file mode 100644 index 0000000..8b73841 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group3__csharp_test.cs.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group3/csharp_test.cs", "count": {"n_tokens": 957, "n_lines": 146}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group3__hallucination.tex.json b/tests/golden/legacy/counts/tests__more_languages__group3__hallucination.tex.json new file mode 100644 index 0000000..d3986c7 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group3__hallucination.tex.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group3/hallucination.tex", "count": {"n_tokens": 1633, "n_lines": 126}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group3__ruby_test.rb.json b/tests/golden/legacy/counts/tests__more_languages__group3__ruby_test.rb.json new file mode 100644 index 0000000..fccfc0c --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group3__ruby_test.rb.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group3/ruby_test.rb", "count": {"n_tokens": 138, "n_lines": 37}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group3__swift_test.swift.json b/tests/golden/legacy/counts/tests__more_languages__group3__swift_test.swift.json new file mode 
100644 index 0000000..09aabd8 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group3__swift_test.swift.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group3/swift_test.swift", "count": {"n_tokens": 469, "n_lines": 110}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group3__test.capnp.json b/tests/golden/legacy/counts/tests__more_languages__group3__test.capnp.json new file mode 100644 index 0000000..ca14f50 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group3__test.capnp.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group3/test.capnp", "count": {"n_tokens": 117, "n_lines": 30}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group3__test.graphql.json b/tests/golden/legacy/counts/tests__more_languages__group3__test.graphql.json new file mode 100644 index 0000000..f4b1aa9 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group3__test.graphql.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group3/test.graphql", "count": {"n_tokens": 66, "n_lines": 21}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group3__test.lean.json b/tests/golden/legacy/counts/tests__more_languages__group3__test.lean.json new file mode 100644 index 0000000..0a5ade6 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group3__test.lean.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group3/test.lean", "count": {"n_tokens": 289, "n_lines": 42}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group3__test.proto.json b/tests/golden/legacy/counts/tests__more_languages__group3__test.proto.json new file mode 100644 index 0000000..3443075 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group3__test.proto.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group3/test.proto", "count": {"n_tokens": 142, "n_lines": 
34}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group3__test.sqlite.json b/tests/golden/legacy/counts/tests__more_languages__group3__test.sqlite.json new file mode 100644 index 0000000..799c090 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group3__test.sqlite.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group3/test.sqlite", "count": null} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group3__test_Cargo.toml.json b/tests/golden/legacy/counts/tests__more_languages__group3__test_Cargo.toml.json new file mode 100644 index 0000000..0a5a448 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group3__test_Cargo.toml.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group3/test_Cargo.toml", "count": {"n_tokens": 119, "n_lines": 18}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group3__test_json_rpc_2_0.json.json b/tests/golden/legacy/counts/tests__more_languages__group3__test_json_rpc_2_0.json.json new file mode 100644 index 0000000..91d3368 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group3__test_json_rpc_2_0.json.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group3/test_json_rpc_2_0.json", "count": {"n_tokens": 26, "n_lines": 6}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group3__test_openapi.yaml.json b/tests/golden/legacy/counts/tests__more_languages__group3__test_openapi.yaml.json new file mode 100644 index 0000000..d9a8384 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group3__test_openapi.yaml.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group3/test_openapi.yaml", "count": {"n_tokens": 753, "n_lines": 92}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group3__test_openrpc.json.json 
b/tests/golden/legacy/counts/tests__more_languages__group3__test_openrpc.json.json new file mode 100644 index 0000000..e8242e1 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group3__test_openrpc.json.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group3/test_openrpc.json", "count": {"n_tokens": 225, "n_lines": 44}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group3__test_pyproject.toml.json b/tests/golden/legacy/counts/tests__more_languages__group3__test_pyproject.toml.json new file mode 100644 index 0000000..3bba2c7 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group3__test_pyproject.toml.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group3/test_pyproject.toml", "count": {"n_tokens": 304, "n_lines": 39}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group4__RTest.R.json b/tests/golden/legacy/counts/tests__more_languages__group4__RTest.R.json new file mode 100644 index 0000000..5ed60a1 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group4__RTest.R.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group4/RTest.R", "count": {"n_tokens": 367, "n_lines": 46}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group4__erl_test.erl.json b/tests/golden/legacy/counts/tests__more_languages__group4__erl_test.erl.json new file mode 100644 index 0000000..24cbc9b --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group4__erl_test.erl.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group4/erl_test.erl", "count": {"n_tokens": 480, "n_lines": 68}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group4__haskell_test.hs.json b/tests/golden/legacy/counts/tests__more_languages__group4__haskell_test.hs.json new file mode 100644 index 0000000..834fc76 --- /dev/null +++ 
b/tests/golden/legacy/counts/tests__more_languages__group4__haskell_test.hs.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group4/haskell_test.hs", "count": {"n_tokens": 414, "n_lines": 41}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group4__mathematica_test.nb.json b/tests/golden/legacy/counts/tests__more_languages__group4__mathematica_test.nb.json new file mode 100644 index 0000000..78d9c43 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group4__mathematica_test.nb.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group4/mathematica_test.nb", "count": {"n_tokens": 133, "n_lines": 21}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group4__matlab_test.m.json b/tests/golden/legacy/counts/tests__more_languages__group4__matlab_test.m.json new file mode 100644 index 0000000..eac0086 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group4__matlab_test.m.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group4/matlab_test.m", "count": {"n_tokens": 48, "n_lines": 12}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group4__rust_test.rs.json b/tests/golden/legacy/counts/tests__more_languages__group4__rust_test.rs.json new file mode 100644 index 0000000..1de30f6 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group4__rust_test.rs.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group4/rust_test.rs", "count": {"n_tokens": 1368, "n_lines": 259}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group4__test.zig.json b/tests/golden/legacy/counts/tests__more_languages__group4__test.zig.json new file mode 100644 index 0000000..ecfb7f3 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group4__test.zig.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group4/test.zig", "count": {"n_tokens": 397, "n_lines": 60}} \ 
No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group4__test_fsharp.fs.json b/tests/golden/legacy/counts/tests__more_languages__group4__test_fsharp.fs.json new file mode 100644 index 0000000..f02c099 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group4__test_fsharp.fs.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group4/test_fsharp.fs", "count": {"n_tokens": 92, "n_lines": 27}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group4__test_tcl_tk.tcl.json b/tests/golden/legacy/counts/tests__more_languages__group4__test_tcl_tk.tcl.json new file mode 100644 index 0000000..7dc2a06 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group4__test_tcl_tk.tcl.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group4/test_tcl_tk.tcl", "count": {"n_tokens": 54, "n_lines": 16}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group4__tf_test.tf.json b/tests/golden/legacy/counts/tests__more_languages__group4__tf_test.tf.json new file mode 100644 index 0000000..79a7efe --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group4__tf_test.tf.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group4/tf_test.tf", "count": {"n_tokens": 202, "n_lines": 38}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group5__Makefile.json b/tests/golden/legacy/counts/tests__more_languages__group5__Makefile.json new file mode 100644 index 0000000..6d78ef9 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group5__Makefile.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group5/Makefile", "count": {"n_tokens": 714, "n_lines": 84}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group5__ansible_test.yml.json b/tests/golden/legacy/counts/tests__more_languages__group5__ansible_test.yml.json new file mode 100644 
index 0000000..849c20e --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group5__ansible_test.yml.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group5/ansible_test.yml", "count": {"n_tokens": 55, "n_lines": 14}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group5__app-routing.module.ts.json b/tests/golden/legacy/counts/tests__more_languages__group5__app-routing.module.ts.json new file mode 100644 index 0000000..b87fca8 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group5__app-routing.module.ts.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group5/app-routing.module.ts", "count": {"n_tokens": 287, "n_lines": 28}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group5__app.component.spec.ts.json b/tests/golden/legacy/counts/tests__more_languages__group5__app.component.spec.ts.json new file mode 100644 index 0000000..ee298bf --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group5__app.component.spec.ts.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group5/app.component.spec.ts", "count": {"n_tokens": 410, "n_lines": 47}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group5__app.component.ts.json b/tests/golden/legacy/counts/tests__more_languages__group5__app.component.ts.json new file mode 100644 index 0000000..13f3ef0 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group5__app.component.ts.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group5/app.component.ts", "count": {"n_tokens": 271, "n_lines": 45}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group5__app.module.ts.json b/tests/golden/legacy/counts/tests__more_languages__group5__app.module.ts.json new file mode 100644 index 0000000..e07b2b5 --- /dev/null +++ 
b/tests/golden/legacy/counts/tests__more_languages__group5__app.module.ts.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group5/app.module.ts", "count": {"n_tokens": 374, "n_lines": 43}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group5__checkbox_test.md.json b/tests/golden/legacy/counts/tests__more_languages__group5__checkbox_test.md.json new file mode 100644 index 0000000..ff3ddd4 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group5__checkbox_test.md.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group5/checkbox_test.md", "count": {"n_tokens": 191, "n_lines": 29}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group5__checkbox_test.txt.json b/tests/golden/legacy/counts/tests__more_languages__group5__checkbox_test.txt.json new file mode 100644 index 0000000..d284fb3 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group5__checkbox_test.txt.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group5/checkbox_test.txt", "count": {"n_tokens": 257, "n_lines": 33}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group5__environment.test.ts.json b/tests/golden/legacy/counts/tests__more_languages__group5__environment.test.ts.json new file mode 100644 index 0000000..cc69a6f --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group5__environment.test.ts.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group5/environment.test.ts", "count": {"n_tokens": 197, "n_lines": 19}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group5__hello_world.pyi.json b/tests/golden/legacy/counts/tests__more_languages__group5__hello_world.pyi.json new file mode 100644 index 0000000..437c324 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group5__hello_world.pyi.json @@ -0,0 +1 @@ +{"path": 
"tests/more_languages/group5/hello_world.pyi", "count": {"n_tokens": 22, "n_lines": 3}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group5__k8s_test.yaml.json b/tests/golden/legacy/counts/tests__more_languages__group5__k8s_test.yaml.json new file mode 100644 index 0000000..4292243 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group5__k8s_test.yaml.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group5/k8s_test.yaml", "count": {"n_tokens": 140, "n_lines": 37}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group5__requirements_test.txt.json b/tests/golden/legacy/counts/tests__more_languages__group5__requirements_test.txt.json new file mode 100644 index 0000000..181e2dc --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group5__requirements_test.txt.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group5/requirements_test.txt", "count": {"n_tokens": 29, "n_lines": 10}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group5__rust_todo_test.rs.json b/tests/golden/legacy/counts/tests__more_languages__group5__rust_todo_test.rs.json new file mode 100644 index 0000000..1abd88d --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group5__rust_todo_test.rs.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group5/rust_todo_test.rs", "count": {"n_tokens": 92, "n_lines": 26}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group5__sql_test.sql.json b/tests/golden/legacy/counts/tests__more_languages__group5__sql_test.sql.json new file mode 100644 index 0000000..24de39a --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group5__sql_test.sql.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group5/sql_test.sql", "count": {"n_tokens": 270, "n_lines": 51}} \ No newline at end of file diff --git 
a/tests/golden/legacy/counts/tests__more_languages__group5__standard-app-routing.module.ts.json b/tests/golden/legacy/counts/tests__more_languages__group5__standard-app-routing.module.ts.json new file mode 100644 index 0000000..09a2581 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group5__standard-app-routing.module.ts.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group5/standard-app-routing.module.ts", "count": {"n_tokens": 100, "n_lines": 16}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group5__test.env.json b/tests/golden/legacy/counts/tests__more_languages__group5__test.env.json new file mode 100644 index 0000000..f730c12 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group5__test.env.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group5/test.env", "count": {"n_tokens": 190, "n_lines": 25}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group5__testJsonSchema.json.json b/tests/golden/legacy/counts/tests__more_languages__group5__testJsonSchema.json.json new file mode 100644 index 0000000..9b32c46 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group5__testJsonSchema.json.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group5/testJsonSchema.json", "count": {"n_tokens": 421, "n_lines": 48}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group5__testPackage.json.json b/tests/golden/legacy/counts/tests__more_languages__group5__testPackage.json.json new file mode 100644 index 0000000..506d040 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group5__testPackage.json.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group5/testPackage.json", "count": {"n_tokens": 349, "n_lines": 43}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group5__tickets.component.ts.json 
b/tests/golden/legacy/counts/tests__more_languages__group5__tickets.component.ts.json new file mode 100644 index 0000000..6a1a7e4 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group5__tickets.component.ts.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group5/tickets.component.ts", "count": {"n_tokens": 7160, "n_lines": 903}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group6__Microsoft.PowerShell_profile.ps1.json b/tests/golden/legacy/counts/tests__more_languages__group6__Microsoft.PowerShell_profile.ps1.json new file mode 100644 index 0000000..a44b201 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group6__Microsoft.PowerShell_profile.ps1.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group6/Microsoft.PowerShell_profile.ps1", "count": {"n_tokens": 3346, "n_lines": 497}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group6__catastrophic.c.json b/tests/golden/legacy/counts/tests__more_languages__group6__catastrophic.c.json new file mode 100644 index 0000000..b06ea26 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group6__catastrophic.c.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group6/catastrophic.c", "count": {"n_tokens": 5339, "n_lines": 754}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group6__cpp_examples_impl.cc.json b/tests/golden/legacy/counts/tests__more_languages__group6__cpp_examples_impl.cc.json new file mode 100644 index 0000000..74a36c0 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group6__cpp_examples_impl.cc.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group6/cpp_examples_impl.cc", "count": {"n_tokens": 60, "n_lines": 10}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group6__cpp_examples_impl.cu.json 
b/tests/golden/legacy/counts/tests__more_languages__group6__cpp_examples_impl.cu.json new file mode 100644 index 0000000..d3d35d0 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group6__cpp_examples_impl.cu.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group6/cpp_examples_impl.cu", "count": {"n_tokens": 37, "n_lines": 10}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group6__cpp_examples_impl.h.json b/tests/golden/legacy/counts/tests__more_languages__group6__cpp_examples_impl.h.json new file mode 100644 index 0000000..87ff03e --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group6__cpp_examples_impl.h.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group6/cpp_examples_impl.h", "count": {"n_tokens": 22, "n_lines": 6}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group6__edge_case.hpp.json b/tests/golden/legacy/counts/tests__more_languages__group6__edge_case.hpp.json new file mode 100644 index 0000000..fd7eec6 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group6__edge_case.hpp.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group6/edge_case.hpp", "count": {"n_tokens": 426, "n_lines": 28}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group6__fractal.thy.json b/tests/golden/legacy/counts/tests__more_languages__group6__fractal.thy.json new file mode 100644 index 0000000..0c10d28 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group6__fractal.thy.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group6/fractal.thy", "count": {"n_tokens": 1712, "n_lines": 147}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group6__python_complex_class.py.json b/tests/golden/legacy/counts/tests__more_languages__group6__python_complex_class.py.json new file mode 100644 index 0000000..552722b --- /dev/null +++ 
b/tests/golden/legacy/counts/tests__more_languages__group6__python_complex_class.py.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group6/python_complex_class.py", "count": {"n_tokens": 10, "n_lines": 2}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group6__ramda__cloneRegExp.js.json b/tests/golden/legacy/counts/tests__more_languages__group6__ramda__cloneRegExp.js.json new file mode 100644 index 0000000..69e6c0e --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group6__ramda__cloneRegExp.js.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group6/ramda__cloneRegExp.js", "count": {"n_tokens": 173, "n_lines": 9}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group6__ramda_prop.js.json b/tests/golden/legacy/counts/tests__more_languages__group6__ramda_prop.js.json new file mode 100644 index 0000000..617c804 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group6__ramda_prop.js.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group6/ramda_prop.js", "count": {"n_tokens": 646, "n_lines": 85}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group6__tensorflow_flags.h.json b/tests/golden/legacy/counts/tests__more_languages__group6__tensorflow_flags.h.json new file mode 100644 index 0000000..8ec45ac --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group6__tensorflow_flags.h.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group6/tensorflow_flags.h", "count": {"n_tokens": 7628, "n_lines": 668}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group6__test.f.json b/tests/golden/legacy/counts/tests__more_languages__group6__test.f.json new file mode 100644 index 0000000..3342682 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group6__test.f.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group6/test.f", "count": 
{"n_tokens": 181, "n_lines": 30}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group6__torch.rst.json b/tests/golden/legacy/counts/tests__more_languages__group6__torch.rst.json new file mode 100644 index 0000000..e45381d --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group6__torch.rst.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group6/torch.rst", "count": {"n_tokens": 60, "n_lines": 8}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group6__yc.html.json b/tests/golden/legacy/counts/tests__more_languages__group6__yc.html.json new file mode 100644 index 0000000..6cf052f --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group6__yc.html.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group6/yc.html", "count": {"n_tokens": 9063, "n_lines": 169}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group7__absurdly_huge.jsonl.json b/tests/golden/legacy/counts/tests__more_languages__group7__absurdly_huge.jsonl.json new file mode 100644 index 0000000..14e1246 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group7__absurdly_huge.jsonl.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group7/absurdly_huge.jsonl", "count": {"n_tokens": 8347, "n_lines": 126}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group7__angular_crud.ts.json b/tests/golden/legacy/counts/tests__more_languages__group7__angular_crud.ts.json new file mode 100644 index 0000000..7b415c4 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group7__angular_crud.ts.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group7/angular_crud.ts", "count": {"n_tokens": 1192, "n_lines": 148}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group7__structure.py.json 
b/tests/golden/legacy/counts/tests__more_languages__group7__structure.py.json new file mode 100644 index 0000000..a72b9aa --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group7__structure.py.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group7/structure.py", "count": {"n_tokens": 400, "n_lines": 92}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group7__test.metal.json b/tests/golden/legacy/counts/tests__more_languages__group7__test.metal.json new file mode 100644 index 0000000..3ccf32a --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group7__test.metal.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group7/test.metal", "count": {"n_tokens": 272, "n_lines": 34}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group7__test.wgsl.json b/tests/golden/legacy/counts/tests__more_languages__group7__test.wgsl.json new file mode 100644 index 0000000..6abca09 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group7__test.wgsl.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group7/test.wgsl", "count": {"n_tokens": 528, "n_lines": 87}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group_lisp__LispTest.lisp.json b/tests/golden/legacy/counts/tests__more_languages__group_lisp__LispTest.lisp.json new file mode 100644 index 0000000..d20eb39 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group_lisp__LispTest.lisp.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group_lisp/LispTest.lisp", "count": {"n_tokens": 25, "n_lines": 6}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group_lisp__clojure_test.clj.json b/tests/golden/legacy/counts/tests__more_languages__group_lisp__clojure_test.clj.json new file mode 100644 index 0000000..f967933 --- /dev/null +++ 
b/tests/golden/legacy/counts/tests__more_languages__group_lisp__clojure_test.clj.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group_lisp/clojure_test.clj", "count": {"n_tokens": 682, "n_lines": 85}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group_lisp__racket_struct.rkt.json b/tests/golden/legacy/counts/tests__more_languages__group_lisp__racket_struct.rkt.json new file mode 100644 index 0000000..caef743 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group_lisp__racket_struct.rkt.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group_lisp/racket_struct.rkt", "count": {"n_tokens": 14, "n_lines": 1}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group_lisp__test_scheme.scm.json b/tests/golden/legacy/counts/tests__more_languages__group_lisp__test_scheme.scm.json new file mode 100644 index 0000000..79f8691 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group_lisp__test_scheme.scm.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group_lisp/test_scheme.scm", "count": {"n_tokens": 360, "n_lines": 44}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group_todo__AAPLShaders.metal.json b/tests/golden/legacy/counts/tests__more_languages__group_todo__AAPLShaders.metal.json new file mode 100644 index 0000000..b2ed24a --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group_todo__AAPLShaders.metal.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group_todo/AAPLShaders.metal", "count": {"n_tokens": 5780, "n_lines": 566}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group_todo__crystal_test.cr.json b/tests/golden/legacy/counts/tests__more_languages__group_todo__crystal_test.cr.json new file mode 100644 index 0000000..994d649 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group_todo__crystal_test.cr.json 
@@ -0,0 +1 @@ +{"path": "tests/more_languages/group_todo/crystal_test.cr", "count": {"n_tokens": 48, "n_lines": 15}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group_todo__dart_test.dart.json b/tests/golden/legacy/counts/tests__more_languages__group_todo__dart_test.dart.json new file mode 100644 index 0000000..12befcf --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group_todo__dart_test.dart.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group_todo/dart_test.dart", "count": {"n_tokens": 108, "n_lines": 24}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group_todo__elixir_test.exs.json b/tests/golden/legacy/counts/tests__more_languages__group_todo__elixir_test.exs.json new file mode 100644 index 0000000..e733a3d --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group_todo__elixir_test.exs.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group_todo/elixir_test.exs", "count": {"n_tokens": 39, "n_lines": 10}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group_todo__forward.frag.json b/tests/golden/legacy/counts/tests__more_languages__group_todo__forward.frag.json new file mode 100644 index 0000000..7cb73d6 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group_todo__forward.frag.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group_todo/forward.frag", "count": {"n_tokens": 739, "n_lines": 87}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group_todo__forward.vert.json b/tests/golden/legacy/counts/tests__more_languages__group_todo__forward.vert.json new file mode 100644 index 0000000..61ce20b --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group_todo__forward.vert.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group_todo/forward.vert", "count": {"n_tokens": 359, "n_lines": 48}} \ No newline at 
end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group_todo__nodemon.json.json b/tests/golden/legacy/counts/tests__more_languages__group_todo__nodemon.json.json new file mode 100644 index 0000000..f1517fa --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group_todo__nodemon.json.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group_todo/nodemon.json", "count": {"n_tokens": 118, "n_lines": 20}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group_todo__sas_test.sas.json b/tests/golden/legacy/counts/tests__more_languages__group_todo__sas_test.sas.json new file mode 100644 index 0000000..409afe8 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group_todo__sas_test.sas.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group_todo/sas_test.sas", "count": {"n_tokens": 97, "n_lines": 22}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group_todo__testTypings.d.ts.json b/tests/golden/legacy/counts/tests__more_languages__group_todo__testTypings.d.ts.json new file mode 100644 index 0000000..b9fceca --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group_todo__testTypings.d.ts.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group_todo/testTypings.d.ts", "count": {"n_tokens": 158, "n_lines": 23}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group_todo__test_setup_py.test.json b/tests/golden/legacy/counts/tests__more_languages__group_todo__test_setup_py.test.json new file mode 100644 index 0000000..6d094ea --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group_todo__test_setup_py.test.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group_todo/test_setup_py.test", "count": {"n_tokens": 133, "n_lines": 24}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group_todo__vba_test.bas.json 
b/tests/golden/legacy/counts/tests__more_languages__group_todo__vba_test.bas.json new file mode 100644 index 0000000..eb12356 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group_todo__vba_test.bas.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group_todo/vba_test.bas", "count": {"n_tokens": 67, "n_lines": 16}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__more_languages__group_todo__wgsl_test.wgsl.json b/tests/golden/legacy/counts/tests__more_languages__group_todo__wgsl_test.wgsl.json new file mode 100644 index 0000000..5cff913 --- /dev/null +++ b/tests/golden/legacy/counts/tests__more_languages__group_todo__wgsl_test.wgsl.json @@ -0,0 +1 @@ +{"path": "tests/more_languages/group_todo/wgsl_test.wgsl", "count": {"n_tokens": 94, "n_lines": 17}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__path_to_test__class_method_type.py.json b/tests/golden/legacy/counts/tests__path_to_test__class_method_type.py.json new file mode 100644 index 0000000..3f64fea --- /dev/null +++ b/tests/golden/legacy/counts/tests__path_to_test__class_method_type.py.json @@ -0,0 +1 @@ +{"path": "tests/path_to_test/class_method_type.py", "count": {"n_tokens": 525, "n_lines": 101}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__path_to_test__empty.py.json b/tests/golden/legacy/counts/tests__path_to_test__empty.py.json new file mode 100644 index 0000000..473ba6d --- /dev/null +++ b/tests/golden/legacy/counts/tests__path_to_test__empty.py.json @@ -0,0 +1 @@ +{"path": "tests/path_to_test/empty.py", "count": {"n_tokens": 0, "n_lines": 0}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__path_to_test__file.md.json b/tests/golden/legacy/counts/tests__path_to_test__file.md.json new file mode 100644 index 0000000..a7af78b --- /dev/null +++ b/tests/golden/legacy/counts/tests__path_to_test__file.md.json @@ -0,0 +1 @@ +{"path": "tests/path_to_test/file.md", "count": 
{"n_tokens": 11, "n_lines": 2}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__path_to_test__file.py.json b/tests/golden/legacy/counts/tests__path_to_test__file.py.json new file mode 100644 index 0000000..241f305 --- /dev/null +++ b/tests/golden/legacy/counts/tests__path_to_test__file.py.json @@ -0,0 +1 @@ +{"path": "tests/path_to_test/file.py", "count": {"n_tokens": 18, "n_lines": 3}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__path_to_test__file.txt.json b/tests/golden/legacy/counts/tests__path_to_test__file.txt.json new file mode 100644 index 0000000..d14312a --- /dev/null +++ b/tests/golden/legacy/counts/tests__path_to_test__file.txt.json @@ -0,0 +1 @@ +{"path": "tests/path_to_test/file.txt", "count": {"n_tokens": 10, "n_lines": 2}} \ No newline at end of file diff --git a/tests/golden/legacy/counts/tests__path_to_test__version.py.json b/tests/golden/legacy/counts/tests__path_to_test__version.py.json new file mode 100644 index 0000000..6270cd3 --- /dev/null +++ b/tests/golden/legacy/counts/tests__path_to_test__version.py.json @@ -0,0 +1 @@ +{"path": "tests/path_to_test/version.py", "count": {"n_tokens": 13, "n_lines": 2}} \ No newline at end of file diff --git a/tests/golden/legacy/trees/dot_dot.txt b/tests/golden/legacy/trees/dot_dot.txt new file mode 100644 index 0000000..d6db06d --- /dev/null +++ b/tests/golden/legacy/trees/dot_dot.txt @@ -0,0 +1,10 @@ +📁 dot_dot (2 folders, 4 files) +├── 📄 my_test_file.py (7 tokens, 2 lines) +│ └── def dot_dot_dot() +└── 📁 nested_dir (1 folder, 3 files) + ├── 📄 .env.test (4 tokens, 0 lines) + │ └── DEBUG_TREE_PLUS + ├── 📄 pytest.ini (20 tokens, 4 lines) + └── 📄 test_tp_dotdot.py (362 tokens, 52 lines) + ├── def ignore_tokens_lines_test(text: str) -> str + └── def test_tree_plus_dotdot() diff --git a/tests/golden/legacy/trees/more_languages.txt b/tests/golden/legacy/trees/more_languages.txt new file mode 100644 index 0000000..f199132 --- /dev/null +++ 
b/tests/golden/legacy/trees/more_languages.txt @@ -0,0 +1,2225 @@ +📁 more_languages (10 folders, 99 files) +├── 📁 group1 (1 folder, 11 files) +│ ├── 📄 addamt.cobol (441 tokens, 40 lines) +│ │ ├── IDENTIFICATION DIVISION. +│ │ ├── PROGRAM-ID. +│ │ │ ADDAMT. +│ │ ├── DATA DIVISION. +│ │ ├── WORKING-STORAGE SECTION. +│ │ ├── 01 KEYED-INPUT. +│ │ ├── 05 CUST-NO-IN. +│ │ ├── 05 AMT1-IN. +│ │ ├── 05 AMT2-IN. +│ │ ├── 05 AMT3-IN. +│ │ ├── 01 DISPLAYED-OUTPUT. +│ │ ├── 05 CUST-NO-OUT. +│ │ ├── 05 TOTAL-OUT. +│ │ ├── 01 MORE-DATA. +│ │ ├── PROCEDURE DIVISION. +│ │ └── 100-MAIN. +│ ├── 📄 CUSTOMER-INVOICE.CBL (412 tokens, 60 lines) +│ │ ├── IDENTIFICATION DIVISION. +│ │ ├── PROGRAM-ID. CUSTOMER-INVOICE. +│ │ ├── AUTHOR. JANE DOE. +│ │ ├── DATE. 2023-12-30. +│ │ ├── DATE-COMPILED. 06/30/10. +│ │ ├── DATE-WRITTEN. 12/34/56. +│ │ ├── ENVIRONMENT DIVISION. +│ │ ├── INPUT-OUTPUT SECTION. +│ │ ├── FILE-CONTROL. +│ │ ├── SELECT CUSTOMER-FILE. +│ │ ├── SELECT INVOICE-FILE. +│ │ ├── SELECT REPORT-FILE. +│ │ ├── DATA DIVISION. +│ │ ├── FILE SECTION. +│ │ ├── FD CUSTOMER-FILE. +│ │ ├── 01 CUSTOMER-RECORD. +│ │ ├── 05 CUSTOMER-ID. +│ │ ├── 05 CUSTOMER-NAME. +│ │ ├── 05 CUSTOMER-BALANCE. +│ │ ├── FD INVOICE-FILE. +│ │ ├── 01 INVOICE-RECORD. +│ │ ├── 05 INVOICE-ID. +│ │ ├── 05 CUSTOMER-ID. +│ │ ├── 05 INVOICE-AMOUNT. +│ │ ├── FD REPORT-FILE. +│ │ ├── 01 REPORT-RECORD. +│ │ ├── WORKING-STORAGE SECTION. +│ │ ├── 01 WS-CUSTOMER-FOUND. +│ │ ├── 01 WS-END-OF-FILE. +│ │ ├── 01 WS-TOTAL-BALANCE. +│ │ ├── PROCEDURE DIVISION. +│ │ ├── 0000-MAIN-ROUTINE. +│ │ ├── 1000-PROCESS-RECORDS. +│ │ ├── 1100-UPDATE-CUSTOMER-BALANCE. +│ │ └── END PROGRAM CUSTOMER-INVOICE. 
+│ ├── 📄 JavaTest.java (578 tokens, 86 lines) +│ │ ├── abstract class LivingBeing +│ │ ├── abstract void breathe() +│ │ ├── interface Communicator +│ │ ├── String communicate() +│ │ ├── @Log +│ │ ├── @Getter +│ │ ├── @Setter +│ │ ├── class Person extends LivingBeing implements Communicator +│ │ ├── Person(String name, int age) +│ │ ├── @Override +│ │ ├── void breathe() +│ │ ├── @Override +│ │ ├── public String communicate() +│ │ ├── void greet() +│ │ ├── String personalizedGreeting(String greeting, Optional +│ │ │ includeAge) +│ │ ├── @Singleton +│ │ ├── @RestController +│ │ ├── @SpringBootApplication +│ │ ├── public class Example +│ │ ├── @Inject +│ │ ├── public Example(Person person) +│ │ ├── @RequestMapping("/greet") +│ │ ├── String home(@RequestParam(value = "name", defaultValue = +│ │ │ "World") String name, +│ │ │ @RequestParam(value = "age", defaultValue = "30") +│ │ │ int age) +│ │ └── public static void main(String[] args) +│ ├── 📄 JuliaTest.jl (381 tokens, 63 lines) +│ │ ├── module JuliaTest_EdgeCase +│ │ ├── struct Location +│ │ │ name::String +│ │ │ lat::Float32 +│ │ │ lon::Float32 +│ │ │ end +│ │ ├── mutable struct mPerson +│ │ │ name::String +│ │ │ age::Int +│ │ │ end +│ │ ├── Base.@kwdef mutable struct Param +│ │ │ Δt::Float64 = 0.1 +│ │ │ n::Int64 +│ │ │ m::Int64 +│ │ │ end +│ │ ├── sic(x,y) +│ │ ├── welcome(l::Location) +│ │ ├── ∑(α, Ω) +│ │ ├── function noob() +│ │ │ end +│ │ ├── function ye_olde(hello::String, world::Location) +│ │ │ end +│ │ ├── function multiline_greet( +│ │ │ p::mPerson, +│ │ │ greeting::String +│ │ │ ) +│ │ │ end +│ │ ├── function julia_is_awesome(prob::DiffEqBase.AbstractDAEProblem{uType, +│ │ │ duType, tType, +│ │ │ isinplace}; +│ │ │ kwargs...) 
where {uType, duType, tType, isinplace} +│ │ │ end +│ │ └── end +│ ├── 📄 KotlinTest.kt (974 tokens, 171 lines) +│ │ ├── data class Person(val name: String) +│ │ ├── fun greet(person: Person) +│ │ ├── fun processItems(items: List, processor: (T) -> Unit) +│ │ ├── interface Source +│ │ ├── fun nextT(): T +│ │ ├── fun MutableList.swap(index1: Int, index2: Int) +│ │ ├── fun Any?.toString(): String +│ │ ├── tailrec fun findFixPoint(x: Double = 1.0): Double +│ │ ├── class GenericRepository +│ │ ├── fun getItem(id: Int): T? +│ │ ├── sealed interface Error +│ │ ├── sealed class IOError(): Error +│ │ ├── object Runner +│ │ ├── inline fun , T> run() : T +│ │ ├── infix fun Int.shl(x: Int): Int +│ │ ├── class MyStringCollection +│ │ ├── infix fun add(s: String) +│ │ ├── fun build() +│ │ ├── open class Base(p: Int) +│ │ ├── class Derived(p: Int) : Base(p) +│ │ ├── open class Shape +│ │ ├── open fun draw() +│ │ ├── fun fill() +│ │ ├── open fun edge(case: Int) +│ │ ├── interface Thingy +│ │ ├── fun edge() +│ │ ├── class Circle() : Shape(), Thingy +│ │ ├── override fun draw() +│ │ ├── final override fun edge(case: Int) +│ │ ├── interface Base +│ │ ├── fun print() +│ │ ├── class BaseImpl(val x: Int) : Base +│ │ ├── override fun print() +│ │ ├── internal class Derived(b: Base) : Base by b +│ │ ├── class Person constructor(firstName: String) +│ │ ├── class People( +│ │ │ firstNames: Array, +│ │ │ ages: Array(42), +│ │ │ ) +│ │ ├── fun edgeCases(): Boolean +│ │ ├── class Alien public @Inject constructor( +│ │ │ val firstName: String, +│ │ │ val lastName: String, +│ │ │ var age: Int, +│ │ │ val pets: MutableList = mutableListOf(), +│ │ │ ) +│ │ ├── fun objectOriented(): String +│ │ ├── enum class IntArithmetics : BinaryOperator, IntBinaryOperator +│ │ ├── PLUS { +│ │ │ override fun apply(t: Int, u: Int): Int +│ │ ├── TIMES { +│ │ │ override fun apply(t: Int, u: Int): Int +│ │ ├── override fun applyAsInt(t: Int, u: Int) +│ │ ├── fun reformat( +│ │ │ str: String, +│ │ │ normalizeCase: 
Boolean = true, +│ │ │ upperCaseFirstLetter: Boolean = true, +│ │ │ divideByCamelHumps: Boolean = false, +│ │ │ wordSeparator: Char = ' ', +│ │ │ ) +│ │ ├── operator fun Point.unaryMinus() +│ │ ├── abstract class Polygon +│ │ └── abstract fun draw() +│ ├── 📄 lesson.cbl (635 tokens, 78 lines) +│ │ ├── IDENTIFICATION DIVISION. +│ │ ├── PROGRAM-ID. CBL0002. +│ │ ├── AUTHOR. Otto B. Fun. +│ │ ├── ENVIRONMENT DIVISION. +│ │ ├── INPUT-OUTPUT SECTION. +│ │ ├── FILE-CONTROL. +│ │ ├── SELECT PRINT-LINE. +│ │ ├── SELECT ACCT-REC. +│ │ ├── DATA DIVISION. +│ │ ├── FILE SECTION. +│ │ ├── FD PRINT-LINE. +│ │ ├── 01 PRINT-REC. +│ │ ├── 05 ACCT-NO-O. +│ │ ├── 05 ACCT-LIMIT-O. +│ │ ├── 05 ACCT-BALANCE-O. +│ │ ├── 05 LAST-NAME-O. +│ │ ├── 05 FIRST-NAME-O. +│ │ ├── 05 COMMENTS-O. +│ │ ├── FD ACCT-REC. +│ │ ├── 01 ACCT-FIELDS. +│ │ ├── 05 ACCT-NO. +│ │ ├── 05 ACCT-LIMIT. +│ │ ├── 05 ACCT-BALANCE. +│ │ ├── 05 LAST-NAME. +│ │ ├── 05 FIRST-NAME. +│ │ ├── 05 CLIENT-ADDR. +│ │ ├── 10 STREET-ADDR. +│ │ ├── 10 CITY-COUNTY. +│ │ ├── 10 USA-STATE. +│ │ ├── 05 RESERVED. +│ │ ├── 05 COMMENTS. +│ │ ├── WORKING-STORAGE SECTION. +│ │ ├── 01 FLAGS. +│ │ ├── 05 LASTREC. +│ │ ├── PROCEDURE DIVISION. +│ │ ├── OPEN-FILES. +│ │ ├── READ-NEXT-RECORD. +│ │ ├── CLOSE-STOP. +│ │ ├── READ-RECORD. +│ │ └── WRITE-RECORD. 
+│ ├── 📄 LuaTest.lua (83 tokens, 16 lines) +│ │ ├── function HelloWorld.new +│ │ ├── function HelloWorld.greet +│ │ └── function say_hello +│ ├── 📄 ObjectiveCTest.m (62 tokens, 16 lines) +│ │ ├── @interface HelloWorld +│ │ ├── @interface HelloWorld -> (void) sayHello +│ │ ├── @implementation HelloWorld +│ │ ├── @implementation HelloWorld -> (void) sayHello +│ │ └── void sayHelloWorld() +│ ├── 📄 OcamlTest.ml (49 tokens, 12 lines) +│ │ ├── type color +│ │ ├── class hello +│ │ ├── class hello -> method say_hello +│ │ └── let main () +│ ├── 📄 test.js (757 tokens, 154 lines) +│ │ ├── class MyClass +│ │ ├── myMethod() +│ │ ├── async asyncMethod(a, b) +│ │ ├── methodWithDefaultParameters(a = 5, b = 10) +│ │ ├── multilineMethod( +│ │ │ c, +│ │ │ d +│ │ │ ) +│ │ ├── multilineMethodWithDefaults( +│ │ │ t = "tree", +│ │ │ p = "plus" +│ │ │ ) +│ │ ├── function myFunction(param1, param2) +│ │ ├── function multilineFunction( +│ │ │ param1, +│ │ │ param2 +│ │ │ ) +│ │ ├── const arrowFunction = () => +│ │ ├── const parametricArrow = (a, b) => +│ │ ├── function () +│ │ ├── function outerFunction(outerParam) +│ │ ├── function innerFunction(innerParam) +│ │ ├── innerFunction("inner") +│ │ ├── const myObject = { +│ │ ├── myMethod: function (stuff) +│ │ ├── let myArrowObject = { +│ │ ├── myArrow: ({ +│ │ │ a, +│ │ │ b, +│ │ │ c, +│ │ │ }) => +│ │ ├── const myAsyncArrowFunction = async () => +│ │ ├── function functionWithRestParameters(...args) +│ │ ├── const namedFunctionExpression = function myNamedFunction() +│ │ ├── const multilineArrowFunction = ( +│ │ │ a, +│ │ │ b +│ │ │ ) => +│ │ ├── function functionReturningFunction() +│ │ ├── return function () +│ │ ├── function destructuringOnMultipleLines({ +│ │ │ a, +│ │ │ b, +│ │ │ }) +│ │ ├── const arrowFunctionWithDestructuring = ({ a, b }) => +│ │ ├── const multilineDestructuringArrow = ({ +│ │ │ a, +│ │ │ b, +│ │ │ }) => +│ │ ├── async function asyncFunctionWithErrorHandling() +│ │ ├── class Car +│ │ ├── constructor(brand) +│ │ ├── 
present() +│ │ ├── class Model extends Car +│ │ ├── constructor(brand, mod) +│ │ ├── super(brand) +│ │ └── show() +│ └── 📄 test.ts (832 tokens, 165 lines) +│ ├── type MyType +│ ├── interface MyInterface +│ ├── class TsClass +│ ├── myMethod() +│ ├── myMethodWithArgs(param1: string, param2: number): void +│ ├── static myStaticMethod(param: T): T +│ ├── multilineMethod( +│ │ c: number, +│ │ d: number +│ │ ): number +│ ├── multilineMethodWithDefaults( +│ │ t: string = "tree", +│ │ p: string = "plus" +│ │ ): string +│ ├── export class AdvancedComponent implements MyInterface +│ ├── async myAsyncMethod( +│ │ a: string, +│ │ b: number, +│ │ c: string +│ │ ): Promise +│ ├── genericMethod( +│ │ arg1: T, +│ │ arg2: U +│ │ ): [T, U] +│ ├── export class TicketsComponent implements MyInterface +│ ├── async myAsyncMethod({ a, b, c }: { a: String; b: Number; c: String +│ │ }) +│ ├── function tsFunction() +│ ├── function tsFunctionSigned( +│ │ param1: number, +│ │ param2: number +│ │ ): void +│ ├── export default async function tsFunctionComplicated({ +│ │ a = 1 | 2, +│ │ b = "bob", +│ │ c = async () => "charlie", +│ │ }: { +│ │ a: number; +│ │ b: string; +│ │ c: () => Promise; +│ │ }): Promise +│ ├── return("Standalone function with parameters") +│ ├── const tsArrowFunctionSigned = ({ +│ │ a, +│ │ b, +│ │ }: { +│ │ a: number; +│ │ b: string; +│ │ }) => +│ ├── export const tsComplicatedArrow = async ({ +│ │ a = 1 | 2, +│ │ b = "bob", +│ │ c = async () => "charlie", +│ │ }: { +│ │ a: number; +│ │ b: string; +│ │ c: () => Promise; +│ │ }): Promise => +│ ├── const arrowFunction = () => +│ ├── const arrow = (a: String, b: Number) => +│ ├── const asyncArrowFunction = async () => +│ ├── const asyncArrow = async (a: String, b: Number) => +│ ├── let weirdArrow = () => +│ ├── const asyncPromiseArrow = async (): Promise => +│ ├── let myWeirdArrowSigned = (x: number): number => +│ ├── class Person +│ ├── constructor(private firstName: string, private lastName: string) +│ ├── getFullName(): 
string +│ ├── describe(): string +│ ├── class Employee extends Person +│ ├── constructor( +│ │ firstName: string, +│ │ lastName: string, +│ │ private jobTitle: string +│ │ ) +│ ├── super(firstName, lastName) +│ ├── describe(): string +│ ├── interface Shape +│ └── interface Square extends Shape +├── 📁 group2 (1 folder, 8 files) +│ ├── 📄 apl_test.apl (28 tokens, 5 lines) +│ │ ├── :Namespace HelloWorld +│ │ ├── :Namespace HelloWorld -> hello ← 'Hello, World!' +│ │ └── :Namespace HelloWorld -> plus ← {⍺+⍵} +│ ├── 📄 c_test.c (837 tokens, 142 lines) +│ │ ├── struct Point +│ │ ├── int x; +│ │ ├── int y; +│ │ ├── struct Point getOrigin() +│ │ ├── float mul_two_floats(float x1, float x2) +│ │ ├── enum days +│ │ ├── SUN, +│ │ ├── MON, +│ │ ├── TUE, +│ │ ├── WED, +│ │ ├── THU, +│ │ ├── FRI, +│ │ ├── SAT +│ │ ├── long add_two_longs(long x1, long x2) +│ │ ├── double multiplyByTwo(double num) +│ │ ├── char getFirstCharacter(char *str) +│ │ ├── void greet(Person p) +│ │ ├── typedef struct +│ │ ├── char name[50]; +│ │ ├── } Person; +│ │ ├── int main() +│ │ ├── int* getArrayStart(int arr[], int size) +│ │ ├── long complexFunctionWithMultipleArguments( +│ │ │ int param1, +│ │ │ double param2, +│ │ │ char *param3, +│ │ │ struct Point point +│ │ │ ) +│ │ ├── keyPattern *ACLKeyPatternCreate(sds pattern, int flags) +│ │ ├── sds sdsCatPatternString(sds base, keyPattern *pat) +│ │ ├── static int ACLCheckChannelAgainstList(list *reference, const char +│ │ │ *channel, int channellen, int is_pattern) +│ │ ├── while((ln = listNext(&li))) +│ │ ├── static struct config +│ │ ├── aeEventLoop *el; +│ │ ├── cliConnInfo conn_info; +│ │ ├── const char *hostsocket; +│ │ ├── int tls; +│ │ ├── struct cliSSLconfig sslconfig; +│ │ └── } config; +│ ├── 📄 go_test.go (179 tokens, 46 lines) +│ │ ├── type Greeting struct +│ │ ├── func (g Greeting) sayHello() +│ │ ├── func createGreeting(m string) Greeting +│ │ ├── type SomethingLong struct +│ │ ├── func (s *SomethingLong) WithAReasonableName( +│ │ │ ctx 
context.Context, +│ │ │ param1 string, +│ │ │ param2 int, +│ │ │ param3 map[string]interface{}, +│ │ │ callback func(int) error, +│ │ │ ) (resultType, error) +│ │ ├── type resultType struct +│ │ └── func main() +│ ├── 📄 PerlTest.pl (63 tokens, 20 lines) +│ │ ├── package PerlTest +│ │ ├── package PerlTest -> sub new +│ │ ├── package PerlTest -> sub hello +│ │ └── package PerlTest -> sub say_hello +│ ├── 📄 PhpTest.php (70 tokens, 19 lines) +│ │ ├── class HelloWorld +│ │ ├── class HelloWorld -> function sayHello +│ │ ├── function greet +│ │ ├── class Person +│ │ └── class Person -> function __construct +│ ├── 📄 PowershellTest.ps1 (459 tokens, 89 lines) +│ │ ├── function Say-Nothing() +│ │ ├── class Person +│ │ ├── Person([string]$name) +│ │ ├── [string]Greet() +│ │ ├── [string]GreetMany([int]$times) +│ │ ├── [string]GreetWithDetails([string]$greeting, [int]$times) +│ │ ├── [string]GreetMultiline( +│ │ │ [string]$greeting, +│ │ │ [int]$times +│ │ │ ) +│ │ ├── NoReturn([int]$times) +│ │ ├── NoReturnNoArgs() +│ │ ├── function Say-Hello([Person]$person) +│ │ ├── function Multi-Hello([Person]$personA, [Person]$personB) +│ │ ├── function Switch-Item +│ │ ├── param ([switch]$on) +│ │ ├── function Get-SmallFiles +│ │ ├── param ( +│ │ │ [PSDefaultValue(Help = '100')] +│ │ │ $Size = 100) +│ │ ├── function Get-User +│ │ ├── [CmdletBinding(DefaultParameterSetName="ID")] +│ │ ├── [OutputType("System.Int32", ParameterSetName="ID")] +│ │ ├── [OutputType([String], ParameterSetName="Name")] +│ │ ├── Param ( +│ │ │ [parameter(Mandatory=$true, ParameterSetName="ID")] +│ │ │ [Int[]] +│ │ │ $UserID, +│ │ │ [parameter(Mandatory=$true, ParameterSetName="Name")] +│ │ │ [String[]] +│ │ │ $UserName) +│ │ ├── filter Get-ErrorLog ([switch]$Message) +│ │ └── function global:MultilineSignature( +│ │ [string]$param1, +│ │ [int]$param2, +│ │ [Parameter(Mandatory=$true)] +│ │ [string]$param3 +│ │ ) +│ ├── 📄 ScalaTest.scala (171 tokens, 40 lines) +│ │ ├── def sumOfSquares(x: Int, y: Int): Int +│ │ ├── 
trait Bark +│ │ ├── def bark: String +│ │ ├── case class Person(name: String) +│ │ ├── class GenericClass[T]( +│ │ │ val data: T, +│ │ │ val count: Int +│ │ │ ) +│ │ ├── def getData: T +│ │ ├── object HelloWorld +│ │ ├── def greet(person: Person): Unit +│ │ ├── def main(args: Array[String]): Unit +│ │ ├── def complexFunction( +│ │ │ a: Int, +│ │ │ b: String, +│ │ │ c: Float +│ │ │ ): (Int, String) Option +│ │ └── def sumOfSquaresShort(x: Int, y: Int): Int +│ └── 📄 test.csv (0 tokens, 0 lines) +│ ├── Name +│ ├── Age +│ ├── Country +│ ├── City +│ └── Email +├── 📁 group3 (1 folder, 16 files) +│ ├── 📄 bash_test.sh (127 tokens, 22 lines) +│ │ ├── echo_hello_world() +│ │ ├── function fun_echo_hello_world() +│ │ ├── export SECRET +│ │ ├── alias md='make debug' +│ │ ├── add_alias() +│ │ └── create_conda_env() +│ ├── 📄 cpp_test.cpp (1,670 tokens, 259 lines) +│ │ ├── class Person +│ │ ├── std::string name; +│ │ ├── public: +│ │ ├── Person(std::string n) : name(n) +│ │ ├── void greet() +│ │ ├── void globalGreet() +│ │ ├── int main() +│ │ ├── void printMessage(const std::string &message) +│ │ ├── template +│ │ │ void printVector(const std::vector& vec) +│ │ ├── struct Point +│ │ ├── int x, y; +│ │ ├── Point(int x, int y) : x(x), y(y) +│ │ ├── class Animal +│ │ ├── public: +│ │ ├── Animal(const std::string &name) : name(name) +│ │ ├── virtual void speak() const +│ │ ├── virtual ~Animal() +│ │ ├── protected: +│ │ ├── std::string name; +│ │ ├── class Dog : public Animal +│ │ ├── public: +│ │ ├── Dog(const std::string &name) : Animal(name) +│ │ ├── void speak() const override +│ │ ├── class Cat : public Animal +│ │ ├── public: +│ │ ├── Cat(const std::string &name) : Animal(name) +│ │ ├── void speak() const override +│ │ ├── nb::bytes BuildRnnDescriptor(int input_size, int hidden_size, int +│ │ │ num_layers, +│ │ │ int batch_size, int max_seq_length, +│ │ │ float dropout, +│ │ │ bool bidirectional, bool +│ │ │ cudnn_allow_tf32, +│ │ │ int workspace_size, int +│ │ │ 
reserve_space_size) +│ │ ├── int main() +│ │ ├── enum ECarTypes +│ │ ├── Sedan, +│ │ ├── Hatchback, +│ │ ├── SUV, +│ │ ├── Wagon +│ │ ├── ECarTypes GetPreferredCarType() +│ │ ├── enum ECarTypes : uint8_t +│ │ ├── Sedan, +│ │ ├── Hatchback, +│ │ ├── SUV = 254, +│ │ ├── Hybrid +│ │ ├── enum class ECarTypes : uint8_t +│ │ ├── Sedan, +│ │ ├── Hatchback, +│ │ ├── SUV = 254, +│ │ ├── Hybrid +│ │ ├── void myFunction(string fname, int age) +│ │ ├── template T cos(T) +│ │ ├── template T sin(T) +│ │ ├── template T sqrt(T) +│ │ ├── template struct VLEN +│ │ ├── template class arr +│ │ ├── private: +│ │ ├── static T *ralloc(size_t num) +│ │ ├── static void dealloc(T *ptr) +│ │ ├── static T *ralloc(size_t num) +│ │ ├── static void dealloc(T *ptr) +│ │ ├── public: +│ │ ├── arr() : p(0), sz(0) +│ │ ├── arr(size_t n) : p(ralloc(n)), sz(n) +│ │ ├── arr(arr &&other) +│ │ │ : p(other.p), sz(other.sz) +│ │ ├── ~arr() +│ │ ├── void resize(size_t n) +│ │ ├── T &operator[](size_t idx) +│ │ ├── T *data() +│ │ ├── size_t size() const +│ │ ├── class Buffer +│ │ ├── private: +│ │ ├── void* ptr_; +│ │ └── std::tuple quantize( +│ │ const array& w, +│ │ int group_size, +│ │ int bits, +│ │ StreamOrDevice s) +│ ├── 📄 csharp_test.cs (957 tokens, 146 lines) +│ │ ├── public interface IExcelTemplate +│ │ ├── void LoadTemplate(string templateFilePath) +│ │ ├── void LoadData(Dictionary data) +│ │ ├── void ModifyCell(string cellName, string value) +│ │ ├── void SaveToFile(string filePath) +│ │ ├── public interface IGreet +│ │ ├── void Greet() +│ │ ├── public enum WeekDays +│ │ ├── public delegate void DisplayMessage(string message) +│ │ ├── public struct Address +│ │ ├── public static class HelperFunctions +│ │ ├── public static void PrintMessage(string message) +│ │ ├── public static int AddNumbers(int a, int b) +│ │ ├── namespace HelloWorldApp +│ │ ├── class Person : IGreet +│ │ ├── public Person(string name, int age) +│ │ ├── public void Greet() +│ │ ├── class HelloWorld +│ │ ├── static void 
Main(string[] args) +│ │ ├── namespace TemplateToExcelServer.Template +│ │ ├── public interface ITemplateObject +│ │ ├── string[,] GetContent() +│ │ ├── string[] GetContentArray() +│ │ ├── string[] GetFormat() +│ │ ├── int? GetFormatLength() +│ │ ├── TemplateObject SetContent(string[,] Content) +│ │ ├── TemplateObject SetContentArray(string[] value) +│ │ ├── TemplateObject SetFormat(string[] Header) +│ │ ├── TemplateObject SetNameOfReport( +│ │ │ ReadOnlyMemory ReportName, +│ │ │ int[] EdgeCase) +│ │ ├── TemplateObject SetSheetName(ReadOnlyMemory SheetName) +│ │ ├── public class BankAccount(string accountID, string owner) +│ │ ├── public override string ToString() => +│ │ ├── var IncrementBy = (int source, int increment = 1) => +│ │ ├── Func add = (x, y) => +│ │ ├── button.Click += (sender, args) => +│ │ ├── public Func GetMultiplier(int factor) +│ │ ├── public void Method( +│ │ │ int param1, +│ │ │ int param2, +│ │ │ int param3, +│ │ │ int param4, +│ │ │ int param5, +│ │ │ int param6, +│ │ │ ) +│ │ ├── System.Net.ServicePointManager.ServerCertificateValidationCallback +│ │ │ += +│ │ │ (se, cert, chain, sslerror) => +│ │ ├── class ServerCertificateValidation +│ │ ├── public bool OnRemoteCertificateValidation( +│ │ │ object se, +│ │ │ X509Certificate cert, +│ │ │ X509Chain chain, +│ │ │ SslPolicyErrors sslerror +│ │ │ ) +│ │ ├── s_downloadButton.Clicked += async (o, e) => +│ │ ├── [HttpGet, Route("DotNetCount")] +│ │ └── static public async Task GetDotNetCount(string URL) +│ ├── 📄 hallucination.tex (1,633 tokens, 126 lines) +│ │ ├── Harnessing the Master Algorithm: Strategies for AI LLMs to Mitigate +│ │ │ Hallucinations +│ │ ├── Hallucinated Pedro Domingos et al. 
+│ │ ├── Christmas Eve 2023 +│ │ ├── 1 Introduction +│ │ ├── 2 Representation in LLMs +│ │ ├── 2.1 Current Representational Models +│ │ ├── 2.2 Incorporating Cognitive Structures +│ │ ├── 2.3 Conceptual Diagrams of Advanced Representational Models +│ │ ├── 3 Evaluation Strategies +│ │ ├── 3.1 Existing Evaluation Metrics for LLMs +│ │ ├── 3.2 Integrating Contextual and Ethical Considerations +│ │ ├── 3.3 Case Studies: Evaluation in Practice +│ │ ├── 4 Optimization Techniques +│ │ ├── 4.1 Continuous Learning Models +│ │ ├── 4.2 Adaptive Algorithms for Real-time Adjustments +│ │ ├── 4.3 Performance Metrics Pre- and Post-Optimization +│ │ ├── 5 Interdisciplinary Insights +│ │ ├── 5.1 Cognitive Science and AI: A Symbiotic Relationship +│ │ ├── 5.2 Learning from Human Cognitive Processes +│ │ ├── 6 Challenges and Future Directions +│ │ ├── 6.1 Addressing Current Limitations +│ │ ├── 6.2 The Road Ahead: Ethical and Practical Considerations +│ │ ├── 7 Conclusion +│ │ ├── 7.1 Summarizing Key Findings +│ │ └── 7.2 The Next Steps in AI Development +│ ├── 📄 ruby_test.rb (138 tokens, 37 lines) +│ │ ├── module Greeter +│ │ ├── def self.say_hello +│ │ ├── class HelloWorld +│ │ ├── def say_hello +│ │ ├── class Human +│ │ ├── def self.bar +│ │ ├── def self.bar=(value) +│ │ ├── class Doctor < Human +│ │ └── def brachial_plexus( +│ │ roots, +│ │ trunks, +│ │ divisions: true, +│ │ cords: [], +│ │ branches: Time.now +│ │ ) +│ ├── 📄 swift_test.swift (469 tokens, 110 lines) +│ │ ├── class Person +│ │ ├── init(name: String) +│ │ ├── func greet() +│ │ ├── func yEdgeCase( +│ │ │ fname: String, +│ │ │ lname: String, +│ │ │ age: Int, +│ │ │ address: String, +│ │ │ phoneNumber: String +│ │ │ ) +│ │ ├── func globalGreet() +│ │ ├── struct Point +│ │ ├── protocol Animal +│ │ ├── func speak() +│ │ ├── struct Dog: Animal +│ │ ├── class Cat: Animal +│ │ ├── init(name: String) +│ │ ├── func speak() +│ │ ├── enum CarType +│ │ ├── func getPreferredCarType() -> CarType +│ │ ├── enum CarType: UInt8 +│ │ 
├── enum class CarType: UInt8 +│ │ ├── func myFunction(fname: String, age: Int) +│ │ └── func myFunctionWithMultipleParameters( +│ │ fname: String, +│ │ lname: String, +│ │ age: Int, +│ │ address: String, +│ │ phoneNumber: String +│ │ ) +│ ├── 📄 test.lean (289 tokens, 42 lines) +│ │ ├── # Advanced Topics in Group Theory +│ │ ├── section GroupDynamics +│ │ ├── lemma group_stability (G : Type*) [Group G] (H : Subgroup G) +│ │ ├── theorem subgroup_closure {G : Type*} [Group G] (S : Set G) +│ │ ├── axiom group_homomorphism_preservation {G H : Type*} [Group G] [Group +│ │ │ H] (f : G → H) +│ │ ├── end GroupDynamics +│ │ ├── section ConstructiveApproach +│ │ ├── lemma finite_group_order (G : Type*) [Group G] [Fintype G] +│ │ ├── lemma complex_lemma {X Y : Type*} [SomeClass X] [AnotherClass Y] +│ │ │ (f : X → Y) (g : Y → X) +│ │ └── end ConstructiveApproach +│ ├── 📄 test.capnp (117 tokens, 30 lines) +│ │ ├── struct Employee +│ │ ├── id @0 :Int32 +│ │ ├── name @1 :Text +│ │ ├── role @2 :Text +│ │ ├── skills @3 :List(Skill) +│ │ ├── struct Skill +│ │ ├── name @0 :Text +│ │ ├── level @1 :Level +│ │ ├── enum Level +│ │ ├── beginner @0 +│ │ ├── intermediate @1 +│ │ ├── expert @2 +│ │ ├── status :union +│ │ ├── active @4 :Void +│ │ ├── onLeave @5 :Void +│ │ ├── retired @6 :Void +│ │ ├── struct Company +│ │ └── employees @0 :List(Employee) +│ ├── 📄 test.graphql (66 tokens, 21 lines) +│ │ ├── type Query +│ │ ├── getBooks: [Book] +│ │ ├── getAuthors: [Author] +│ │ ├── type Mutation +│ │ ├── addBook(title: String, author: String): Book +│ │ ├── removeBook(id: ID): Book +│ │ ├── type Book +│ │ ├── id: ID +│ │ ├── title: String +│ │ ├── author: Author +│ │ ├── type Author +│ │ ├── id: ID +│ │ ├── name: String +│ │ └── books: [Book] +│ ├── 📄 test.proto (142 tokens, 34 lines) +│ │ ├── syntax = "proto3" +│ │ ├── service EmployeeService +│ │ ├── rpc GetEmployee(EmployeeId) returns (EmployeeInfo) +│ │ ├── rpc AddEmployee(EmployeeData) returns (EmployeeInfo) +│ │ ├── rpc 
UpdateEmployee(EmployeeUpdate) returns (EmployeeInfo) +│ │ ├── message EmployeeId +│ │ ├── int32 id = 1 +│ │ ├── message EmployeeInfo +│ │ ├── int32 id = 1 +│ │ ├── string name = 2 +│ │ ├── string role = 3 +│ │ ├── message EmployeeData +│ │ ├── string name = 1 +│ │ ├── string role = 2 +│ │ ├── message EmployeeUpdate +│ │ ├── int32 id = 1 +│ │ ├── string name = 2 +│ │ └── string role = 3 +│ ├── 📄 test.sqlite (0 tokens, 0 lines) +│ │ ├── students table: +│ │ ├── id integer primary key +│ │ ├── name text not null +│ │ ├── age integer not null +│ │ ├── courses table: +│ │ ├── id integer primary key +│ │ ├── title text not null +│ │ └── credits integer not null +│ ├── 📄 test_Cargo.toml (119 tokens, 18 lines) +│ │ ├── name: test_cargo +│ │ ├── version: 0.1.0 +│ │ ├── description: A test Cargo.toml +│ │ ├── license: MIT OR Apache-2.0 +│ │ ├── dependencies: +│ │ ├── clap 4.4 +│ │ └── sqlx 0.7 (features: runtime-tokio, tls-rustls) +│ ├── 📄 test_json_rpc_2_0.json (26 tokens, 6 lines) +│ │ ├── jsonrpc: 2.0 +│ │ ├── method: subtract +│ │ ├── params: +│ │ ├── minuend: 42 +│ │ ├── subtrahend: 23 +│ │ └── id: 1 +│ ├── 📄 test_openapi.yaml (753 tokens, 92 lines) +│ │ ├── openapi: 3.0.1 +│ │ ├── title: TODO Plugin +│ │ ├── description: A plugin to create and manage TODO lists using +│ │ │ ChatGPT. 
+│ │ ├── version: v1 +│ │ ├── servers: +│ │ ├── - url: PLUGIN_HOSTNAME +│ │ ├── paths: +│ │ ├── '/todos/{username}': +│ │ ├── GET (getTodos): Get the list of todos +│ │ ├── POST (addTodo): Add a todo to the list +│ │ └── DELETE (deleteTodo): Delete a todo from the list +│ ├── 📄 test_openrpc.json (225 tokens, 44 lines) +│ │ ├── openrpc: 1.2.1 +│ │ ├── info: +│ │ ├── title: Demo Petstore +│ │ ├── version: 1.0.0 +│ │ ├── methods: +│ │ ├── listPets: List all pets +│ │ ├── params: +│ │ ├── - limit: integer +│ │ └── result: pets = An array of pets +│ └── 📄 test_pyproject.toml (304 tokens, 39 lines) +│ ├── name: tree_plus +│ ├── version: 1.0.8 +│ ├── description: A `tree` util enhanced with tokens, lines, and +│ │ components. +│ ├── License :: OSI Approved :: Apache Software License +│ ├── License :: OSI Approved :: MIT License +│ ├── dependencies: +│ ├── tiktoken +│ ├── PyYAML +│ ├── click +│ ├── rich +│ └── tomli +├── 📁 group4 (1 folder, 10 files) +│ ├── 📄 erl_test.erl (480 tokens, 68 lines) +│ │ ├── -module(erl_test). +│ │ ├── -record(person). +│ │ ├── -type ra_peer_status(). +│ │ ├── -type ra_membership(). +│ │ ├── -opaque my_opaq_type(). +│ │ ├── -type orddict(Key, Val). +│ │ ├── -type edge( +│ │ │ Cases, +│ │ │ Pwn, +│ │ │ ). +│ │ ├── -spec guarded(X) -> X when X :: tuple(). +│ │ ├── -spec edge_case( +│ │ │ {integer(), any()} | [any()] +│ │ │ ) -> processed, integer(), any()} | [{item, any()}]. +│ │ ├── -spec complex_function({integer(), any()} | [any()]) -> +│ │ │ {processed, integer(), any()} | [{item, any()}]. +│ │ ├── -spec list_manipulation([integer()]) -> [integer()]. +│ │ ├── -spec overload(T1, T2) -> T3 +│ │ │ ; (T4, T5) -> T6. +│ │ ├── -spec multiguard({X, integer()}) -> X when X :: atom() +│ │ │ ; ([Y]) -> Y when Y :: number(). +│ │ ├── -record(multiline). +│ │ └── -record(maybe_undefined). +│ ├── 📄 haskell_test.hs (414 tokens, 41 lines) +│ │ ├── data Person +│ │ ├── greet :: Person -> String +│ │ └── resolveVariables :: +│ │ forall m fragments. 
+│ │ (MonadError QErr m, Traversable fragments) => +│ │ Options.BackwardsCompatibleNullInNonNullableVariables -> +│ │ [G.VariableDefinition] -> +│ │ GH.VariableValues -> +│ │ [G.Directive G.Name] -> +│ │ G.SelectionSet fragments G.Name -> +│ │ m +│ │ ( [G.Directive Variable], +│ │ G.SelectionSet fragments Variable +│ │ ) +│ ├── 📄 mathematica_test.nb (133 tokens, 21 lines) +│ │ ├── person[name_] +│ │ ├── sayHello[] +│ │ └── sumList[list_List] +│ ├── 📄 matlab_test.m (48 tokens, 12 lines) +│ │ ├── classdef HelloWorld -> function greet +│ │ └── function loneFun +│ ├── 📄 RTest.R (367 tokens, 46 lines) +│ │ ├── class(person) +│ │ ├── greet.Person <- function +│ │ ├── ensure_between = function +│ │ └── run_intermediate_annealing_process = function +│ ├── 📄 rust_test.rs (1,368 tokens, 259 lines) +│ │ ├── fn at_beginning<'a>(&'a str) +│ │ ├── pub enum Days { +│ │ │ #\[default] +│ │ │ Sun, +│ │ │ Mon, +│ │ │ #\[error("edge case {idx}, expected at least {} and at most {}", +│ │ │ .limits.lo, .limits.hi)] +│ │ │ Tue, +│ │ │ Wed, +│ │ │ Thu(i16, bool), +│ │ │ Fri { day: u8 }, +│ │ │ Sat { +│ │ │ urday: String, +│ │ │ edge_case: E, +│ │ │ }, +│ │ │ } +│ │ ├── struct Point +│ │ ├── impl Point +│ │ ├── fn get_origin() -> Point +│ │ ├── struct Person +│ │ ├── impl Person +│ │ ├── fn greet(&self) +│ │ ├── fn add_two_longs(x1: i64, x2: i64) -> i64 +│ │ ├── fn add_two_longs_longer( +│ │ │ x1: i64, +│ │ │ x2: i64, +│ │ │ ) -> i64 +│ │ ├── const fn multiply_by_two(num: f64) -> f64 +│ │ ├── fn get_first_character(s: &str) -> Option +│ │ ├── trait Drawable +│ │ ├── fn draw(&self) +│ │ ├── impl Drawable for Point +│ │ ├── fn draw(&self) +│ │ ├── fn with_generic(d: D) +│ │ ├── fn with_generic(d: D) +│ │ │ where +│ │ │ D: Drawable +│ │ ├── fn main() +│ │ ├── pub struct VisibleStruct +│ │ ├── mod my_module +│ │ ├── pub struct AlsoVisibleStruct(T, T) +│ │ ├── macro_rules! say_hello +│ │ ├── #[macro_export] +│ │ │ macro_rules! 
hello_tree_plus +│ │ ├── pub mod lib +│ │ ├── pub mod interfaces +│ │ ├── mod engine +│ │ ├── pub fn flow( +│ │ │ source: S1, +│ │ │ extractor: E, +│ │ │ inbox: S2, +│ │ │ transformer: T, +│ │ │ outbox: S3, +│ │ │ loader: L, +│ │ │ sink: &mut S4, +│ │ │ ) -> Result<(), Box> +│ │ │ where +│ │ │ S1: Extractable, +│ │ │ S2: Extractable + Loadable, +│ │ │ S3: Extractable + Loadable, +│ │ │ S4: Loadable, +│ │ │ E: Extractor, +│ │ │ T: Transformer, +│ │ │ L: Loader +│ │ ├── trait Container +│ │ ├── fn items(&self) -> impl Iterator +│ │ ├── trait HttpService +│ │ ├── async fn fetch(&self, url: Url) -> HtmlBody +│ │ ├── struct Pair +│ │ ├── trait Transformer +│ │ ├── fn transform(&self, input: T) -> T +│ │ ├── impl + Copy> Transformer for Pair +│ │ ├── fn transform(&self, input: T) -> T +│ │ ├── fn main() +│ │ ├── async fn handle_get(State(pool): State) -> +│ │ │ Result, (StatusCode, String)> +│ │ │ where +│ │ │ Bion: Cool +│ │ ├── #[macro_export] +│ │ │ macro_rules! unit +│ │ ├── fn insert( +│ │ │ &mut self, +│ │ │ key: (), +│ │ │ value: $unit_dtype, +│ │ │ ) -> Result, ETLError> +│ │ ├── pub async fn handle_get_axum_route( +│ │ │ Session { maybe_claims }: Session, +│ │ │ Path(RouteParams { +│ │ │ alpha, +│ │ │ bravo, +│ │ │ charlie, +│ │ │ edge_case +│ │ │ }): Path, +│ │ │ ) -> ServerResult +│ │ ├── fn encode_pipeline(cmds: &[Cmd], atomic: bool) -> Vec +│ │ ├── pub async fn handle_post_yeet( +│ │ │ State(auth_backend): State, +│ │ │ Session { maybe_claims }: Session, +│ │ │ Form(yeet_form): Form, +│ │ │ ) -> Result +│ │ └── pub async fn handle_get_thingy( +│ │ session: Session, +│ │ State(ApiBackend { +│ │ page_cache, +│ │ auth_backend, +│ │ library_sql, +│ │ some_data_cache, +│ │ metadata_cache, +│ │ thingy_client, +│ │ .. 
+│ │ }): State, +│ │ ) -> ServerResult +│ ├── 📄 test.zig (397 tokens, 60 lines) +│ │ ├── pub fn add(a: i32, b: i32) i32 +│ │ ├── test "add function" +│ │ ├── const BunBuildOptions = struct +│ │ ├── pub fn updateRuntime(this: *BunBuildOptions) anyerror!void +│ │ ├── pub fn step(this: BunBuildOptions, b: anytype) +│ │ │ *std.build.OptionsStep +│ │ └── pub fn sgemv( +│ │ order: Order, +│ │ trans: Trans, +│ │ m: usize, +│ │ n: usize, +│ │ alpha: f32, +│ │ a: []const f32, +│ │ lda: usize, +│ │ x: []const f32, +│ │ x_add: usize, +│ │ beta: f32, +│ │ y: []f32, +│ │ y_add: usize, +│ │ ) void +│ ├── 📄 test_fsharp.fs (92 tokens, 27 lines) +│ │ ├── module TestFSharp +│ │ ├── type Person = { +│ │ ├── let add x y = +│ │ ├── let multiply +│ │ │ (x: int) +│ │ │ (y: int): int = +│ │ ├── let complexFunction +│ │ │ (a: int) +│ │ │ (b: string) +│ │ │ (c: float) +│ │ │ : (int * string) option = +│ │ └── type Result<'T> = +│ ├── 📄 test_tcl_tk.tcl (54 tokens, 16 lines) +│ │ ├── proc sayHello {} +│ │ ├── proc arrg { input } +│ │ └── proc multiLine { +│ │ x, +│ │ y +│ │ } +│ └── 📄 tf_test.tf (202 tokens, 38 lines) +│ ├── provider "aws" +│ ├── resource "aws_instance" "example" +│ ├── data "aws_ami" "ubuntu" +│ ├── variable "instance_type" +│ ├── output "instance_public_ip" +│ ├── locals +│ └── module "vpc" +├── 📁 group5 (1 folder, 19 files) +│ ├── 📄 ansible_test.yml (55 tokens, 14 lines) +│ │ ├── Install package +│ │ ├── Start service +│ │ └── Create user +│ ├── 📄 app-routing.module.ts (287 tokens, 28 lines) +│ │ ├── const routes: Routes = [ +│ │ │ { path: '', redirectTo: 'login', pathMatch: 'full' }, +│ │ │ { path: '*', redirectTo: 'login' }, +│ │ │ { path: 'home', component: HomeComponent }, +│ │ │ { path: 'login', component: LoginComponent }, +│ │ │ { path: 'register', component: RegisterComponent }, +│ │ │ { path: 'events', component: EventsComponent }, +│ │ │ { path: 'invites', component: InvitesComponent }, +│ │ │ { path: 'rewards', component: RewardsComponent }, +│ │ │ { path: 
'profile', component: ProfileComponent }, +│ │ │ ]; +│ │ └── export class AppRoutingModule +│ ├── 📄 app.component.spec.ts (410 tokens, 47 lines) +│ │ ├── describe 'AppComponent' +│ │ ├── it should create the app +│ │ ├── it should welcome the user +│ │ ├── it should welcome 'Jimbo' +│ │ └── it should request login if not logged in +│ ├── 📄 app.component.ts (271 tokens, 45 lines) +│ │ ├── export class AppComponent +│ │ ├── constructor( +│ │ │ private http: HttpClient, +│ │ │ private loginService: LoginService, +│ │ │ private stripeService: StripeService +│ │ │ ) +│ │ ├── constructor(private loginService: LoginService) +│ │ ├── checkSession() +│ │ ├── async goToEvent(event_id: string) +│ │ └── valInvitedBy(event: any, event_id: string) +│ ├── 📄 app.module.ts (374 tokens, 43 lines) +│ │ ├── @NgModule({ +│ │ │ declarations: [ +│ │ │ AppComponent, +│ │ │ HomeComponent, +│ │ │ LoginComponent, +│ │ │ RegisterComponent, +│ │ │ EventsComponent, +│ │ │ InvitesComponent, +│ │ │ RewardsComponent, +│ │ │ ProfileComponent +│ │ └── export class AppModule +│ ├── 📄 checkbox_test.md (191 tokens, 29 lines) +│ │ ├── # My Checkbox Test +│ │ ├── ## My No Parens Test +│ │ ├── ## My Empty href Test +│ │ ├── ## My other url Test [Q&A] +│ │ ├── ## My other other url Test [Q&A] +│ │ ├── ## My 2nd other url Test [Q&A] +│ │ ├── ## My 3rd other url Test [Q&A] +│ │ ├── - [ ] Task 1 +│ │ ├── - [ ] No Space Task 1.1 +│ │ ├── - [ ] Two Spaces Task 1.2 +│ │ ├── - [ ] Subtask 1.2.1 +│ │ ├── - [ ] Task 2 +│ │ ├── - [x] Task 3 +│ │ ├── - [ ] Subtask 3.1 +│ │ ├── - [x] Task 6 +│ │ ├── - [x] Subtask 6.1 +│ │ ├── - [ ] Handle edge cases +│ │ └── # My Codeblock Test +│ ├── 📄 checkbox_test.txt (257 tokens, 33 lines) +│ │ ├── - [ ] fix phone number format +1 +│ │ ├── - [ ] add forgot password +│ │ ├── - [ ] ? 
add email verification +│ │ ├── - [ ] store token the right way +│ │ ├── - [ ] test nesting of checkboxes +│ │ ├── - [ ] user can use option to buy ticket at 2-referred price +│ │ ├── - [ ] CTA refer 2 people to get instant lower price +│ │ └── - [ ] form to send referrals +│ ├── 📄 environment.test.ts (197 tokens, 19 lines) +│ │ ├── environment: +│ │ ├── production +│ │ ├── cognitoUserPoolId +│ │ ├── cognitoAppClientId +│ │ └── apiurl +│ ├── 📄 hello_world.pyi (22 tokens, 3 lines) +│ │ ├── @final +│ │ │ class dtype(Generic[_DTypeScalar_co]) +│ │ └── names: None | tuple[builtins.str, ...] +│ ├── 📄 k8s_test.yaml (140 tokens, 37 lines) +│ │ ├── apps/v1.Deployment -> my-app +│ │ ├── v1.Service -> my-service +│ │ └── v1.ConfigMap -> my-config +│ ├── 📄 Makefile (714 tokens, 84 lines) +│ │ ├── include dotenv/dev.env +│ │ ├── .PHONY: dev +│ │ ├── dev +│ │ ├── services-down +│ │ ├── services-stop: services-down +│ │ ├── define CHECK_POSTGRES +│ │ ├── damage-report +│ │ ├── tail-logs +│ │ └── cloud +│ ├── 📄 requirements_test.txt (29 tokens, 10 lines) +│ │ ├── psycopg2-binary +│ │ ├── pytest +│ │ ├── coverage +│ │ ├── flask[async] +│ │ ├── flask_cors +│ │ ├── stripe +│ │ ├── pyjwt[crypto] +│ │ ├── cognitojwt[async] +│ │ └── flask-lambda +│ ├── 📄 rust_todo_test.rs (92 tokens, 26 lines) +│ │ ├── TODO: This todo tests parse_todo +│ │ ├── enum Color { +│ │ │ Red, +│ │ │ Blue, +│ │ │ Green, +│ │ │ } +│ │ ├── struct Point +│ │ ├── trait Drawable +│ │ ├── fn draw(&self) +│ │ ├── impl Drawable for Point +│ │ ├── fn draw(&self) +│ │ └── fn main() +│ ├── 📄 sql_test.sql (270 tokens, 51 lines) +│ │ ├── CREATE TABLE promoters +│ │ ├── user_id serial PRIMARY KEY, +│ │ ├── type varchar(20) NOT NULL, +│ │ ├── username varchar(20) NOT NULL, +│ │ ├── password varchar(20) NOT NULL, +│ │ ├── email varchar(30) NOT NULL, +│ │ ├── phone varchar(20) NOT NULL, +│ │ ├── promocode varchar(20), +│ │ ├── info json, +│ │ ├── going text[], +│ │ ├── invites text[], +│ │ ├── balance integer NOT NULL, +│ │ ├── 
rewards text[], +│ │ ├── created timestamp +│ │ ├── CREATE TABLE events +│ │ ├── event_id serial PRIMARY KEY, +│ │ ├── name varchar(64) NOT NULL, +│ │ ├── date varchar(64) NOT NULL, +│ │ ├── location varchar(64) NOT NULL, +│ │ ├── performer varchar(64) NOT NULL, +│ │ ├── rewards json, +│ │ └── created timestamp +│ ├── 📄 standard-app-routing.module.ts (100 tokens, 16 lines) +│ │ └── const routes: Routes = [ +│ │ { path: '', component: HomeComponent }, +│ │ { +│ │ path: 'heroes', +│ │ component: HeroesListComponent, +│ │ children: [ +│ │ { path: ':id', component: HeroDetailComponent }, +│ │ { path: 'new', component: HeroFormComponent }, +│ │ ], +│ │ }, +│ │ { path: '**', component: PageNotFoundComponent }, +│ │ ]; +│ ├── 📄 test.env (190 tokens, 25 lines) +│ │ ├── PROMO_PATH +│ │ ├── PRODUCTION +│ │ ├── SQL_SCHEMA_PATH +│ │ ├── DB_LOGS +│ │ ├── DB_LOG +│ │ ├── PGPASSWORD +│ │ ├── PGDATABASE +│ │ ├── PGHOST +│ │ ├── PGPORT +│ │ ├── PGUSER +│ │ ├── SERVER_LOG +│ │ ├── SERVER_LOGS +│ │ ├── API_URL +│ │ ├── APP_LOGS +│ │ ├── APP_LOG +│ │ ├── APP_URL +│ │ ├── COGNITO_USER_POOL_ID +│ │ ├── COGNITO_APP_CLIENT_ID +│ │ ├── AWS_REGION +│ │ └── STRIPE_SECRET_KEY +│ ├── 📄 testJsonSchema.json (421 tokens, 48 lines) +│ │ ├── $schema: http://json-schema.org/draft-07/schema# +│ │ ├── type: object +│ │ ├── title: random_test +│ │ └── description: A promoter's activites related to events +│ ├── 📄 testPackage.json (349 tokens, 43 lines) +│ │ ├── name: 'promo-app' +│ │ ├── version: 0.0.0 +│ │ ├── scripts: +│ │ ├── ng: 'ng' +│ │ ├── start: 'ng serve' +│ │ ├── build: 'ng build' +│ │ ├── watch: 'ng build --watch --configuration development' +│ │ └── test: 'ng test' +│ └── 📄 tickets.component.ts (7,160 tokens, 903 lines) +│ ├── interface EnrichedTicket extends Ticket +│ ├── interface SpinConfig +│ ├── interface RotationState +│ ├── interface SpeakInput +│ ├── const formatSpeakInput = (input: SpeakInput): string => +│ ├── function hourToSpeech(hour: number, minute: number, period: string): +│ 
│ string +│ ├── export class TicketsComponent implements AfterViewInit +│ ├── speak(input: SpeakInput) +│ ├── speakEvent(ticket: EnrichedTicket): void +│ ├── formatEvent(ticket: EnrichedTicket): string +│ ├── speakVenue(ticket: EnrichedTicket): void +│ ├── formatDate(date: Date, oneLiner: boolean = false): string +│ ├── formatDateForSpeech(date: Date): string +│ ├── async spinQRCode( +│ │ event: PointerEvent, +│ │ config: SpinConfig = DEFAULT_SPIN_CONFIG +│ │ ) +│ ├── private animateRotation( +│ │ imgElement: HTMLElement, +│ │ targetRotation: number, +│ │ config: SpinConfig, +│ │ cleanup: () => void +│ │ ) +│ ├── const animate = (currentTime: number) => +│ ├── requestAnimationFrame(animate) +│ ├── cleanup() +│ ├── requestAnimationFrame(animate) +│ ├── private getNext90Degree(currentRotation: number): number +│ ├── private getCurrentRotation(matrix: string): number +│ ├── ngAfterViewInit() +│ ├── const mouseEnterListener = () => +│ ├── const mouseLeaveListener = () => +│ ├── ngOnDestroy() +│ ├── toggleColumn(event: MatOptionSelectionChange, column: string) +│ ├── adjustColumns(event?: Event) +│ ├── onResize(event: Event) +│ ├── async ngOnInit() +│ ├── async loadTickets(): Promise +│ ├── onDateRangeChange( +│ │ type: "start" | "end", +│ │ event: MatDatepickerInputEvent +│ │ ) +│ ├── applyFilter(column: string): void +│ ├── formatDateForComparison(date: Date): string +│ ├── constructor(private renderer: Renderer2) +│ ├── onFilterChange(event: Event, column: string) +│ ├── onLatitudeChange(event: Event) +│ ├── onLongitudeChange(event: Event) +│ ├── onRadiusChange(event: Event) +│ ├── sortData(sort: Sort): void +│ ├── onRowClick(event: Event, row: any) +│ ├── function isDate(value: Date | undefined | null): value is Date +│ ├── function isNonNullNumber(value: number | null): value is number +│ ├── function hasLocation( +│ │ ticket: any +│ │ ): ticket is +│ ├── const create_faker_ticket = async () => +│ ├── function compare(a: number | string, b: number | string, isAsc: 
+│ │ boolean) +│ ├── function compare_dates(a: Date, b: Date, isAsc: boolean) +│ ├── async function mockMoreTickets(): Promise +│ ├── const mockTickets = async () => +│ └── const renderQRCode = async (text: String): Promise => +├── 📁 group6 (1 folder, 14 files) +│ ├── 📄 catastrophic.c (5,339 tokens, 754 lines) +│ │ ├── TODO: technically we should use a proper parser +│ │ ├── struct Point +│ │ ├── int x; +│ │ ├── int y; +│ │ ├── struct Point getOrigin() +│ │ ├── float mul_two_floats(float x1, float x2) +│ │ ├── enum days +│ │ ├── SUN, +│ │ ├── MON, +│ │ ├── TUE, +│ │ ├── WED, +│ │ ├── THU, +│ │ ├── FRI, +│ │ ├── SAT +│ │ ├── enum worker_pool_flags +│ │ ├── POOL_BH = 1 << 0, +│ │ ├── POOL_MANAGER_ACTIVE = 1 << 1, +│ │ ├── POOL_DISASSOCIATED = 1 << 2, +│ │ ├── POOL_BH_DRAINING = 1 << 3, +│ │ ├── enum worker_flags +│ │ ├── WORKER_DIE = 1 << 1, +│ │ ├── WORKER_IDLE = 1 << 2, +│ │ ├── WORKER_PREP = 1 << 3, +│ │ ├── WORKER_CPU_INTENSIVE = 1 << 6, +│ │ ├── WORKER_UNBOUND = 1 << 7, +│ │ ├── WORKER_REBOUND = 1 << 8, +│ │ ├── WORKER_NOT_RUNNING = WORKER_PREP | WORKER_CPU_INTENSIVE +│ │ │ | +│ │ │ WORKER_UNBOUND | WORKER_REBOUND, +│ │ ├── struct worker_pool +│ │ ├── raw_spinlock_t lock; +│ │ ├── int cpu; +│ │ ├── int node; +│ │ ├── int id; +│ │ ├── unsigned int flags; +│ │ ├── unsigned long watchdog_ts; +│ │ ├── bool cpu_stall; +│ │ ├── int nr_running; +│ │ ├── struct list_head worklist; +│ │ ├── int nr_workers; +│ │ ├── int nr_idle; +│ │ ├── struct list_head idle_list; +│ │ ├── struct timer_list idle_timer; +│ │ ├── struct work_struct idle_cull_work; +│ │ ├── struct timer_list mayday_timer; +│ │ ├── struct worker *manager; +│ │ ├── struct list_head workers; +│ │ ├── struct ida worker_ida; +│ │ ├── struct workqueue_attrs *attrs; +│ │ ├── struct hlist_node hash_node; +│ │ ├── int refcnt; +│ │ ├── struct rcu_head rcu; +│ │ ├── long add_two_longs(long x1, long x2) +│ │ ├── double multiplyByTwo(double num) +│ │ ├── char getFirstCharacter(char *str) +│ │ ├── void greet(Person p) +│ 
│ ├── typedef struct +│ │ ├── char name[50]; +│ │ ├── } Person; +│ │ ├── typedef struct PersonA +│ │ ├── char name[50]; +│ │ ├── } PersonB; +│ │ ├── int main() +│ │ ├── int* getArrayStart(int arr[], int size) +│ │ ├── long complexFunctionWithMultipleArguments( +│ │ │ int param1, +│ │ │ double param2, +│ │ │ char *param3, +│ │ │ struct Point point +│ │ │ ) +│ │ ├── keyPattern *ACLKeyPatternCreate(sds pattern, int flags) +│ │ ├── sds sdsCatPatternString(sds base, keyPattern *pat) +│ │ ├── static int ACLCheckChannelAgainstList(list *reference, const char +│ │ │ *channel, int channellen, int is_pattern) +│ │ ├── while((ln = listNext(&li))) +│ │ ├── static struct config +│ │ ├── aeEventLoop *el; +│ │ ├── cliConnInfo conn_info; +│ │ ├── const char *hostsocket; +│ │ ├── int tls; +│ │ ├── struct cliSSLconfig sslconfig; +│ │ ├── } config; +│ │ ├── class Person +│ │ ├── std::string name; +│ │ ├── public: +│ │ ├── Person(std::string n) : name(n) +│ │ ├── void greet() +│ │ ├── void globalGreet() +│ │ ├── int main() +│ │ ├── void printMessage(const std::string &message) +│ │ ├── template +│ │ │ void printVector(const std::vector& vec) +│ │ ├── struct foo +│ │ ├── char x; +│ │ ├── struct foo_in +│ │ ├── char* y; +│ │ ├── short z; +│ │ ├── } inner; +│ │ ├── struct Point +│ │ ├── int x, y; +│ │ ├── Point(int x, int y) : x(x), y(y) +│ │ ├── class Animal +│ │ ├── public: +│ │ ├── Animal(const std::string &name) : name(name) +│ │ ├── virtual void speak() const +│ │ ├── virtual ~Animal() +│ │ ├── protected: +│ │ ├── std::string name; +│ │ ├── class Dog : public Animal +│ │ ├── public: +│ │ ├── Dog(const std::string &name) : Animal(name) +│ │ ├── void speak() const override +│ │ ├── class Cat : public Animal +│ │ ├── public: +│ │ ├── Cat(const std::string &name) : Animal(name) +│ │ ├── void speak() const override +│ │ ├── class CatDog: public Animal, public Cat, public Dog +│ │ ├── public: +│ │ ├── CatDog(const std::string &name) : Animal(name) +│ │ ├── int meow_bark() +│ │ ├── 
nb::bytes BuildRnnDescriptor(int input_size, int hidden_size, int +│ │ │ num_layers, +│ │ │ int batch_size, int max_seq_length, +│ │ │ float dropout, +│ │ │ bool bidirectional, bool +│ │ │ cudnn_allow_tf32, +│ │ │ int workspace_size, int +│ │ │ reserve_space_size) +│ │ ├── int main() +│ │ ├── enum ECarTypes +│ │ ├── Sedan, +│ │ ├── Hatchback, +│ │ ├── SUV, +│ │ ├── Wagon +│ │ ├── ECarTypes GetPreferredCarType() +│ │ ├── enum ECarTypes : uint8_t +│ │ ├── Sedan, +│ │ ├── Hatchback, +│ │ ├── SUV = 254, +│ │ ├── Hybrid +│ │ ├── enum class ECarTypes : uint8_t +│ │ ├── Sedan, +│ │ ├── Hatchback, +│ │ ├── SUV = 254, +│ │ ├── Hybrid +│ │ ├── void myFunction(string fname, int age) +│ │ ├── template T cos(T) +│ │ ├── template T sin(T) +│ │ ├── template T sqrt(T) +│ │ ├── template struct VLEN +│ │ ├── template class arr +│ │ ├── private: +│ │ ├── static T *ralloc(size_t num) +│ │ ├── static void dealloc(T *ptr) +│ │ ├── static T *ralloc(size_t num) +│ │ ├── static void dealloc(T *ptr) +│ │ ├── public: +│ │ ├── arr() : p(0), sz(0) +│ │ ├── arr(size_t n) : p(ralloc(n)), sz(n) +│ │ ├── arr(arr &&other) +│ │ │ : p(other.p), sz(other.sz) +│ │ ├── ~arr() +│ │ ├── void resize(size_t n) +│ │ ├── T &operator[](size_t idx) +│ │ ├── T *data() +│ │ ├── size_t size() const +│ │ ├── class Buffer +│ │ ├── private: +│ │ ├── void* ptr_; +│ │ ├── std::tuple quantize( +│ │ │ const array& w, +│ │ │ int group_size, +│ │ │ int bits, +│ │ │ StreamOrDevice s) +│ │ ├── #define PY_SSIZE_T_CLEAN +│ │ ├── #define PLATFORM_IS_X86 +│ │ ├── #define PLATFORM_WINDOWS +│ │ ├── #define GETCPUID(a, b, c, d, a_inp, c_inp) +│ │ ├── static int GetXCR0EAX() +│ │ ├── #define GETCPUID(a, b, c, d, a_inp, c_inp) +│ │ ├── static int GetXCR0EAX() +│ │ ├── asm("XGETBV" : "=a"(eax), "=d"(edx) : "c"(0)) +│ │ ├── static void ReportMissingCpuFeature(const char* name) +│ │ ├── static PyObject *CheckCpuFeatures(PyObject *self, PyObject *args) +│ │ ├── static PyObject *CheckCpuFeatures(PyObject *self, PyObject *args) +│ │ ├── 
static PyMethodDef cpu_feature_guard_methods[] +│ │ ├── static struct PyModuleDef cpu_feature_guard_module +│ │ ├── #define EXPORT_SYMBOL __declspec(dllexport) +│ │ ├── #define EXPORT_SYMBOL __attribute__ ((visibility("default"))) +│ │ ├── EXPORT_SYMBOL PyMODINIT_FUNC PyInit_cpu_feature_guard(void) +│ │ ├── typedef struct +│ │ ├── GPT2Config config; +│ │ ├── ParameterTensors params; +│ │ ├── size_t param_sizes[NUM_PARAMETER_TENSORS]; +│ │ ├── float* params_memory; +│ │ ├── size_t num_parameters; +│ │ ├── ParameterTensors grads; +│ │ ├── float* grads_memory; +│ │ ├── float* m_memory; +│ │ ├── float* v_memory; +│ │ ├── ActivationTensors acts; +│ │ ├── size_t act_sizes[NUM_ACTIVATION_TENSORS]; +│ │ ├── float* acts_memory; +│ │ ├── size_t num_activations; +│ │ ├── ActivationTensors grads_acts; +│ │ ├── float* grads_acts_memory; +│ │ ├── int batch_size; +│ │ ├── int seq_len; +│ │ ├── int* inputs; +│ │ ├── int* targets; +│ │ ├── float mean_loss; +│ │ └── } GPT2; +│ ├── 📄 cpp_examples_impl.cc (60 tokens, 10 lines) +│ │ ├── PYBIND11_MODULE(cpp_examples, m) +│ │ └── m.def("add", &add, "An example function to add two +│ │ numbers.") +│ ├── 📄 cpp_examples_impl.cu (37 tokens, 10 lines) +│ │ ├── template +│ │ │ T add(T a, T b) +│ │ └── template <> +│ │ int add(int a, int b) +│ ├── 📄 cpp_examples_impl.h (22 tokens, 6 lines) +│ │ ├── template +│ │ │ T add(T a, T b) +│ │ └── template <> +│ │ int add(int, int) +│ ├── 📄 edge_case.hpp (426 tokens, 28 lines) +│ ├── 📄 fractal.thy (1,712 tokens, 147 lines) +│ │ ├── Title: fractal.thy +│ │ ├── Author: Isabelle/HOL Contributors! 
+│ │ ├── Author: edge cases r us +│ │ ├── theory Simplified_Ring +│ │ ├── section ‹Basic Algebraic Structures› +│ │ ├── class everything = nothing + itself +│ │ ├── subsection ‹Monoids› +│ │ ├── definition ring_hom :: "[('a, 'm) ring_scheme, ('b, 'n) ring_scheme] +│ │ │ => ('a => 'b) set" +│ │ ├── fun example_fun :: "nat ⇒ nat" +│ │ ├── locale monoid = +│ │ │ fixes G (structure) +│ │ │ assumes m_closed: "⟦x ∈ carrier G; y ∈ carrier G⟧ ⟹ x ⊗ y ∈ +│ │ │ carrier G" +│ │ │ and m_assoc: "⟦x ∈ carrier G; y ∈ carrier G; z ∈ carrier G⟧ ⟹ +│ │ │ (x ⊗ y) ⊗ z = x ⊗ (y ⊗ z)" +│ │ │ and one_closed: "𝟭 ∈ carrier G" +│ │ │ and l_one: "x ∈ carrier G ⟹ 𝟭 ⊗ x = x" +│ │ │ and r_one: "x ∈ carrier G ⟹ x ⊗ 𝟭 = x" +│ │ ├── subsection ‹Groups› +│ │ ├── locale group = monoid + +│ │ │ assumes Units_closed: "x ∈ Units G ⟹ x ∈ carrier G" +│ │ │ and l_inv_ex: "x ∈ carrier G ⟹ ∃ y ∈ carrier G. y ⊗ x = 𝟭" +│ │ │ and r_inv_ex: "x ∈ carrier G ⟹ ∃ y ∈ carrier G. x ⊗ y = 𝟭" +│ │ ├── subsection ‹Rings› +│ │ ├── locale ring = abelian_group R + monoid R + +│ │ │ assumes l_distr: "⟦x ∈ carrier R; y ∈ carrier R; z ∈ carrier R⟧ ⟹ +│ │ │ (x ⊕ y) ⊗ z = x ⊗ z ⊕ y ⊗ z" +│ │ │ and r_distr: "⟦x ∈ carrier R; y ∈ carrier R; z ∈ carrier R⟧ ⟹ z +│ │ │ ⊗ (x ⊕ y) = z ⊗ x ⊕ z ⊗ y" +│ │ ├── locale commutative_ring = ring + +│ │ │ assumes m_commutative: "⟦x ∈ carrier R; y ∈ carrier R⟧ ⟹ x ⊗ y = +│ │ │ y ⊗ x" +│ │ ├── locale domain = commutative_ring + +│ │ │ assumes no_zero_divisors: "⟦a ⊗ b = 𝟬; a ∈ carrier R; b ∈ carrier +│ │ │ R⟧ ⟹ a = 𝟬 ∨ b = 𝟬" +│ │ ├── locale field = domain + +│ │ │ assumes inv_ex: "x ∈ carrier R - {𝟬} ⟹ inv x ∈ carrier R" +│ │ ├── subsection ‹Morphisms› +│ │ ├── lemma example_lemma: "example_fun n = n" +│ │ ├── qualified lemma gcd_0: +│ │ │ "gcd a 0 = normalize a" +│ │ ├── lemma abelian_monoidI: +│ │ │ fixes R (structure) +│ │ │ and f :: "'edge::{} ⇒ 'case::{}" +│ │ │ assumes "⋀x y. ⟦ x ∈ carrier R; y ∈ carrier R ⟧ ⟹ x ⊕ y ∈ carrier +│ │ │ R" +│ │ │ and "𝟬 ∈ carrier R" +│ │ │ and "⋀x y z. 
⟦ x ∈ carrier R; y ∈ carrier R; z ∈ carrier R ⟧ ⟹ +│ │ │ (x ⊕ y) ⊕ z = x ⊕ (y ⊕ z)" +│ │ │ shows "abelian_monoid R" +│ │ ├── lemma euclidean_size_gcd_le1 [simp]: +│ │ │ assumes "a ≠ 0" +│ │ │ shows "euclidean_size (gcd a b) ≤ euclidean_size a" +│ │ ├── theorem Residue_theorem: +│ │ │ fixes S pts::"complex set" and f::"complex ⇒ complex" +│ │ │ and g::"real ⇒ complex" +│ │ │ assumes "open S" "connected S" "finite pts" and +│ │ │ holo:"f holomorphic_on S-pts" and +│ │ │ "valid_path g" and +│ │ │ loop:"pathfinish g = pathstart g" and +│ │ │ "path_image g ⊆ S-pts" and +│ │ │ homo:"∀z. (z ∉ S) ⟶ winding_number g z = 0" +│ │ │ shows "contour_integral g f = 2 * pi * 𝗂 *(∑p ∈ pts. +│ │ │ winding_number g p * residue f p)" +│ │ ├── corollary fps_coeff_residues_bigo': +│ │ │ fixes f :: "complex ⇒ complex" and r :: real +│ │ │ assumes exp: "f has_fps_expansion F" +│ │ │ assumes "open A" "connected A" "cball 0 r ⊆ A" "r > 0" +│ │ │ assumes "f holomorphic_on A - S" "S ⊆ ball 0 r" "finite S" "0 ∉ S" +│ │ │ assumes "eventually (λn. g n = -(∑z ∈ S. residue (λz. f z / z ^ +│ │ │ Suc n) z)) sequentially" +│ │ │ (is "eventually (λn. _ = -?g' n) _") +│ │ │ shows "(λn. fps_nth F n - g n) ∈ O(λn. 1 / r ^ n)" (is "(λn. 
?c +│ │ │ n - _) ∈ O(_)") +│ │ └── end +│ ├── 📄 Microsoft.PowerShell_profile.ps1 (3,346 tokens, 497 lines) +│ │ ├── function Log($message) +│ │ ├── function Remove-ChocolateyFromPath +│ │ ├── function Show-Profiles +│ │ ├── function Show-Path +│ │ ├── function Show-Error($err) +│ │ ├── function Get-ScoopPackagePath +│ │ ├── param( +│ │ │ [Parameter(Mandatory = $true)] +│ │ │ [string]$PackageName) +│ │ ├── function Check-Command +│ │ ├── param( +│ │ │ [Parameter(Mandatory = $true)] +│ │ │ [string]$Name) +│ │ ├── function Add-ToPath +│ │ ├── param( +│ │ │ [Parameter(Mandatory = $true)] +│ │ │ [string]$PathToAdd) +│ │ ├── function Install-Scoop +│ │ ├── function Scoop-Install +│ │ ├── param( +│ │ │ [Parameter(Mandatory = $true)] +│ │ │ [string]$Name, +│ │ │ [string]$PathToAdd) +│ │ ├── function Start-CondaEnv +│ │ ├── function Install-PipPackage +│ │ ├── param( +│ │ │ [Parameter(Mandatory = $true)] +│ │ │ [string]$PackageName) +│ │ ├── function Install-VSBuildTools +│ │ ├── function Install-Crate +│ │ ├── param( +│ │ │ [Parameter(Mandatory = $true)] +│ │ │ [string]$CrateName) +│ │ ├── function Get-ScoopVersion +│ │ ├── function Get-Version +│ │ ├── param( +│ │ │ [Parameter(Mandatory = $true)] +│ │ │ [string]$ExecutablePath, +│ │ │ [string]$ExecutableName) +│ │ ├── function Show-Requirements +│ │ ├── function Measure-Status +│ │ ├── param( +│ │ │ [Parameter(Mandatory = $true)] +│ │ │ [string]$Name) +│ │ ├── function Find-Profile +│ │ ├── function Edit-Profile +│ │ ├── function Set-Profile +│ │ └── function Show-Profile +│ ├── 📄 python_complex_class.py (10 tokens, 2 lines) +│ │ └── class Box(Space[NDArray[Any]]) +│ ├── 📄 ramda__cloneRegExp.js (173 tokens, 9 lines) +│ │ └── export default function _cloneRegExp(pattern) +│ ├── 📄 ramda_prop.js (646 tokens, 85 lines) +│ │ ├── /** +│ │ │ * Returns a function that when supplied an object returns the +│ │ │ indicated +│ │ │ * property of that object, if it exists. 
+│ │ │ * @category Object +│ │ │ * @typedefn Idx = String | Int | Symbol +│ │ │ * @sig Idx -> {s: a} -> a | Undefined +│ │ │ * @param {String|Number} p The property name or array index +│ │ │ * @param {Object} obj The object to query +│ │ │ * @return {*} The value at `obj.p`. +│ │ │ */ +│ │ │ var prop = _curry2(function prop(p, obj) +│ │ ├── /** +│ │ │ * Solves equations of the form a * x = b +│ │ │ * @param {{ +│ │ │ * z: number +│ │ │ * }} x +│ │ │ */ +│ │ │ function foo(x) +│ │ ├── /** +│ │ │ * Deconstructs an array field from the input documents to output a +│ │ │ document for each element. +│ │ │ * Each output document is the input document with the value of the +│ │ │ array field replaced by the element. +│ │ │ * @category Object +│ │ │ * @sig String -> {k: [v]} -> [{k: v}] +│ │ │ * @param {String} key The key to determine which property of the +│ │ │ object should be unwound. +│ │ │ * @param {Object} object The object containing the list to unwind +│ │ │ at the property named by the key. +│ │ │ * @return {List} A list of new objects, each having the given key +│ │ │ associated to an item from the unwound list. +│ │ │ */ +│ │ │ var unwind = _curry2(function(key, object) +│ │ └── return _map(function(item) +│ ├── 📄 tensorflow_flags.h (7,628 tokens, 668 lines) +│ │ ├── TF_DECLARE_FLAG('test_only_experiment_1') +│ │ ├── TF_DECLARE_FLAG('test_only_experiment_2') +│ │ ├── TF_DECLARE_FLAG('enable_nested_function_shape_inference'): +│ │ │ Allow ops such as tf.cond to invoke the ShapeRefiner on +│ │ │ their nested functions. +│ │ ├── TF_DECLARE_FLAG('enable_quantized_dtypes_training'): +│ │ │ Set quantized dtypes, like tf.qint8, to be trainable. +│ │ ├── TF_DECLARE_FLAG('graph_building_optimization'): +│ │ │ Optimize graph building for faster tf.function tracing. +│ │ ├── TF_DECLARE_FLAG('saved_model_fingerprinting'): +│ │ │ Add fingerprint to SavedModels. 
+│ │ ├── TF_DECLARE_FLAG('more_stack_traces'): +│ │ │ Enable experimental code that preserves and propagates graph +│ │ │ node stack traces in C++. +│ │ ├── TF_DECLARE_FLAG('publish_function_graphs'): +│ │ │ Enables the publication of partitioned function graphs via +│ │ │ StatsPublisherInterface. Disabling this flag can reduce memory +│ │ │ consumption. +│ │ ├── TF_DECLARE_FLAG('enable_aggressive_constant_replication'): +│ │ │ Replicate constants across CPU devices and even for local +│ │ │ CPUs within the same task if available. +│ │ ├── TF_DECLARE_FLAG('enable_colocation_key_propagation_in_while_op_lower +│ │ │ ing'): +│ │ │ If true, colocation key attributes for the ops will be +│ │ │ propagated during while op lowering to switch/merge ops. +│ │ ├── Flag('tf_xla_auto_jit'): +│ │ │ Control compilation of operators into XLA computations on +│ │ │ CPU and GPU devices. 0 = use ConfigProto setting; -1 = off; 1 = on +│ │ │ for things very likely to be improved; 2 = on for everything; +│ │ │ (experimental) fusible = only for Tensorflow operations that XLA +│ │ │ knows how to fuse. If set to single-gpu() then this resolves to +│ │ │ for single-GPU graphs (graphs that have at least one node placed +│ │ │ on a GPU and no more than one GPU is in use through the entire +│ │ │ graph) and 0 otherwise. Experimental. +│ │ ├── Flag('tf_xla_min_cluster_size'): +│ │ │ Minimum number of operators in an XLA compilation. Ignored +│ │ │ for operators placed on an XLA device or operators explicitly marked +│ │ │ for compilation. +│ │ ├── Flag('tf_xla_max_cluster_size'): +│ │ │ Maximum number of operators in an XLA compilation. +│ │ ├── Flag('tf_xla_cluster_exclude_ops'): +│ │ │ (experimental) Exclude the operations from auto-clustering. +│ │ │ If multiple, separate them with commas. Where, Some_other_ops. +│ │ ├── Flag('tf_xla_clustering_debug'): +│ │ │ Dump graphs during XLA compilation. 
+│ │ ├── Flag('tf_xla_cpu_global_jit'): +│ │ │ Enables global JIT compilation for CPU via SessionOptions. +│ │ ├── Flag('tf_xla_clustering_fuel'): +│ │ │ Places an artificial limit on the number of ops marked as +│ │ │ eligible for clustering. +│ │ ├── Flag('tf_xla_disable_deadness_safety_checks_for_debugging'): +│ │ │ Disable deadness related safety checks when clustering (this +│ │ │ is unsound). +│ │ ├── Flag('tf_xla_disable_resource_variable_safety_checks_for_debugging') +│ │ │ : +│ │ │ Disable resource variables related safety checks when +│ │ │ clustering (this is unsound). +│ │ ├── Flag('tf_xla_deterministic_cluster_names'): +│ │ │ Causes the function names assigned by auto clustering to be +│ │ │ deterministic from run to run. +│ │ ├── Flag('tf_xla_persistent_cache_directory'): +│ │ │ If non-empty, JIT-compiled executables are saved to and +│ │ │ loaded from the specified file system directory path. Empty by +│ │ │ default. +│ │ ├── Flag('tf_xla_persistent_cache_device_types'): +│ │ │ If non-empty, the persistent cache will only be used for the +│ │ │ specified devices (comma separated). Each device type should be able +│ │ │ to be converted to. +│ │ ├── Flag('tf_xla_persistent_cache_read_only'): +│ │ │ If true, the persistent cache will be read-only. +│ │ ├── Flag('tf_xla_disable_strict_signature_checks'): +│ │ │ If true, entires loaded into the XLA compile cache will not +│ │ │ have their signatures checked strictly. Defaults to false. +│ │ ├── Flag('tf_xla_persistent_cache_prefix'): +│ │ │ Specifies the persistance cache prefix. Default is. +│ │ ├── Flag('tf_xla_sparse_core_disable_table_stacking'): +│ │ │ Disable table stacking for all the tables passed to the +│ │ │ SparseCore mid level API. +│ │ ├── Flag('tf_xla_sparse_core_minibatch_max_division_level'): +│ │ │ Max level of division to split input data into minibatches. 
+│ │ ├── Flag('tf_xla_sparse_core_stacking_mem_limit_bytes'): +│ │ │ If non-zero, limits the size of the activations for a given +│ │ │ table to be below these many bytes. +│ │ ├── Flag('tf_xla_sparse_core_stacking_table_shard_limit_bytes'): +│ │ │ If non-zero, limits the size of any table shard to be below +│ │ │ these many bytes. +│ │ ├── Flag('always_specialize') +│ │ ├── Flag('cost_driven_async_parallel_for') +│ │ ├── Flag('enable_crash_reproducer') +│ │ ├── Flag('log_query_of_death') +│ │ ├── Flag('vectorize') +│ │ ├── Flag('tf_xla_enable_lazy_compilation') +│ │ ├── Flag('tf_xla_print_cluster_outputs'): +│ │ │ If true then insert Print nodes to print out values produced +│ │ │ by XLA clusters. +│ │ ├── Flag('tf_xla_check_cluster_input_numerics'): +│ │ │ If true then insert CheckNumerics nodes to check all cluster +│ │ │ inputs. +│ │ ├── Flag('tf_xla_check_cluster_output_numerics'): +│ │ │ If true then insert CheckNumerics nodes to check all cluster +│ │ │ outputs. +│ │ ├── Flag('tf_xla_disable_constant_folding'): +│ │ │ If true then disables constant folding on TF graph before +│ │ │ XLA compilation. +│ │ ├── Flag('tf_xla_disable_full_embedding_pipelining'): +│ │ │ If true then disables full embedding pipelining and instead +│ │ │ use strict SparseCore / TensorCore sequencing. +│ │ ├── Flag('tf_xla_embedding_parallel_iterations'): +│ │ │ If >0 then use this many parallel iterations in +│ │ │ embedding_pipelining and embedding_sequency. By default, use the +│ │ │ parallel_iterations on the original model WhileOp. +│ │ ├── Flag('tf_xla_compile_on_demand'): +│ │ │ Switch a device into 'on-demand' mode, where instead of +│ │ │ autoclustering ops are compiled one by one just-in-time. +│ │ ├── Flag('tf_xla_enable_xla_devices'): +│ │ │ Generate XLA_* devices, where placing a computation on such +│ │ │ a device forces compilation by XLA. Deprecated. 
+│ │ ├── Flag('tf_xla_always_defer_compilation') +│ │ ├── Flag('tf_xla_async_compilation'): +│ │ │ When lazy compilation is enabled, asynchronous compilation +│ │ │ starts the cluster compilation in the background, and the fallback +│ │ │ path is executed until the compilation has finished. +│ │ ├── Flag('tf_xla_use_device_api_for_xla_launch'): +│ │ │ If true, uses Device API (PjRt) for single device +│ │ │ compilation and execution of functions marked for JIT compilation +│ │ │ i.e. jit_compile=True. Defaults to false. +│ │ ├── Flag('tf_xla_use_device_api_for_compile_on_demand'): +│ │ │ If true, uses Device API (PjRt) for compiling and executing +│ │ │ ops one by one in 'on-demand' mode. Defaults to false. +│ │ ├── Flag('tf_xla_use_device_api_for_auto_jit'): +│ │ │ If true, uses Device API (PjRt) for compilation and +│ │ │ execution when auto-clustering is enabled. Defaults to false. +│ │ ├── Flag('tf_xla_use_device_api'): +│ │ │ If true, uses Device API (PjRt) for compilation and +│ │ │ execution of ops one-by-one in 'on-demand' mode, for functions +│ │ │ marked for JIT compilation, or when auto-clustering is enabled. +│ │ │ Defaults to false. +│ │ ├── Flag('tf_xla_enable_device_api_for_gpu'): +│ │ │ If true, uses Device API (PjRt) for TF GPU device. This is a +│ │ │ helper flag so that individual tests can turn on PjRt for GPU +│ │ │ specifically. +│ │ ├── Flag('tf_xla_call_module_disabled_checks'): +│ │ │ A comma-sepated list of directives specifying the safety +│ │ │ checks to be skipped when compiling XlaCallModuleOp. See the op +│ │ │ documentation for the recognized values. +│ │ ├── Flag('tf_mlir_enable_mlir_bridge'): +│ │ │ Enables experimental MLIR-Based TensorFlow Compiler Bridge. +│ │ ├── Flag('tf_mlir_enable_merge_control_flow_pass'): +│ │ │ Enables MergeControlFlow pass for MLIR-Based TensorFlow +│ │ │ Compiler Bridge. +│ │ ├── Flag('tf_mlir_enable_convert_control_to_data_outputs_pass'): +│ │ │ Enables MLIR-Based TensorFlow Compiler Bridge. 
+│ │ ├── Flag('tf_mlir_enable_strict_clusters'): +│ │ │ Do not allow clusters that have cyclic control dependencies. +│ │ ├── Flag('tf_mlir_enable_multiple_local_cpu_devices'): +│ │ │ Enable multiple local CPU devices. CPU ops which are outside +│ │ │ compiled inside the tpu cluster will also be replicated across +│ │ │ multiple cpu devices. +│ │ ├── Flag('tf_dump_graphs_in_tfg'): +│ │ │ When tf_dump_graphs_in_tfg is true, graphs after +│ │ │ transformations are dumped in MLIR TFG dialect and not in GraphDef. +│ │ ├── Flag('tf_mlir_enable_generic_outside_compilation'): +│ │ │ Enables OutsideCompilation passes for MLIR-Based TensorFlow +│ │ │ Generic Compiler Bridge. +│ │ ├── Flag('tf_mlir_enable_tpu_variable_runtime_reformatting_pass'): +│ │ │ Enables TPUVariableRuntimeReformatting pass for MLIR-Based +│ │ │ TensorFlow Compiler Bridge. This enables weight update sharding and +│ │ │ creates TPUReshardVariables ops. +│ │ ├── TF_PY_DECLARE_FLAG('test_only_experiment_1') +│ │ ├── TF_PY_DECLARE_FLAG('test_only_experiment_2') +│ │ ├── TF_PY_DECLARE_FLAG('enable_nested_function_shape_inference') +│ │ ├── TF_PY_DECLARE_FLAG('enable_quantized_dtypes_training') +│ │ ├── TF_PY_DECLARE_FLAG('graph_building_optimization') +│ │ ├── TF_PY_DECLARE_FLAG('op_building_optimization') +│ │ ├── TF_PY_DECLARE_FLAG('saved_model_fingerprinting') +│ │ ├── TF_PY_DECLARE_FLAG('tf_shape_default_int64') +│ │ ├── TF_PY_DECLARE_FLAG('more_stack_traces') +│ │ ├── TF_PY_DECLARE_FLAG('publish_function_graphs') +│ │ ├── TF_PY_DECLARE_FLAG('enable_aggressive_constant_replication') +│ │ ├── TF_PY_DECLARE_FLAG('enable_colocation_key_propagation_in_while_op_lo +│ │ │ wering') +│ │ ├── #define TENSORFLOW_CORE_CONFIG_FLAG_DEFS_H_ +│ │ ├── class Flags +│ │ ├── public: +│ │ ├── bool SetterForXlaAutoJitFlag(const string& value) +│ │ ├── bool SetterForXlaCallModuleDisabledChecks(const string& value) +│ │ ├── void AppendMarkForCompilationPassFlagsInternal(std::vector* +│ │ │ flag_list) +│ │ ├── void 
AllocateAndParseJitRtFlags() +│ │ ├── void AllocateAndParseFlags() +│ │ ├── void ResetFlags() +│ │ ├── bool SetXlaAutoJitFlagFromFlagString(const string& value) +│ │ ├── BuildXlaOpsPassFlags* GetBuildXlaOpsPassFlags() +│ │ ├── MarkForCompilationPassFlags* GetMarkForCompilationPassFlags() +│ │ ├── XlaSparseCoreFlags* GetXlaSparseCoreFlags() +│ │ ├── XlaDeviceFlags* GetXlaDeviceFlags() +│ │ ├── XlaOpsCommonFlags* GetXlaOpsCommonFlags() +│ │ ├── XlaCallModuleFlags* GetXlaCallModuleFlags() +│ │ ├── MlirCommonFlags* GetMlirCommonFlags() +│ │ ├── void ResetJitCompilerFlags() +│ │ ├── const JitRtFlags& GetJitRtFlags() +│ │ ├── ConfigProto::Experimental::MlirBridgeRollout +│ │ │ GetMlirBridgeRolloutState( +│ │ │ std::optional config_proto) +│ │ ├── void AppendMarkForCompilationPassFlags(std::vector* flag_list) +│ │ ├── void DisableXlaCompilation() +│ │ ├── void EnableXlaCompilation() +│ │ ├── bool FailOnXlaCompilation() +│ │ ├── #define TF_PY_DECLARE_FLAG(flag_name) +│ │ └── PYBIND11_MODULE(flags_pybind, m) +│ ├── 📄 test.f (181 tokens, 30 lines) +│ │ ├── MODULE basic_mod +│ │ ├── TYPE :: person +│ │ │ CHARACTER(LEN=50) :: name +│ │ │ INTEGER :: age +│ │ │ END TYPE person +│ │ ├── SUBROUTINE short_hello(happy, path) +│ │ │ END SUBROUTINE short_hello +│ │ ├── SUBROUTINE long_hello( +│ │ │ p, +│ │ │ message +│ │ │ ) +│ │ │ END SUBROUTINE long_hello +│ │ ├── END MODULE basic_mod +│ │ └── PROGRAM HelloFortran +│ │ END PROGRAM HelloFortran +│ ├── 📄 torch.rst (60 tokens, 8 lines) +│ │ ├── # libtorch (C++-only) +│ │ └── - Building libtorch using Python +│ └── 📄 yc.html (9,063 tokens, 169 lines) +├── 📁 group7 (1 folder, 5 files) +│ ├── 📄 absurdly_huge.jsonl (8,347 tokens, 126 lines) +│ │ ├── SMILES: str +│ │ ├── Yield: float +│ │ ├── Temperature: int +│ │ ├── Pressure: float +│ │ ├── Solvent: str +│ │ ├── Success: bool +│ │ ├── Reaction_Conditions: dict +│ │ ├── Products: list +│ │ └── EdgeCasesMissed: None +│ ├── 📄 angular_crud.ts (1,192 tokens, 148 lines) +│ │ ├── interface 
DBCommand +│ │ ├── export class IndexedDbService +│ │ ├── constructor() +│ │ ├── async create_connection({ db_name = 'client_db', table_name }: +│ │ │ DBCommand) +│ │ ├── upgrade(db) +│ │ ├── async create_model({ db_name, table_name, model }: DBCommand) +│ │ ├── verify_matching({ table_name, model }) +│ │ ├── async read_key({ db_name, table_name, key }: DBCommand) +│ │ ├── async update_model({ db_name, table_name, model }: DBCommand) +│ │ ├── verify_matching({ table_name, model }) +│ │ ├── async delete_key({ db_name, table_name, key }: DBCommand) +│ │ ├── async list_table({ +│ │ │ db_name, +│ │ │ table_name, +│ │ │ where, +│ │ │ }: DBCommand & { where?: { [key: string]: string | number } }) +│ │ └── async search_table(criteria: SearchCriteria) +│ ├── 📄 structure.py (400 tokens, 92 lines) +│ │ ├── @runtime_checkable +│ │ │ class DataClass(Protocol) +│ │ ├── __dataclass_fields__: dict +│ │ ├── class MyInteger(Enum) +│ │ ├── ONE = 1 +│ │ ├── TWO = 2 +│ │ ├── THREE = 42 +│ │ ├── class MyString(Enum) +│ │ ├── AAA1 = "aaa" +│ │ ├── BB_B = """edge +│ │ │ case""" +│ │ ├── @dataclass(frozen=True, slots=True, kw_only=True) +│ │ │ class Tool +│ │ ├── name: str +│ │ ├── description: str +│ │ ├── input_model: DataClass +│ │ ├── output_model: DataClass +│ │ ├── def execute(self, *args, **kwargs) +│ │ ├── @property +│ │ │ def edge_case(self) -> str +│ │ ├── def should_still_see_me(self, x: bool = True) -> "Tool" +│ │ ├── @dataclass +│ │ │ class MyInput[T] +│ │ ├── name: str +│ │ ├── rank: MyInteger +│ │ ├── serial_n: int +│ │ ├── @dataclass +│ │ │ class Thingy +│ │ ├── is_edge_case: bool +│ │ ├── @dataclass +│ │ │ class MyOutput +│ │ ├── orders: str +│ │ ├── class MyTools(Enum) +│ │ ├── TOOL_A = Tool( +│ │ │ name="complicated", +│ │ │ description="edge case!", +│ │ │ input_model=MyInput[Thingy], +│ │ │ output_model=MyOutput, +│ │ │ ) +│ │ ├── TOOL_B = Tool( +│ │ │ name="""super +│ │ │ complicated +│ │ │ """, +│ │ │ description="edge case!", +│ │ │ input_model=MyInput, +│ │ │ 
output_model=MyOutput, +│ │ │ ) +│ │ ├── @final +│ │ │ class dtype(Generic[_DTypeScalar_co]) +│ │ └── names: None | tuple[builtins.str, ...] +│ ├── 📄 test.wgsl (528 tokens, 87 lines) +│ │ ├── alias MyVec = vec4 +│ │ ├── alias AnotherVec = vec2 +│ │ ├── struct VertexInput +│ │ ├── struct VertexOutput +│ │ ├── struct MyUniforms +│ │ ├── @group(0) @binding(0) var u_mvp: mat4x4 +│ │ ├── @group(0) @binding(1) var u_color: MyVec +│ │ ├── @group(1) @binding(0) var my_texture: texture_2d +│ │ ├── @group(1) @binding(1) var my_sampler: sampler +│ │ ├── @vertex +│ │ │ fn vs_main(in: VertexInput) -> VertexOutput +│ │ ├── @fragment +│ │ │ fn fs_main(in: VertexOutput) -> @location(0) vec4 +│ │ ├── @compute @workgroup_size(8, 8, 1) +│ │ │ fn cs_main(@builtin(global_invocation_id) global_id: vec3) +│ │ ├── fn helper_function(val: f32) -> f32 +│ │ ├── struct AnotherStruct +│ │ └── @compute +│ │ @workgroup_size(8, 8, 1) +│ │ fn multi_line_edge_case( +│ │ @builtin(global_invocation_id) +│ │ globalId : vec3, +│ │ @group(1) +│ │ @binding(0) +│ │ srcTexture : texture_2d, +│ │ @group(1) +│ │ @binding(1) +│ │ srcSampler : sampler, +│ │ @group(0) +│ │ @binding(0) +│ │ uniformsPtr : ptr, +│ │ storageBuffer : ptr, 64>, read_write>, +│ │ ) +│ └── 📄 test.metal (272 tokens, 34 lines) +│ ├── struct MyData +│ ├── kernel void myKernel(device MyData* data [[buffer(0)]], +│ │ uint id [[thread_position_in_grid]]) +│ ├── float myHelperFunction(float x, float y) +│ ├── vertex float4 vertexShader(const device packed_float3* vertex_array +│ │ [[buffer(0)]], +│ │ unsigned int vid [[vertex_id]]) +│ ├── fragment half4 fragmentShader(float4 P [[position]]) +│ └── float3 computeNormalMap(ColorInOut in, texture2d +│ normalMapTexture) +├── 📁 group_lisp (1 folder, 4 files) +│ ├── 📄 clojure_test.clj (682 tokens, 85 lines) +│ │ ├── defprotocol P +│ │ ├── defrecord Person +│ │ ├── defn -main +│ │ ├── ns bion.likes_trees +│ │ ├── def repo-url +│ │ ├── defn config +│ │ ├── defmacro with-os +│ │ └── defrecord 
SetFullElement +│ ├── 📄 LispTest.lisp (25 tokens, 6 lines) +│ │ ├── defstruct person +│ │ └── defun greet +│ ├── 📄 racket_struct.rkt (14 tokens, 1 line) +│ │ └── struct point +│ └── 📄 test_scheme.scm (360 tokens, 44 lines) +│ ├── define topological-sort +│ ├── define table +│ ├── define queue +│ ├── define result +│ ├── define set-up +│ └── define traverse +└── 📁 group_todo (1 folder, 12 files) + ├── 📄 AAPLShaders.metal (5,780 tokens, 566 lines) + │ ├── struct LightingParameters + │ ├── float Geometry(float Ndotv, float alphaG) + │ ├── float3 computeNormalMap(ColorInOut in, texture2d + │ │ normalMapTexture) + │ ├── float3 computeDiffuse(LightingParameters parameters) + │ ├── float Distribution(float NdotH, float roughness) + │ ├── float3 computeSpecular(LightingParameters parameters) + │ ├── float4 equirectangularSample(float3 direction, sampler s, + │ │ texture2d image) + │ ├── LightingParameters calculateParameters(ColorInOut in, + │ │ AAPLCameraData cameraData, + │ │ constant AAPLLightData& + │ │ lightData, + │ │ texture2d + │ │ baseColorMap, + │ │ texture2d normalMap, + │ │ texture2d + │ │ metallicMap, + │ │ texture2d + │ │ roughnessMap, + │ │ texture2d + │ │ ambientOcclusionMap, + │ │ texture2d + │ │ skydomeMap) + │ ├── struct SkyboxVertex + │ ├── struct SkyboxV2F + │ ├── vertex SkyboxV2F skyboxVertex(SkyboxVertex in [[stage_in]], + │ │ constant AAPLCameraData& cameraData + │ │ [[buffer(BufferIndexCameraData)]]) + │ ├── fragment float4 skyboxFragment(SkyboxV2F v [[stage_in]], + │ │ texture2d skytexture [[texture(0)]]) + │ ├── vertex ColorInOut vertexShader(Vertex in [[stage_in]], + │ │ constant AAPLInstanceTransform& + │ │ instanceTransform [[ buffer(BufferIndexInstanceTransforms) ]], + │ │ constant AAPLCameraData& cameraData + │ │ [[ buffer(BufferIndexCameraData) ]]) + │ ├── float2 calculateScreenCoord( float3 ndcpos ) + │ ├── fragment float4 fragmentShader( + │ │ ColorInOut in + │ │ [[stage_in]], + │ │ constant AAPLCameraData& cameraData + │ │ [[ 
buffer(BufferIndexCameraData) ]], + │ │ constant AAPLLightData& lightData + │ │ [[ buffer(BufferIndexLightData) ]], + │ │ constant AAPLSubmeshKeypath&submeshKeypath + │ │ [[ buffer(BufferIndexSubmeshKeypath)]], + │ │ constant Scene* pScene + │ │ [[ buffer(SceneIndex)]], + │ │ texture2d skydomeMap + │ │ [[ texture(AAPLSkyDomeTexture) ]], + │ │ texture2d rtReflections + │ │ [[ texture(AAPLTextureIndexReflections), + │ │ function_constant(is_raytracing_enabled)]]) + │ ├── fragment float4 reflectionShader(ColorInOut in [[stage_in]], + │ │ texture2d rtReflections + │ │ [[texture(AAPLTextureIndexReflections)]]) + │ ├── struct ThinGBufferOut + │ ├── fragment ThinGBufferOut gBufferFragmentShader(ColorInOut in + │ │ [[stage_in]]) + │ ├── kernel void rtReflection( + │ │ texture2d< float, access::write > outImage + │ │ [[texture(OutImageIndex)]], + │ │ texture2d< float > positions + │ │ [[texture(ThinGBufferPositionIndex)]], + │ │ texture2d< float > directions + │ │ [[texture(ThinGBufferDirectionIndex)]], + │ │ texture2d< float > skydomeMap + │ │ [[texture(AAPLSkyDomeTexture)]], + │ │ constant AAPLInstanceTransform* + │ │ instanceTransforms [[buffer(BufferIndexInstanceTransforms)]], + │ │ constant AAPLCameraData& cameraData + │ │ [[buffer(BufferIndexCameraData)]], + │ │ constant AAPLLightData& lightData + │ │ [[buffer(BufferIndexLightData)]], + │ │ constant Scene* pScene + │ │ [[buffer(SceneIndex)]], + │ │ instance_acceleration_structure + │ │ accelerationStructure [[buffer(AccelerationStructureIndex)]], + │ │ uint2 tid [[thread_position_in_grid]]) + │ ├── else if ( intersection.type == raytracing::intersection_type::none ) + │ ├── struct VertexInOut + │ ├── vertex VertexInOut vertexPassthrough( uint vid [[vertex_id]] ) + │ ├── fragment float4 fragmentPassthrough( VertexInOut in [[stage_in]], + │ │ texture2d< float > tin ) + │ ├── fragment float4 fragmentBloomThreshold( VertexInOut in [[stage_in]], + │ │ texture2d< float > tin + │ │ [[texture(0)]], + │ │ constant float* 
threshold + │ │ [[buffer(0)]] ) + │ └── fragment float4 fragmentPostprocessMerge( VertexInOut in + │ [[stage_in]], + │ constant float& exposure + │ [[buffer(0)]], + │ texture2d< float > texture0 + │ [[texture(0)]], + │ texture2d< float > texture1 + │ [[texture(1)]]) + ├── 📄 crystal_test.cr (48 tokens, 15 lines) + ├── 📄 dart_test.dart (108 tokens, 24 lines) + ├── 📄 elixir_test.exs (39 tokens, 10 lines) + ├── 📄 forward.frag (739 tokens, 87 lines) + ├── 📄 forward.vert (359 tokens, 48 lines) + ├── 📄 nodemon.json (118 tokens, 20 lines) + ├── 📄 sas_test.sas (97 tokens, 22 lines) + ├── 📄 test_setup_py.test (133 tokens, 24 lines) + ├── 📄 testTypings.d.ts (158 tokens, 23 lines) + ├── 📄 vba_test.bas (67 tokens, 16 lines) + └── 📄 wgsl_test.wgsl (94 tokens, 17 lines) + ├── @binding(0) @group(0) var frame : u32 + ├── @vertex + │ fn vtx_main(@builtin(vertex_index) vertex_index : u32) -> + │ @builtin(position) vec4f + └── @fragment + fn frag_main() -> @location(0) vec4f diff --git a/tests/golden/legacy/trees/more_languages_group1.txt b/tests/golden/legacy/trees/more_languages_group1.txt new file mode 100644 index 0000000..b57238e --- /dev/null +++ b/tests/golden/legacy/trees/more_languages_group1.txt @@ -0,0 +1,374 @@ +📁 group1 (1 folder, 11 files) +├── 📄 addamt.cobol (441 tokens, 40 lines) +│ ├── IDENTIFICATION DIVISION. +│ ├── PROGRAM-ID. +│ │ ADDAMT. +│ ├── DATA DIVISION. +│ ├── WORKING-STORAGE SECTION. +│ ├── 01 KEYED-INPUT. +│ ├── 05 CUST-NO-IN. +│ ├── 05 AMT1-IN. +│ ├── 05 AMT2-IN. +│ ├── 05 AMT3-IN. +│ ├── 01 DISPLAYED-OUTPUT. +│ ├── 05 CUST-NO-OUT. +│ ├── 05 TOTAL-OUT. +│ ├── 01 MORE-DATA. +│ ├── PROCEDURE DIVISION. +│ └── 100-MAIN. +├── 📄 CUSTOMER-INVOICE.CBL (412 tokens, 60 lines) +│ ├── IDENTIFICATION DIVISION. +│ ├── PROGRAM-ID. CUSTOMER-INVOICE. +│ ├── AUTHOR. JANE DOE. +│ ├── DATE. 2023-12-30. +│ ├── DATE-COMPILED. 06/30/10. +│ ├── DATE-WRITTEN. 12/34/56. +│ ├── ENVIRONMENT DIVISION. +│ ├── INPUT-OUTPUT SECTION. +│ ├── FILE-CONTROL. +│ ├── SELECT CUSTOMER-FILE. 
+│ ├── SELECT INVOICE-FILE. +│ ├── SELECT REPORT-FILE. +│ ├── DATA DIVISION. +│ ├── FILE SECTION. +│ ├── FD CUSTOMER-FILE. +│ ├── 01 CUSTOMER-RECORD. +│ ├── 05 CUSTOMER-ID. +│ ├── 05 CUSTOMER-NAME. +│ ├── 05 CUSTOMER-BALANCE. +│ ├── FD INVOICE-FILE. +│ ├── 01 INVOICE-RECORD. +│ ├── 05 INVOICE-ID. +│ ├── 05 CUSTOMER-ID. +│ ├── 05 INVOICE-AMOUNT. +│ ├── FD REPORT-FILE. +│ ├── 01 REPORT-RECORD. +│ ├── WORKING-STORAGE SECTION. +│ ├── 01 WS-CUSTOMER-FOUND. +│ ├── 01 WS-END-OF-FILE. +│ ├── 01 WS-TOTAL-BALANCE. +│ ├── PROCEDURE DIVISION. +│ ├── 0000-MAIN-ROUTINE. +│ ├── 1000-PROCESS-RECORDS. +│ ├── 1100-UPDATE-CUSTOMER-BALANCE. +│ └── END PROGRAM CUSTOMER-INVOICE. +├── 📄 JavaTest.java (578 tokens, 86 lines) +│ ├── abstract class LivingBeing +│ ├── abstract void breathe() +│ ├── interface Communicator +│ ├── String communicate() +│ ├── @Log +│ ├── @Getter +│ ├── @Setter +│ ├── class Person extends LivingBeing implements Communicator +│ ├── Person(String name, int age) +│ ├── @Override +│ ├── void breathe() +│ ├── @Override +│ ├── public String communicate() +│ ├── void greet() +│ ├── String personalizedGreeting(String greeting, Optional +│ │ includeAge) +│ ├── @Singleton +│ ├── @RestController +│ ├── @SpringBootApplication +│ ├── public class Example +│ ├── @Inject +│ ├── public Example(Person person) +│ ├── @RequestMapping("/greet") +│ ├── String home(@RequestParam(value = "name", defaultValue = "World") +│ │ String name, +│ │ @RequestParam(value = "age", defaultValue = "30") int +│ │ age) +│ └── public static void main(String[] args) +├── 📄 JuliaTest.jl (381 tokens, 63 lines) +│ ├── module JuliaTest_EdgeCase +│ ├── struct Location +│ │ name::String +│ │ lat::Float32 +│ │ lon::Float32 +│ │ end +│ ├── mutable struct mPerson +│ │ name::String +│ │ age::Int +│ │ end +│ ├── Base.@kwdef mutable struct Param +│ │ Δt::Float64 = 0.1 +│ │ n::Int64 +│ │ m::Int64 +│ │ end +│ ├── sic(x,y) +│ ├── welcome(l::Location) +│ ├── ∑(α, Ω) +│ ├── function noob() +│ │ end +│ ├── function 
ye_olde(hello::String, world::Location) +│ │ end +│ ├── function multiline_greet( +│ │ p::mPerson, +│ │ greeting::String +│ │ ) +│ │ end +│ ├── function julia_is_awesome(prob::DiffEqBase.AbstractDAEProblem{uType, +│ │ duType, tType, +│ │ isinplace}; +│ │ kwargs...) where {uType, duType, tType, isinplace} +│ │ end +│ └── end +├── 📄 KotlinTest.kt (974 tokens, 171 lines) +│ ├── data class Person(val name: String) +│ ├── fun greet(person: Person) +│ ├── fun processItems(items: List, processor: (T) -> Unit) +│ ├── interface Source +│ ├── fun nextT(): T +│ ├── fun MutableList.swap(index1: Int, index2: Int) +│ ├── fun Any?.toString(): String +│ ├── tailrec fun findFixPoint(x: Double = 1.0): Double +│ ├── class GenericRepository +│ ├── fun getItem(id: Int): T? +│ ├── sealed interface Error +│ ├── sealed class IOError(): Error +│ ├── object Runner +│ ├── inline fun , T> run() : T +│ ├── infix fun Int.shl(x: Int): Int +│ ├── class MyStringCollection +│ ├── infix fun add(s: String) +│ ├── fun build() +│ ├── open class Base(p: Int) +│ ├── class Derived(p: Int) : Base(p) +│ ├── open class Shape +│ ├── open fun draw() +│ ├── fun fill() +│ ├── open fun edge(case: Int) +│ ├── interface Thingy +│ ├── fun edge() +│ ├── class Circle() : Shape(), Thingy +│ ├── override fun draw() +│ ├── final override fun edge(case: Int) +│ ├── interface Base +│ ├── fun print() +│ ├── class BaseImpl(val x: Int) : Base +│ ├── override fun print() +│ ├── internal class Derived(b: Base) : Base by b +│ ├── class Person constructor(firstName: String) +│ ├── class People( +│ │ firstNames: Array, +│ │ ages: Array(42), +│ │ ) +│ ├── fun edgeCases(): Boolean +│ ├── class Alien public @Inject constructor( +│ │ val firstName: String, +│ │ val lastName: String, +│ │ var age: Int, +│ │ val pets: MutableList = mutableListOf(), +│ │ ) +│ ├── fun objectOriented(): String +│ ├── enum class IntArithmetics : BinaryOperator, IntBinaryOperator +│ ├── PLUS { +│ │ override fun apply(t: Int, u: Int): Int +│ ├── TIMES { +│ │ 
override fun apply(t: Int, u: Int): Int +│ ├── override fun applyAsInt(t: Int, u: Int) +│ ├── fun reformat( +│ │ str: String, +│ │ normalizeCase: Boolean = true, +│ │ upperCaseFirstLetter: Boolean = true, +│ │ divideByCamelHumps: Boolean = false, +│ │ wordSeparator: Char = ' ', +│ │ ) +│ ├── operator fun Point.unaryMinus() +│ ├── abstract class Polygon +│ └── abstract fun draw() +├── 📄 lesson.cbl (635 tokens, 78 lines) +│ ├── IDENTIFICATION DIVISION. +│ ├── PROGRAM-ID. CBL0002. +│ ├── AUTHOR. Otto B. Fun. +│ ├── ENVIRONMENT DIVISION. +│ ├── INPUT-OUTPUT SECTION. +│ ├── FILE-CONTROL. +│ ├── SELECT PRINT-LINE. +│ ├── SELECT ACCT-REC. +│ ├── DATA DIVISION. +│ ├── FILE SECTION. +│ ├── FD PRINT-LINE. +│ ├── 01 PRINT-REC. +│ ├── 05 ACCT-NO-O. +│ ├── 05 ACCT-LIMIT-O. +│ ├── 05 ACCT-BALANCE-O. +│ ├── 05 LAST-NAME-O. +│ ├── 05 FIRST-NAME-O. +│ ├── 05 COMMENTS-O. +│ ├── FD ACCT-REC. +│ ├── 01 ACCT-FIELDS. +│ ├── 05 ACCT-NO. +│ ├── 05 ACCT-LIMIT. +│ ├── 05 ACCT-BALANCE. +│ ├── 05 LAST-NAME. +│ ├── 05 FIRST-NAME. +│ ├── 05 CLIENT-ADDR. +│ ├── 10 STREET-ADDR. +│ ├── 10 CITY-COUNTY. +│ ├── 10 USA-STATE. +│ ├── 05 RESERVED. +│ ├── 05 COMMENTS. +│ ├── WORKING-STORAGE SECTION. +│ ├── 01 FLAGS. +│ ├── 05 LASTREC. +│ ├── PROCEDURE DIVISION. +│ ├── OPEN-FILES. +│ ├── READ-NEXT-RECORD. +│ ├── CLOSE-STOP. +│ ├── READ-RECORD. +│ └── WRITE-RECORD. 
+├── 📄 LuaTest.lua (83 tokens, 16 lines) +│ ├── function HelloWorld.new +│ ├── function HelloWorld.greet +│ └── function say_hello +├── 📄 ObjectiveCTest.m (62 tokens, 16 lines) +│ ├── @interface HelloWorld +│ ├── @interface HelloWorld -> (void) sayHello +│ ├── @implementation HelloWorld +│ ├── @implementation HelloWorld -> (void) sayHello +│ └── void sayHelloWorld() +├── 📄 OcamlTest.ml (49 tokens, 12 lines) +│ ├── type color +│ ├── class hello +│ ├── class hello -> method say_hello +│ └── let main () +├── 📄 test.js (757 tokens, 154 lines) +│ ├── class MyClass +│ ├── myMethod() +│ ├── async asyncMethod(a, b) +│ ├── methodWithDefaultParameters(a = 5, b = 10) +│ ├── multilineMethod( +│ │ c, +│ │ d +│ │ ) +│ ├── multilineMethodWithDefaults( +│ │ t = "tree", +│ │ p = "plus" +│ │ ) +│ ├── function myFunction(param1, param2) +│ ├── function multilineFunction( +│ │ param1, +│ │ param2 +│ │ ) +│ ├── const arrowFunction = () => +│ ├── const parametricArrow = (a, b) => +│ ├── function () +│ ├── function outerFunction(outerParam) +│ ├── function innerFunction(innerParam) +│ ├── innerFunction("inner") +│ ├── const myObject = { +│ ├── myMethod: function (stuff) +│ ├── let myArrowObject = { +│ ├── myArrow: ({ +│ │ a, +│ │ b, +│ │ c, +│ │ }) => +│ ├── const myAsyncArrowFunction = async () => +│ ├── function functionWithRestParameters(...args) +│ ├── const namedFunctionExpression = function myNamedFunction() +│ ├── const multilineArrowFunction = ( +│ │ a, +│ │ b +│ │ ) => +│ ├── function functionReturningFunction() +│ ├── return function () +│ ├── function destructuringOnMultipleLines({ +│ │ a, +│ │ b, +│ │ }) +│ ├── const arrowFunctionWithDestructuring = ({ a, b }) => +│ ├── const multilineDestructuringArrow = ({ +│ │ a, +│ │ b, +│ │ }) => +│ ├── async function asyncFunctionWithErrorHandling() +│ ├── class Car +│ ├── constructor(brand) +│ ├── present() +│ ├── class Model extends Car +│ ├── constructor(brand, mod) +│ ├── super(brand) +│ └── show() +└── 📄 test.ts (832 tokens, 165 
lines) + ├── type MyType + ├── interface MyInterface + ├── class TsClass + ├── myMethod() + ├── myMethodWithArgs(param1: string, param2: number): void + ├── static myStaticMethod(param: T): T + ├── multilineMethod( + │ c: number, + │ d: number + │ ): number + ├── multilineMethodWithDefaults( + │ t: string = "tree", + │ p: string = "plus" + │ ): string + ├── export class AdvancedComponent implements MyInterface + ├── async myAsyncMethod( + │ a: string, + │ b: number, + │ c: string + │ ): Promise + ├── genericMethod( + │ arg1: T, + │ arg2: U + │ ): [T, U] + ├── export class TicketsComponent implements MyInterface + ├── async myAsyncMethod({ a, b, c }: { a: String; b: Number; c: String }) + ├── function tsFunction() + ├── function tsFunctionSigned( + │ param1: number, + │ param2: number + │ ): void + ├── export default async function tsFunctionComplicated({ + │ a = 1 | 2, + │ b = "bob", + │ c = async () => "charlie", + │ }: { + │ a: number; + │ b: string; + │ c: () => Promise; + │ }): Promise + ├── return("Standalone function with parameters") + ├── const tsArrowFunctionSigned = ({ + │ a, + │ b, + │ }: { + │ a: number; + │ b: string; + │ }) => + ├── export const tsComplicatedArrow = async ({ + │ a = 1 | 2, + │ b = "bob", + │ c = async () => "charlie", + │ }: { + │ a: number; + │ b: string; + │ c: () => Promise; + │ }): Promise => + ├── const arrowFunction = () => + ├── const arrow = (a: String, b: Number) => + ├── const asyncArrowFunction = async () => + ├── const asyncArrow = async (a: String, b: Number) => + ├── let weirdArrow = () => + ├── const asyncPromiseArrow = async (): Promise => + ├── let myWeirdArrowSigned = (x: number): number => + ├── class Person + ├── constructor(private firstName: string, private lastName: string) + ├── getFullName(): string + ├── describe(): string + ├── class Employee extends Person + ├── constructor( + │ firstName: string, + │ lastName: string, + │ private jobTitle: string + │ ) + ├── super(firstName, lastName) + ├── describe(): 
string + ├── interface Shape + └── interface Square extends Shape diff --git a/tests/golden/legacy/trees/more_languages_group2.txt b/tests/golden/legacy/trees/more_languages_group2.txt new file mode 100644 index 0000000..0eba148 --- /dev/null +++ b/tests/golden/legacy/trees/more_languages_group2.txt @@ -0,0 +1,135 @@ +📁 group2 (1 folder, 8 files) +├── 📄 apl_test.apl (28 tokens, 5 lines) +│ ├── :Namespace HelloWorld +│ ├── :Namespace HelloWorld -> hello ← 'Hello, World!' +│ └── :Namespace HelloWorld -> plus ← {⍺+⍵} +├── 📄 c_test.c (837 tokens, 142 lines) +│ ├── struct Point +│ ├── int x; +│ ├── int y; +│ ├── struct Point getOrigin() +│ ├── float mul_two_floats(float x1, float x2) +│ ├── enum days +│ ├── SUN, +│ ├── MON, +│ ├── TUE, +│ ├── WED, +│ ├── THU, +│ ├── FRI, +│ ├── SAT +│ ├── long add_two_longs(long x1, long x2) +│ ├── double multiplyByTwo(double num) +│ ├── char getFirstCharacter(char *str) +│ ├── void greet(Person p) +│ ├── typedef struct +│ ├── char name[50]; +│ ├── } Person; +│ ├── int main() +│ ├── int* getArrayStart(int arr[], int size) +│ ├── long complexFunctionWithMultipleArguments( +│ │ int param1, +│ │ double param2, +│ │ char *param3, +│ │ struct Point point +│ │ ) +│ ├── keyPattern *ACLKeyPatternCreate(sds pattern, int flags) +│ ├── sds sdsCatPatternString(sds base, keyPattern *pat) +│ ├── static int ACLCheckChannelAgainstList(list *reference, const char +│ │ *channel, int channellen, int is_pattern) +│ ├── while((ln = listNext(&li))) +│ ├── static struct config +│ ├── aeEventLoop *el; +│ ├── cliConnInfo conn_info; +│ ├── const char *hostsocket; +│ ├── int tls; +│ ├── struct cliSSLconfig sslconfig; +│ └── } config; +├── 📄 go_test.go (179 tokens, 46 lines) +│ ├── type Greeting struct +│ ├── func (g Greeting) sayHello() +│ ├── func createGreeting(m string) Greeting +│ ├── type SomethingLong struct +│ ├── func (s *SomethingLong) WithAReasonableName( +│ │ ctx context.Context, +│ │ param1 string, +│ │ param2 int, +│ │ param3 map[string]interface{}, 
+│ │ callback func(int) error, +│ │ ) (resultType, error) +│ ├── type resultType struct +│ └── func main() +├── 📄 PerlTest.pl (63 tokens, 20 lines) +│ ├── package PerlTest +│ ├── package PerlTest -> sub new +│ ├── package PerlTest -> sub hello +│ └── package PerlTest -> sub say_hello +├── 📄 PhpTest.php (70 tokens, 19 lines) +│ ├── class HelloWorld +│ ├── class HelloWorld -> function sayHello +│ ├── function greet +│ ├── class Person +│ └── class Person -> function __construct +├── 📄 PowershellTest.ps1 (459 tokens, 89 lines) +│ ├── function Say-Nothing() +│ ├── class Person +│ ├── Person([string]$name) +│ ├── [string]Greet() +│ ├── [string]GreetMany([int]$times) +│ ├── [string]GreetWithDetails([string]$greeting, [int]$times) +│ ├── [string]GreetMultiline( +│ │ [string]$greeting, +│ │ [int]$times +│ │ ) +│ ├── NoReturn([int]$times) +│ ├── NoReturnNoArgs() +│ ├── function Say-Hello([Person]$person) +│ ├── function Multi-Hello([Person]$personA, [Person]$personB) +│ ├── function Switch-Item +│ ├── param ([switch]$on) +│ ├── function Get-SmallFiles +│ ├── param ( +│ │ [PSDefaultValue(Help = '100')] +│ │ $Size = 100) +│ ├── function Get-User +│ ├── [CmdletBinding(DefaultParameterSetName="ID")] +│ ├── [OutputType("System.Int32", ParameterSetName="ID")] +│ ├── [OutputType([String], ParameterSetName="Name")] +│ ├── Param ( +│ │ [parameter(Mandatory=$true, ParameterSetName="ID")] +│ │ [Int[]] +│ │ $UserID, +│ │ [parameter(Mandatory=$true, ParameterSetName="Name")] +│ │ [String[]] +│ │ $UserName) +│ ├── filter Get-ErrorLog ([switch]$Message) +│ └── function global:MultilineSignature( +│ [string]$param1, +│ [int]$param2, +│ [Parameter(Mandatory=$true)] +│ [string]$param3 +│ ) +├── 📄 ScalaTest.scala (171 tokens, 40 lines) +│ ├── def sumOfSquares(x: Int, y: Int): Int +│ ├── trait Bark +│ ├── def bark: String +│ ├── case class Person(name: String) +│ ├── class GenericClass[T]( +│ │ val data: T, +│ │ val count: Int +│ │ ) +│ ├── def getData: T +│ ├── object HelloWorld +│ ├── def 
greet(person: Person): Unit +│ ├── def main(args: Array[String]): Unit +│ ├── def complexFunction( +│ │ a: Int, +│ │ b: String, +│ │ c: Float +│ │ ): (Int, String) Option +│ └── def sumOfSquaresShort(x: Int, y: Int): Int +└── 📄 test.csv (0 tokens, 0 lines) + ├── Name + ├── Age + ├── Country + ├── City + └── Email diff --git a/tests/golden/legacy/trees/more_languages_group3.txt b/tests/golden/legacy/trees/more_languages_group3.txt new file mode 100644 index 0000000..efc9ce0 --- /dev/null +++ b/tests/golden/legacy/trees/more_languages_group3.txt @@ -0,0 +1,346 @@ +📁 group3 (1 folder, 16 files) +├── 📄 bash_test.sh (127 tokens, 22 lines) +│ ├── echo_hello_world() +│ ├── function fun_echo_hello_world() +│ ├── export SECRET +│ ├── alias md='make debug' +│ ├── add_alias() +│ └── create_conda_env() +├── 📄 cpp_test.cpp (1,670 tokens, 259 lines) +│ ├── class Person +│ ├── std::string name; +│ ├── public: +│ ├── Person(std::string n) : name(n) +│ ├── void greet() +│ ├── void globalGreet() +│ ├── int main() +│ ├── void printMessage(const std::string &message) +│ ├── template +│ │ void printVector(const std::vector& vec) +│ ├── struct Point +│ ├── int x, y; +│ ├── Point(int x, int y) : x(x), y(y) +│ ├── class Animal +│ ├── public: +│ ├── Animal(const std::string &name) : name(name) +│ ├── virtual void speak() const +│ ├── virtual ~Animal() +│ ├── protected: +│ ├── std::string name; +│ ├── class Dog : public Animal +│ ├── public: +│ ├── Dog(const std::string &name) : Animal(name) +│ ├── void speak() const override +│ ├── class Cat : public Animal +│ ├── public: +│ ├── Cat(const std::string &name) : Animal(name) +│ ├── void speak() const override +│ ├── nb::bytes BuildRnnDescriptor(int input_size, int hidden_size, int +│ │ num_layers, +│ │ int batch_size, int max_seq_length, float +│ │ dropout, +│ │ bool bidirectional, bool cudnn_allow_tf32, +│ │ int workspace_size, int reserve_space_size) +│ ├── int main() +│ ├── enum ECarTypes +│ ├── Sedan, +│ ├── Hatchback, +│ ├── SUV, +│ ├── 
Wagon +│ ├── ECarTypes GetPreferredCarType() +│ ├── enum ECarTypes : uint8_t +│ ├── Sedan, +│ ├── Hatchback, +│ ├── SUV = 254, +│ ├── Hybrid +│ ├── enum class ECarTypes : uint8_t +│ ├── Sedan, +│ ├── Hatchback, +│ ├── SUV = 254, +│ ├── Hybrid +│ ├── void myFunction(string fname, int age) +│ ├── template T cos(T) +│ ├── template T sin(T) +│ ├── template T sqrt(T) +│ ├── template struct VLEN +│ ├── template class arr +│ ├── private: +│ ├── static T *ralloc(size_t num) +│ ├── static void dealloc(T *ptr) +│ ├── static T *ralloc(size_t num) +│ ├── static void dealloc(T *ptr) +│ ├── public: +│ ├── arr() : p(0), sz(0) +│ ├── arr(size_t n) : p(ralloc(n)), sz(n) +│ ├── arr(arr &&other) +│ │ : p(other.p), sz(other.sz) +│ ├── ~arr() +│ ├── void resize(size_t n) +│ ├── T &operator[](size_t idx) +│ ├── T *data() +│ ├── size_t size() const +│ ├── class Buffer +│ ├── private: +│ ├── void* ptr_; +│ └── std::tuple quantize( +│ const array& w, +│ int group_size, +│ int bits, +│ StreamOrDevice s) +├── 📄 csharp_test.cs (957 tokens, 146 lines) +│ ├── public interface IExcelTemplate +│ ├── void LoadTemplate(string templateFilePath) +│ ├── void LoadData(Dictionary data) +│ ├── void ModifyCell(string cellName, string value) +│ ├── void SaveToFile(string filePath) +│ ├── public interface IGreet +│ ├── void Greet() +│ ├── public enum WeekDays +│ ├── public delegate void DisplayMessage(string message) +│ ├── public struct Address +│ ├── public static class HelperFunctions +│ ├── public static void PrintMessage(string message) +│ ├── public static int AddNumbers(int a, int b) +│ ├── namespace HelloWorldApp +│ ├── class Person : IGreet +│ ├── public Person(string name, int age) +│ ├── public void Greet() +│ ├── class HelloWorld +│ ├── static void Main(string[] args) +│ ├── namespace TemplateToExcelServer.Template +│ ├── public interface ITemplateObject +│ ├── string[,] GetContent() +│ ├── string[] GetContentArray() +│ ├── string[] GetFormat() +│ ├── int? 
GetFormatLength() +│ ├── TemplateObject SetContent(string[,] Content) +│ ├── TemplateObject SetContentArray(string[] value) +│ ├── TemplateObject SetFormat(string[] Header) +│ ├── TemplateObject SetNameOfReport( +│ │ ReadOnlyMemory ReportName, +│ │ int[] EdgeCase) +│ ├── TemplateObject SetSheetName(ReadOnlyMemory SheetName) +│ ├── public class BankAccount(string accountID, string owner) +│ ├── public override string ToString() => +│ ├── var IncrementBy = (int source, int increment = 1) => +│ ├── Func add = (x, y) => +│ ├── button.Click += (sender, args) => +│ ├── public Func GetMultiplier(int factor) +│ ├── public void Method( +│ │ int param1, +│ │ int param2, +│ │ int param3, +│ │ int param4, +│ │ int param5, +│ │ int param6, +│ │ ) +│ ├── System.Net.ServicePointManager.ServerCertificateValidationCallback += +│ │ (se, cert, chain, sslerror) => +│ ├── class ServerCertificateValidation +│ ├── public bool OnRemoteCertificateValidation( +│ │ object se, +│ │ X509Certificate cert, +│ │ X509Chain chain, +│ │ SslPolicyErrors sslerror +│ │ ) +│ ├── s_downloadButton.Clicked += async (o, e) => +│ ├── [HttpGet, Route("DotNetCount")] +│ └── static public async Task GetDotNetCount(string URL) +├── 📄 hallucination.tex (1,633 tokens, 126 lines) +│ ├── Harnessing the Master Algorithm: Strategies for AI LLMs to Mitigate +│ │ Hallucinations +│ ├── Hallucinated Pedro Domingos et al. 
+│ ├── Christmas Eve 2023 +│ ├── 1 Introduction +│ ├── 2 Representation in LLMs +│ ├── 2.1 Current Representational Models +│ ├── 2.2 Incorporating Cognitive Structures +│ ├── 2.3 Conceptual Diagrams of Advanced Representational Models +│ ├── 3 Evaluation Strategies +│ ├── 3.1 Existing Evaluation Metrics for LLMs +│ ├── 3.2 Integrating Contextual and Ethical Considerations +│ ├── 3.3 Case Studies: Evaluation in Practice +│ ├── 4 Optimization Techniques +│ ├── 4.1 Continuous Learning Models +│ ├── 4.2 Adaptive Algorithms for Real-time Adjustments +│ ├── 4.3 Performance Metrics Pre- and Post-Optimization +│ ├── 5 Interdisciplinary Insights +│ ├── 5.1 Cognitive Science and AI: A Symbiotic Relationship +│ ├── 5.2 Learning from Human Cognitive Processes +│ ├── 6 Challenges and Future Directions +│ ├── 6.1 Addressing Current Limitations +│ ├── 6.2 The Road Ahead: Ethical and Practical Considerations +│ ├── 7 Conclusion +│ ├── 7.1 Summarizing Key Findings +│ └── 7.2 The Next Steps in AI Development +├── 📄 ruby_test.rb (138 tokens, 37 lines) +│ ├── module Greeter +│ ├── def self.say_hello +│ ├── class HelloWorld +│ ├── def say_hello +│ ├── class Human +│ ├── def self.bar +│ ├── def self.bar=(value) +│ ├── class Doctor < Human +│ └── def brachial_plexus( +│ roots, +│ trunks, +│ divisions: true, +│ cords: [], +│ branches: Time.now +│ ) +├── 📄 swift_test.swift (469 tokens, 110 lines) +│ ├── class Person +│ ├── init(name: String) +│ ├── func greet() +│ ├── func yEdgeCase( +│ │ fname: String, +│ │ lname: String, +│ │ age: Int, +│ │ address: String, +│ │ phoneNumber: String +│ │ ) +│ ├── func globalGreet() +│ ├── struct Point +│ ├── protocol Animal +│ ├── func speak() +│ ├── struct Dog: Animal +│ ├── class Cat: Animal +│ ├── init(name: String) +│ ├── func speak() +│ ├── enum CarType +│ ├── func getPreferredCarType() -> CarType +│ ├── enum CarType: UInt8 +│ ├── enum class CarType: UInt8 +│ ├── func myFunction(fname: String, age: Int) +│ └── func myFunctionWithMultipleParameters( 
+│ fname: String, +│ lname: String, +│ age: Int, +│ address: String, +│ phoneNumber: String +│ ) +├── 📄 test.lean (289 tokens, 42 lines) +│ ├── # Advanced Topics in Group Theory +│ ├── section GroupDynamics +│ ├── lemma group_stability (G : Type*) [Group G] (H : Subgroup G) +│ ├── theorem subgroup_closure {G : Type*} [Group G] (S : Set G) +│ ├── axiom group_homomorphism_preservation {G H : Type*} [Group G] [Group H] +│ │ (f : G → H) +│ ├── end GroupDynamics +│ ├── section ConstructiveApproach +│ ├── lemma finite_group_order (G : Type*) [Group G] [Fintype G] +│ ├── lemma complex_lemma {X Y : Type*} [SomeClass X] [AnotherClass Y] +│ │ (f : X → Y) (g : Y → X) +│ └── end ConstructiveApproach +├── 📄 test.capnp (117 tokens, 30 lines) +│ ├── struct Employee +│ ├── id @0 :Int32 +│ ├── name @1 :Text +│ ├── role @2 :Text +│ ├── skills @3 :List(Skill) +│ ├── struct Skill +│ ├── name @0 :Text +│ ├── level @1 :Level +│ ├── enum Level +│ ├── beginner @0 +│ ├── intermediate @1 +│ ├── expert @2 +│ ├── status :union +│ ├── active @4 :Void +│ ├── onLeave @5 :Void +│ ├── retired @6 :Void +│ ├── struct Company +│ └── employees @0 :List(Employee) +├── 📄 test.graphql (66 tokens, 21 lines) +│ ├── type Query +│ ├── getBooks: [Book] +│ ├── getAuthors: [Author] +│ ├── type Mutation +│ ├── addBook(title: String, author: String): Book +│ ├── removeBook(id: ID): Book +│ ├── type Book +│ ├── id: ID +│ ├── title: String +│ ├── author: Author +│ ├── type Author +│ ├── id: ID +│ ├── name: String +│ └── books: [Book] +├── 📄 test.proto (142 tokens, 34 lines) +│ ├── syntax = "proto3" +│ ├── service EmployeeService +│ ├── rpc GetEmployee(EmployeeId) returns (EmployeeInfo) +│ ├── rpc AddEmployee(EmployeeData) returns (EmployeeInfo) +│ ├── rpc UpdateEmployee(EmployeeUpdate) returns (EmployeeInfo) +│ ├── message EmployeeId +│ ├── int32 id = 1 +│ ├── message EmployeeInfo +│ ├── int32 id = 1 +│ ├── string name = 2 +│ ├── string role = 3 +│ ├── message EmployeeData +│ ├── string name = 1 +│ ├── string role 
= 2 +│ ├── message EmployeeUpdate +│ ├── int32 id = 1 +│ ├── string name = 2 +│ └── string role = 3 +├── 📄 test.sqlite (0 tokens, 0 lines) +│ ├── students table: +│ ├── id integer primary key +│ ├── name text not null +│ ├── age integer not null +│ ├── courses table: +│ ├── id integer primary key +│ ├── title text not null +│ └── credits integer not null +├── 📄 test_Cargo.toml (119 tokens, 18 lines) +│ ├── name: test_cargo +│ ├── version: 0.1.0 +│ ├── description: A test Cargo.toml +│ ├── license: MIT OR Apache-2.0 +│ ├── dependencies: +│ ├── clap 4.4 +│ └── sqlx 0.7 (features: runtime-tokio, tls-rustls) +├── 📄 test_json_rpc_2_0.json (26 tokens, 6 lines) +│ ├── jsonrpc: 2.0 +│ ├── method: subtract +│ ├── params: +│ ├── minuend: 42 +│ ├── subtrahend: 23 +│ └── id: 1 +├── 📄 test_openapi.yaml (753 tokens, 92 lines) +│ ├── openapi: 3.0.1 +│ ├── title: TODO Plugin +│ ├── description: A plugin to create and manage TODO lists using ChatGPT. +│ ├── version: v1 +│ ├── servers: +│ ├── - url: PLUGIN_HOSTNAME +│ ├── paths: +│ ├── '/todos/{username}': +│ ├── GET (getTodos): Get the list of todos +│ ├── POST (addTodo): Add a todo to the list +│ └── DELETE (deleteTodo): Delete a todo from the list +├── 📄 test_openrpc.json (225 tokens, 44 lines) +│ ├── openrpc: 1.2.1 +│ ├── info: +│ ├── title: Demo Petstore +│ ├── version: 1.0.0 +│ ├── methods: +│ ├── listPets: List all pets +│ ├── params: +│ ├── - limit: integer +│ └── result: pets = An array of pets +└── 📄 test_pyproject.toml (304 tokens, 39 lines) + ├── name: tree_plus + ├── version: 1.0.8 + ├── description: A `tree` util enhanced with tokens, lines, and components. 
+ ├── License :: OSI Approved :: Apache Software License + ├── License :: OSI Approved :: MIT License + ├── dependencies: + ├── tiktoken + ├── PyYAML + ├── click + ├── rich + └── tomli diff --git a/tests/golden/legacy/trees/more_languages_group4.txt b/tests/golden/legacy/trees/more_languages_group4.txt new file mode 100644 index 0000000..3b55b2c --- /dev/null +++ b/tests/golden/legacy/trees/more_languages_group4.txt @@ -0,0 +1,214 @@ +📁 group4 (1 folder, 10 files) +├── 📄 erl_test.erl (480 tokens, 68 lines) +│ ├── -module(erl_test). +│ ├── -record(person). +│ ├── -type ra_peer_status(). +│ ├── -type ra_membership(). +│ ├── -opaque my_opaq_type(). +│ ├── -type orddict(Key, Val). +│ ├── -type edge( +│ │ Cases, +│ │ Pwn, +│ │ ). +│ ├── -spec guarded(X) -> X when X :: tuple(). +│ ├── -spec edge_case( +│ │ {integer(), any()} | [any()] +│ │ ) -> processed, integer(), any()} | [{item, any()}]. +│ ├── -spec complex_function({integer(), any()} | [any()]) -> +│ │ {processed, integer(), any()} | [{item, any()}]. +│ ├── -spec list_manipulation([integer()]) -> [integer()]. +│ ├── -spec overload(T1, T2) -> T3 +│ │ ; (T4, T5) -> T6. +│ ├── -spec multiguard({X, integer()}) -> X when X :: atom() +│ │ ; ([Y]) -> Y when Y :: number(). +│ ├── -record(multiline). +│ └── -record(maybe_undefined). +├── 📄 haskell_test.hs (414 tokens, 41 lines) +│ ├── data Person +│ ├── greet :: Person -> String +│ └── resolveVariables :: +│ forall m fragments. 
+│ (MonadError QErr m, Traversable fragments) => +│ Options.BackwardsCompatibleNullInNonNullableVariables -> +│ [G.VariableDefinition] -> +│ GH.VariableValues -> +│ [G.Directive G.Name] -> +│ G.SelectionSet fragments G.Name -> +│ m +│ ( [G.Directive Variable], +│ G.SelectionSet fragments Variable +│ ) +├── 📄 mathematica_test.nb (133 tokens, 21 lines) +│ ├── person[name_] +│ ├── sayHello[] +│ └── sumList[list_List] +├── 📄 matlab_test.m (48 tokens, 12 lines) +│ ├── classdef HelloWorld -> function greet +│ └── function loneFun +├── 📄 RTest.R (367 tokens, 46 lines) +│ ├── class(person) +│ ├── greet.Person <- function +│ ├── ensure_between = function +│ └── run_intermediate_annealing_process = function +├── 📄 rust_test.rs (1,368 tokens, 259 lines) +│ ├── fn at_beginning<'a>(&'a str) +│ ├── pub enum Days { +│ │ #\[default] +│ │ Sun, +│ │ Mon, +│ │ #\[error("edge case {idx}, expected at least {} and at most {}", +│ │ .limits.lo, .limits.hi)] +│ │ Tue, +│ │ Wed, +│ │ Thu(i16, bool), +│ │ Fri { day: u8 }, +│ │ Sat { +│ │ urday: String, +│ │ edge_case: E, +│ │ }, +│ │ } +│ ├── struct Point +│ ├── impl Point +│ ├── fn get_origin() -> Point +│ ├── struct Person +│ ├── impl Person +│ ├── fn greet(&self) +│ ├── fn add_two_longs(x1: i64, x2: i64) -> i64 +│ ├── fn add_two_longs_longer( +│ │ x1: i64, +│ │ x2: i64, +│ │ ) -> i64 +│ ├── const fn multiply_by_two(num: f64) -> f64 +│ ├── fn get_first_character(s: &str) -> Option +│ ├── trait Drawable +│ ├── fn draw(&self) +│ ├── impl Drawable for Point +│ ├── fn draw(&self) +│ ├── fn with_generic(d: D) +│ ├── fn with_generic(d: D) +│ │ where +│ │ D: Drawable +│ ├── fn main() +│ ├── pub struct VisibleStruct +│ ├── mod my_module +│ ├── pub struct AlsoVisibleStruct(T, T) +│ ├── macro_rules! say_hello +│ ├── #[macro_export] +│ │ macro_rules! 
hello_tree_plus +│ ├── pub mod lib +│ ├── pub mod interfaces +│ ├── mod engine +│ ├── pub fn flow( +│ │ source: S1, +│ │ extractor: E, +│ │ inbox: S2, +│ │ transformer: T, +│ │ outbox: S3, +│ │ loader: L, +│ │ sink: &mut S4, +│ │ ) -> Result<(), Box> +│ │ where +│ │ S1: Extractable, +│ │ S2: Extractable + Loadable, +│ │ S3: Extractable + Loadable, +│ │ S4: Loadable, +│ │ E: Extractor, +│ │ T: Transformer, +│ │ L: Loader +│ ├── trait Container +│ ├── fn items(&self) -> impl Iterator +│ ├── trait HttpService +│ ├── async fn fetch(&self, url: Url) -> HtmlBody +│ ├── struct Pair +│ ├── trait Transformer +│ ├── fn transform(&self, input: T) -> T +│ ├── impl + Copy> Transformer for Pair +│ ├── fn transform(&self, input: T) -> T +│ ├── fn main() +│ ├── async fn handle_get(State(pool): State) -> Result, +│ │ (StatusCode, String)> +│ │ where +│ │ Bion: Cool +│ ├── #[macro_export] +│ │ macro_rules! unit +│ ├── fn insert( +│ │ &mut self, +│ │ key: (), +│ │ value: $unit_dtype, +│ │ ) -> Result, ETLError> +│ ├── pub async fn handle_get_axum_route( +│ │ Session { maybe_claims }: Session, +│ │ Path(RouteParams { +│ │ alpha, +│ │ bravo, +│ │ charlie, +│ │ edge_case +│ │ }): Path, +│ │ ) -> ServerResult +│ ├── fn encode_pipeline(cmds: &[Cmd], atomic: bool) -> Vec +│ ├── pub async fn handle_post_yeet( +│ │ State(auth_backend): State, +│ │ Session { maybe_claims }: Session, +│ │ Form(yeet_form): Form, +│ │ ) -> Result +│ └── pub async fn handle_get_thingy( +│ session: Session, +│ State(ApiBackend { +│ page_cache, +│ auth_backend, +│ library_sql, +│ some_data_cache, +│ metadata_cache, +│ thingy_client, +│ .. 
+│ }): State, +│ ) -> ServerResult +├── 📄 test.zig (397 tokens, 60 lines) +│ ├── pub fn add(a: i32, b: i32) i32 +│ ├── test "add function" +│ ├── const BunBuildOptions = struct +│ ├── pub fn updateRuntime(this: *BunBuildOptions) anyerror!void +│ ├── pub fn step(this: BunBuildOptions, b: anytype) +│ │ *std.build.OptionsStep +│ └── pub fn sgemv( +│ order: Order, +│ trans: Trans, +│ m: usize, +│ n: usize, +│ alpha: f32, +│ a: []const f32, +│ lda: usize, +│ x: []const f32, +│ x_add: usize, +│ beta: f32, +│ y: []f32, +│ y_add: usize, +│ ) void +├── 📄 test_fsharp.fs (92 tokens, 27 lines) +│ ├── module TestFSharp +│ ├── type Person = { +│ ├── let add x y = +│ ├── let multiply +│ │ (x: int) +│ │ (y: int): int = +│ ├── let complexFunction +│ │ (a: int) +│ │ (b: string) +│ │ (c: float) +│ │ : (int * string) option = +│ └── type Result<'T> = +├── 📄 test_tcl_tk.tcl (54 tokens, 16 lines) +│ ├── proc sayHello {} +│ ├── proc arrg { input } +│ └── proc multiLine { +│ x, +│ y +│ } +└── 📄 tf_test.tf (202 tokens, 38 lines) + ├── provider "aws" + ├── resource "aws_instance" "example" + ├── data "aws_ami" "ubuntu" + ├── variable "instance_type" + ├── output "instance_public_ip" + ├── locals + └── module "vpc" diff --git a/tests/golden/legacy/trees/more_languages_group5.txt b/tests/golden/legacy/trees/more_languages_group5.txt new file mode 100644 index 0000000..9acc6f2 --- /dev/null +++ b/tests/golden/legacy/trees/more_languages_group5.txt @@ -0,0 +1,257 @@ +📁 group5 (1 folder, 19 files) +├── 📄 ansible_test.yml (55 tokens, 14 lines) +│ ├── Install package +│ ├── Start service +│ └── Create user +├── 📄 app-routing.module.ts (287 tokens, 28 lines) +│ ├── const routes: Routes = [ +│ │ { path: '', redirectTo: 'login', pathMatch: 'full' }, +│ │ { path: '*', redirectTo: 'login' }, +│ │ { path: 'home', component: HomeComponent }, +│ │ { path: 'login', component: LoginComponent }, +│ │ { path: 'register', component: RegisterComponent }, +│ │ { path: 'events', component: EventsComponent }, +│ │ 
{ path: 'invites', component: InvitesComponent }, +│ │ { path: 'rewards', component: RewardsComponent }, +│ │ { path: 'profile', component: ProfileComponent }, +│ │ ]; +│ └── export class AppRoutingModule +├── 📄 app.component.spec.ts (410 tokens, 47 lines) +│ ├── describe 'AppComponent' +│ ├── it should create the app +│ ├── it should welcome the user +│ ├── it should welcome 'Jimbo' +│ └── it should request login if not logged in +├── 📄 app.component.ts (271 tokens, 45 lines) +│ ├── export class AppComponent +│ ├── constructor( +│ │ private http: HttpClient, +│ │ private loginService: LoginService, +│ │ private stripeService: StripeService +│ │ ) +│ ├── constructor(private loginService: LoginService) +│ ├── checkSession() +│ ├── async goToEvent(event_id: string) +│ └── valInvitedBy(event: any, event_id: string) +├── 📄 app.module.ts (374 tokens, 43 lines) +│ ├── @NgModule({ +│ │ declarations: [ +│ │ AppComponent, +│ │ HomeComponent, +│ │ LoginComponent, +│ │ RegisterComponent, +│ │ EventsComponent, +│ │ InvitesComponent, +│ │ RewardsComponent, +│ │ ProfileComponent +│ └── export class AppModule +├── 📄 checkbox_test.md (191 tokens, 29 lines) +│ ├── # My Checkbox Test +│ ├── ## My No Parens Test +│ ├── ## My Empty href Test +│ ├── ## My other url Test [Q&A] +│ ├── ## My other other url Test [Q&A] +│ ├── ## My 2nd other url Test [Q&A] +│ ├── ## My 3rd other url Test [Q&A] +│ ├── - [ ] Task 1 +│ ├── - [ ] No Space Task 1.1 +│ ├── - [ ] Two Spaces Task 1.2 +│ ├── - [ ] Subtask 1.2.1 +│ ├── - [ ] Task 2 +│ ├── - [x] Task 3 +│ ├── - [ ] Subtask 3.1 +│ ├── - [x] Task 6 +│ ├── - [x] Subtask 6.1 +│ ├── - [ ] Handle edge cases +│ └── # My Codeblock Test +├── 📄 checkbox_test.txt (257 tokens, 33 lines) +│ ├── - [ ] fix phone number format +1 +│ ├── - [ ] add forgot password +│ ├── - [ ] ? 
add email verification +│ ├── - [ ] store token the right way +│ ├── - [ ] test nesting of checkboxes +│ ├── - [ ] user can use option to buy ticket at 2-referred price +│ ├── - [ ] CTA refer 2 people to get instant lower price +│ └── - [ ] form to send referrals +├── 📄 environment.test.ts (197 tokens, 19 lines) +│ ├── environment: +│ ├── production +│ ├── cognitoUserPoolId +│ ├── cognitoAppClientId +│ └── apiurl +├── 📄 hello_world.pyi (22 tokens, 3 lines) +│ ├── @final +│ │ class dtype(Generic[_DTypeScalar_co]) +│ └── names: None | tuple[builtins.str, ...] +├── 📄 k8s_test.yaml (140 tokens, 37 lines) +│ ├── apps/v1.Deployment -> my-app +│ ├── v1.Service -> my-service +│ └── v1.ConfigMap -> my-config +├── 📄 Makefile (714 tokens, 84 lines) +│ ├── include dotenv/dev.env +│ ├── .PHONY: dev +│ ├── dev +│ ├── services-down +│ ├── services-stop: services-down +│ ├── define CHECK_POSTGRES +│ ├── damage-report +│ ├── tail-logs +│ └── cloud +├── 📄 requirements_test.txt (29 tokens, 10 lines) +│ ├── psycopg2-binary +│ ├── pytest +│ ├── coverage +│ ├── flask[async] +│ ├── flask_cors +│ ├── stripe +│ ├── pyjwt[crypto] +│ ├── cognitojwt[async] +│ └── flask-lambda +├── 📄 rust_todo_test.rs (92 tokens, 26 lines) +│ ├── TODO: This todo tests parse_todo +│ ├── enum Color { +│ │ Red, +│ │ Blue, +│ │ Green, +│ │ } +│ ├── struct Point +│ ├── trait Drawable +│ ├── fn draw(&self) +│ ├── impl Drawable for Point +│ ├── fn draw(&self) +│ └── fn main() +├── 📄 sql_test.sql (270 tokens, 51 lines) +│ ├── CREATE TABLE promoters +│ ├── user_id serial PRIMARY KEY, +│ ├── type varchar(20) NOT NULL, +│ ├── username varchar(20) NOT NULL, +│ ├── password varchar(20) NOT NULL, +│ ├── email varchar(30) NOT NULL, +│ ├── phone varchar(20) NOT NULL, +│ ├── promocode varchar(20), +│ ├── info json, +│ ├── going text[], +│ ├── invites text[], +│ ├── balance integer NOT NULL, +│ ├── rewards text[], +│ ├── created timestamp +│ ├── CREATE TABLE events +│ ├── event_id serial PRIMARY KEY, +│ ├── name varchar(64) NOT 
NULL, +│ ├── date varchar(64) NOT NULL, +│ ├── location varchar(64) NOT NULL, +│ ├── performer varchar(64) NOT NULL, +│ ├── rewards json, +│ └── created timestamp +├── 📄 standard-app-routing.module.ts (100 tokens, 16 lines) +│ └── const routes: Routes = [ +│ { path: '', component: HomeComponent }, +│ { +│ path: 'heroes', +│ component: HeroesListComponent, +│ children: [ +│ { path: ':id', component: HeroDetailComponent }, +│ { path: 'new', component: HeroFormComponent }, +│ ], +│ }, +│ { path: '**', component: PageNotFoundComponent }, +│ ]; +├── 📄 test.env (190 tokens, 25 lines) +│ ├── PROMO_PATH +│ ├── PRODUCTION +│ ├── SQL_SCHEMA_PATH +│ ├── DB_LOGS +│ ├── DB_LOG +│ ├── PGPASSWORD +│ ├── PGDATABASE +│ ├── PGHOST +│ ├── PGPORT +│ ├── PGUSER +│ ├── SERVER_LOG +│ ├── SERVER_LOGS +│ ├── API_URL +│ ├── APP_LOGS +│ ├── APP_LOG +│ ├── APP_URL +│ ├── COGNITO_USER_POOL_ID +│ ├── COGNITO_APP_CLIENT_ID +│ ├── AWS_REGION +│ └── STRIPE_SECRET_KEY +├── 📄 testJsonSchema.json (421 tokens, 48 lines) +│ ├── $schema: http://json-schema.org/draft-07/schema# +│ ├── type: object +│ ├── title: random_test +│ └── description: A promoter's activites related to events +├── 📄 testPackage.json (349 tokens, 43 lines) +│ ├── name: 'promo-app' +│ ├── version: 0.0.0 +│ ├── scripts: +│ ├── ng: 'ng' +│ ├── start: 'ng serve' +│ ├── build: 'ng build' +│ ├── watch: 'ng build --watch --configuration development' +│ └── test: 'ng test' +└── 📄 tickets.component.ts (7,160 tokens, 903 lines) + ├── interface EnrichedTicket extends Ticket + ├── interface SpinConfig + ├── interface RotationState + ├── interface SpeakInput + ├── const formatSpeakInput = (input: SpeakInput): string => + ├── function hourToSpeech(hour: number, minute: number, period: string): + │ string + ├── export class TicketsComponent implements AfterViewInit + ├── speak(input: SpeakInput) + ├── speakEvent(ticket: EnrichedTicket): void + ├── formatEvent(ticket: EnrichedTicket): string + ├── speakVenue(ticket: EnrichedTicket): void + ├── 
formatDate(date: Date, oneLiner: boolean = false): string + ├── formatDateForSpeech(date: Date): string + ├── async spinQRCode( + │ event: PointerEvent, + │ config: SpinConfig = DEFAULT_SPIN_CONFIG + │ ) + ├── private animateRotation( + │ imgElement: HTMLElement, + │ targetRotation: number, + │ config: SpinConfig, + │ cleanup: () => void + │ ) + ├── const animate = (currentTime: number) => + ├── requestAnimationFrame(animate) + ├── cleanup() + ├── requestAnimationFrame(animate) + ├── private getNext90Degree(currentRotation: number): number + ├── private getCurrentRotation(matrix: string): number + ├── ngAfterViewInit() + ├── const mouseEnterListener = () => + ├── const mouseLeaveListener = () => + ├── ngOnDestroy() + ├── toggleColumn(event: MatOptionSelectionChange, column: string) + ├── adjustColumns(event?: Event) + ├── onResize(event: Event) + ├── async ngOnInit() + ├── async loadTickets(): Promise + ├── onDateRangeChange( + │ type: "start" | "end", + │ event: MatDatepickerInputEvent + │ ) + ├── applyFilter(column: string): void + ├── formatDateForComparison(date: Date): string + ├── constructor(private renderer: Renderer2) + ├── onFilterChange(event: Event, column: string) + ├── onLatitudeChange(event: Event) + ├── onLongitudeChange(event: Event) + ├── onRadiusChange(event: Event) + ├── sortData(sort: Sort): void + ├── onRowClick(event: Event, row: any) + ├── function isDate(value: Date | undefined | null): value is Date + ├── function isNonNullNumber(value: number | null): value is number + ├── function hasLocation( + │ ticket: any + │ ): ticket is + ├── const create_faker_ticket = async () => + ├── function compare(a: number | string, b: number | string, isAsc: boolean) + ├── function compare_dates(a: Date, b: Date, isAsc: boolean) + ├── async function mockMoreTickets(): Promise + ├── const mockTickets = async () => + └── const renderQRCode = async (text: String): Promise => diff --git a/tests/golden/legacy/trees/more_languages_group6.txt 
b/tests/golden/legacy/trees/more_languages_group6.txt new file mode 100644 index 0000000..63e4015 --- /dev/null +++ b/tests/golden/legacy/trees/more_languages_group6.txt @@ -0,0 +1,615 @@ +📁 group6 (1 folder, 14 files) +├── 📄 catastrophic.c (5,339 tokens, 754 lines) +│ ├── TODO: technically we should use a proper parser +│ ├── struct Point +│ ├── int x; +│ ├── int y; +│ ├── struct Point getOrigin() +│ ├── float mul_two_floats(float x1, float x2) +│ ├── enum days +│ ├── SUN, +│ ├── MON, +│ ├── TUE, +│ ├── WED, +│ ├── THU, +│ ├── FRI, +│ ├── SAT +│ ├── enum worker_pool_flags +│ ├── POOL_BH = 1 << 0, +│ ├── POOL_MANAGER_ACTIVE = 1 << 1, +│ ├── POOL_DISASSOCIATED = 1 << 2, +│ ├── POOL_BH_DRAINING = 1 << 3, +│ ├── enum worker_flags +│ ├── WORKER_DIE = 1 << 1, +│ ├── WORKER_IDLE = 1 << 2, +│ ├── WORKER_PREP = 1 << 3, +│ ├── WORKER_CPU_INTENSIVE = 1 << 6, +│ ├── WORKER_UNBOUND = 1 << 7, +│ ├── WORKER_REBOUND = 1 << 8, +│ ├── WORKER_NOT_RUNNING = WORKER_PREP | WORKER_CPU_INTENSIVE | +│ │ WORKER_UNBOUND | WORKER_REBOUND, +│ ├── struct worker_pool +│ ├── raw_spinlock_t lock; +│ ├── int cpu; +│ ├── int node; +│ ├── int id; +│ ├── unsigned int flags; +│ ├── unsigned long watchdog_ts; +│ ├── bool cpu_stall; +│ ├── int nr_running; +│ ├── struct list_head worklist; +│ ├── int nr_workers; +│ ├── int nr_idle; +│ ├── struct list_head idle_list; +│ ├── struct timer_list idle_timer; +│ ├── struct work_struct idle_cull_work; +│ ├── struct timer_list mayday_timer; +│ ├── struct worker *manager; +│ ├── struct list_head workers; +│ ├── struct ida worker_ida; +│ ├── struct workqueue_attrs *attrs; +│ ├── struct hlist_node hash_node; +│ ├── int refcnt; +│ ├── struct rcu_head rcu; +│ ├── long add_two_longs(long x1, long x2) +│ ├── double multiplyByTwo(double num) +│ ├── char getFirstCharacter(char *str) +│ ├── void greet(Person p) +│ ├── typedef struct +│ ├── char name[50]; +│ ├── } Person; +│ ├── typedef struct PersonA +│ ├── char name[50]; +│ ├── } PersonB; +│ ├── int main() +│ ├── int* 
getArrayStart(int arr[], int size) +│ ├── long complexFunctionWithMultipleArguments( +│ │ int param1, +│ │ double param2, +│ │ char *param3, +│ │ struct Point point +│ │ ) +│ ├── keyPattern *ACLKeyPatternCreate(sds pattern, int flags) +│ ├── sds sdsCatPatternString(sds base, keyPattern *pat) +│ ├── static int ACLCheckChannelAgainstList(list *reference, const char +│ │ *channel, int channellen, int is_pattern) +│ ├── while((ln = listNext(&li))) +│ ├── static struct config +│ ├── aeEventLoop *el; +│ ├── cliConnInfo conn_info; +│ ├── const char *hostsocket; +│ ├── int tls; +│ ├── struct cliSSLconfig sslconfig; +│ ├── } config; +│ ├── class Person +│ ├── std::string name; +│ ├── public: +│ ├── Person(std::string n) : name(n) +│ ├── void greet() +│ ├── void globalGreet() +│ ├── int main() +│ ├── void printMessage(const std::string &message) +│ ├── template +│ │ void printVector(const std::vector& vec) +│ ├── struct foo +│ ├── char x; +│ ├── struct foo_in +│ ├── char* y; +│ ├── short z; +│ ├── } inner; +│ ├── struct Point +│ ├── int x, y; +│ ├── Point(int x, int y) : x(x), y(y) +│ ├── class Animal +│ ├── public: +│ ├── Animal(const std::string &name) : name(name) +│ ├── virtual void speak() const +│ ├── virtual ~Animal() +│ ├── protected: +│ ├── std::string name; +│ ├── class Dog : public Animal +│ ├── public: +│ ├── Dog(const std::string &name) : Animal(name) +│ ├── void speak() const override +│ ├── class Cat : public Animal +│ ├── public: +│ ├── Cat(const std::string &name) : Animal(name) +│ ├── void speak() const override +│ ├── class CatDog: public Animal, public Cat, public Dog +│ ├── public: +│ ├── CatDog(const std::string &name) : Animal(name) +│ ├── int meow_bark() +│ ├── nb::bytes BuildRnnDescriptor(int input_size, int hidden_size, int +│ │ num_layers, +│ │ int batch_size, int max_seq_length, float +│ │ dropout, +│ │ bool bidirectional, bool cudnn_allow_tf32, +│ │ int workspace_size, int reserve_space_size) +│ ├── int main() +│ ├── enum ECarTypes +│ ├── Sedan, 
+│ ├── Hatchback, +│ ├── SUV, +│ ├── Wagon +│ ├── ECarTypes GetPreferredCarType() +│ ├── enum ECarTypes : uint8_t +│ ├── Sedan, +│ ├── Hatchback, +│ ├── SUV = 254, +│ ├── Hybrid +│ ├── enum class ECarTypes : uint8_t +│ ├── Sedan, +│ ├── Hatchback, +│ ├── SUV = 254, +│ ├── Hybrid +│ ├── void myFunction(string fname, int age) +│ ├── template T cos(T) +│ ├── template T sin(T) +│ ├── template T sqrt(T) +│ ├── template struct VLEN +│ ├── template class arr +│ ├── private: +│ ├── static T *ralloc(size_t num) +│ ├── static void dealloc(T *ptr) +│ ├── static T *ralloc(size_t num) +│ ├── static void dealloc(T *ptr) +│ ├── public: +│ ├── arr() : p(0), sz(0) +│ ├── arr(size_t n) : p(ralloc(n)), sz(n) +│ ├── arr(arr &&other) +│ │ : p(other.p), sz(other.sz) +│ ├── ~arr() +│ ├── void resize(size_t n) +│ ├── T &operator[](size_t idx) +│ ├── T *data() +│ ├── size_t size() const +│ ├── class Buffer +│ ├── private: +│ ├── void* ptr_; +│ ├── std::tuple quantize( +│ │ const array& w, +│ │ int group_size, +│ │ int bits, +│ │ StreamOrDevice s) +│ ├── #define PY_SSIZE_T_CLEAN +│ ├── #define PLATFORM_IS_X86 +│ ├── #define PLATFORM_WINDOWS +│ ├── #define GETCPUID(a, b, c, d, a_inp, c_inp) +│ ├── static int GetXCR0EAX() +│ ├── #define GETCPUID(a, b, c, d, a_inp, c_inp) +│ ├── static int GetXCR0EAX() +│ ├── asm("XGETBV" : "=a"(eax), "=d"(edx) : "c"(0)) +│ ├── static void ReportMissingCpuFeature(const char* name) +│ ├── static PyObject *CheckCpuFeatures(PyObject *self, PyObject *args) +│ ├── static PyObject *CheckCpuFeatures(PyObject *self, PyObject *args) +│ ├── static PyMethodDef cpu_feature_guard_methods[] +│ ├── static struct PyModuleDef cpu_feature_guard_module +│ ├── #define EXPORT_SYMBOL __declspec(dllexport) +│ ├── #define EXPORT_SYMBOL __attribute__ ((visibility("default"))) +│ ├── EXPORT_SYMBOL PyMODINIT_FUNC PyInit_cpu_feature_guard(void) +│ ├── typedef struct +│ ├── GPT2Config config; +│ ├── ParameterTensors params; +│ ├── size_t param_sizes[NUM_PARAMETER_TENSORS]; +│ ├── float* 
params_memory; +│ ├── size_t num_parameters; +│ ├── ParameterTensors grads; +│ ├── float* grads_memory; +│ ├── float* m_memory; +│ ├── float* v_memory; +│ ├── ActivationTensors acts; +│ ├── size_t act_sizes[NUM_ACTIVATION_TENSORS]; +│ ├── float* acts_memory; +│ ├── size_t num_activations; +│ ├── ActivationTensors grads_acts; +│ ├── float* grads_acts_memory; +│ ├── int batch_size; +│ ├── int seq_len; +│ ├── int* inputs; +│ ├── int* targets; +│ ├── float mean_loss; +│ └── } GPT2; +├── 📄 cpp_examples_impl.cc (60 tokens, 10 lines) +│ ├── PYBIND11_MODULE(cpp_examples, m) +│ └── m.def("add", &add, "An example function to add two numbers.") +├── 📄 cpp_examples_impl.cu (37 tokens, 10 lines) +│ ├── template +│ │ T add(T a, T b) +│ └── template <> +│ int add(int a, int b) +├── 📄 cpp_examples_impl.h (22 tokens, 6 lines) +│ ├── template +│ │ T add(T a, T b) +│ └── template <> +│ int add(int, int) +├── 📄 edge_case.hpp (426 tokens, 28 lines) +├── 📄 fractal.thy (1,712 tokens, 147 lines) +│ ├── Title: fractal.thy +│ ├── Author: Isabelle/HOL Contributors! +│ ├── Author: edge cases r us +│ ├── theory Simplified_Ring +│ ├── section ‹Basic Algebraic Structures› +│ ├── class everything = nothing + itself +│ ├── subsection ‹Monoids› +│ ├── definition ring_hom :: "[('a, 'm) ring_scheme, ('b, 'n) ring_scheme] => +│ │ ('a => 'b) set" +│ ├── fun example_fun :: "nat ⇒ nat" +│ ├── locale monoid = +│ │ fixes G (structure) +│ │ assumes m_closed: "⟦x ∈ carrier G; y ∈ carrier G⟧ ⟹ x ⊗ y ∈ carrier +│ │ G" +│ │ and m_assoc: "⟦x ∈ carrier G; y ∈ carrier G; z ∈ carrier G⟧ ⟹ (x ⊗ +│ │ y) ⊗ z = x ⊗ (y ⊗ z)" +│ │ and one_closed: "𝟭 ∈ carrier G" +│ │ and l_one: "x ∈ carrier G ⟹ 𝟭 ⊗ x = x" +│ │ and r_one: "x ∈ carrier G ⟹ x ⊗ 𝟭 = x" +│ ├── subsection ‹Groups› +│ ├── locale group = monoid + +│ │ assumes Units_closed: "x ∈ Units G ⟹ x ∈ carrier G" +│ │ and l_inv_ex: "x ∈ carrier G ⟹ ∃ y ∈ carrier G. y ⊗ x = 𝟭" +│ │ and r_inv_ex: "x ∈ carrier G ⟹ ∃ y ∈ carrier G. 
x ⊗ y = 𝟭" +│ ├── subsection ‹Rings› +│ ├── locale ring = abelian_group R + monoid R + +│ │ assumes l_distr: "⟦x ∈ carrier R; y ∈ carrier R; z ∈ carrier R⟧ ⟹ (x +│ │ ⊕ y) ⊗ z = x ⊗ z ⊕ y ⊗ z" +│ │ and r_distr: "⟦x ∈ carrier R; y ∈ carrier R; z ∈ carrier R⟧ ⟹ z ⊗ +│ │ (x ⊕ y) = z ⊗ x ⊕ z ⊗ y" +│ ├── locale commutative_ring = ring + +│ │ assumes m_commutative: "⟦x ∈ carrier R; y ∈ carrier R⟧ ⟹ x ⊗ y = y ⊗ +│ │ x" +│ ├── locale domain = commutative_ring + +│ │ assumes no_zero_divisors: "⟦a ⊗ b = 𝟬; a ∈ carrier R; b ∈ carrier R⟧ ⟹ +│ │ a = 𝟬 ∨ b = 𝟬" +│ ├── locale field = domain + +│ │ assumes inv_ex: "x ∈ carrier R - {𝟬} ⟹ inv x ∈ carrier R" +│ ├── subsection ‹Morphisms› +│ ├── lemma example_lemma: "example_fun n = n" +│ ├── qualified lemma gcd_0: +│ │ "gcd a 0 = normalize a" +│ ├── lemma abelian_monoidI: +│ │ fixes R (structure) +│ │ and f :: "'edge::{} ⇒ 'case::{}" +│ │ assumes "⋀x y. ⟦ x ∈ carrier R; y ∈ carrier R ⟧ ⟹ x ⊕ y ∈ carrier R" +│ │ and "𝟬 ∈ carrier R" +│ │ and "⋀x y z. ⟦ x ∈ carrier R; y ∈ carrier R; z ∈ carrier R ⟧ ⟹ (x +│ │ ⊕ y) ⊕ z = x ⊕ (y ⊕ z)" +│ │ shows "abelian_monoid R" +│ ├── lemma euclidean_size_gcd_le1 [simp]: +│ │ assumes "a ≠ 0" +│ │ shows "euclidean_size (gcd a b) ≤ euclidean_size a" +│ ├── theorem Residue_theorem: +│ │ fixes S pts::"complex set" and f::"complex ⇒ complex" +│ │ and g::"real ⇒ complex" +│ │ assumes "open S" "connected S" "finite pts" and +│ │ holo:"f holomorphic_on S-pts" and +│ │ "valid_path g" and +│ │ loop:"pathfinish g = pathstart g" and +│ │ "path_image g ⊆ S-pts" and +│ │ homo:"∀z. (z ∉ S) ⟶ winding_number g z = 0" +│ │ shows "contour_integral g f = 2 * pi * 𝗂 *(∑p ∈ pts. winding_number g +│ │ p * residue f p)" +│ ├── corollary fps_coeff_residues_bigo': +│ │ fixes f :: "complex ⇒ complex" and r :: real +│ │ assumes exp: "f has_fps_expansion F" +│ │ assumes "open A" "connected A" "cball 0 r ⊆ A" "r > 0" +│ │ assumes "f holomorphic_on A - S" "S ⊆ ball 0 r" "finite S" "0 ∉ S" +│ │ assumes "eventually (λn. g n = -(∑z ∈ S. 
residue (λz. f z / z ^ Suc n) +│ │ z)) sequentially" +│ │ (is "eventually (λn. _ = -?g' n) _") +│ │ shows "(λn. fps_nth F n - g n) ∈ O(λn. 1 / r ^ n)" (is "(λn. ?c n - +│ │ _) ∈ O(_)") +│ └── end +├── 📄 Microsoft.PowerShell_profile.ps1 (3,346 tokens, 497 lines) +│ ├── function Log($message) +│ ├── function Remove-ChocolateyFromPath +│ ├── function Show-Profiles +│ ├── function Show-Path +│ ├── function Show-Error($err) +│ ├── function Get-ScoopPackagePath +│ ├── param( +│ │ [Parameter(Mandatory = $true)] +│ │ [string]$PackageName) +│ ├── function Check-Command +│ ├── param( +│ │ [Parameter(Mandatory = $true)] +│ │ [string]$Name) +│ ├── function Add-ToPath +│ ├── param( +│ │ [Parameter(Mandatory = $true)] +│ │ [string]$PathToAdd) +│ ├── function Install-Scoop +│ ├── function Scoop-Install +│ ├── param( +│ │ [Parameter(Mandatory = $true)] +│ │ [string]$Name, +│ │ [string]$PathToAdd) +│ ├── function Start-CondaEnv +│ ├── function Install-PipPackage +│ ├── param( +│ │ [Parameter(Mandatory = $true)] +│ │ [string]$PackageName) +│ ├── function Install-VSBuildTools +│ ├── function Install-Crate +│ ├── param( +│ │ [Parameter(Mandatory = $true)] +│ │ [string]$CrateName) +│ ├── function Get-ScoopVersion +│ ├── function Get-Version +│ ├── param( +│ │ [Parameter(Mandatory = $true)] +│ │ [string]$ExecutablePath, +│ │ [string]$ExecutableName) +│ ├── function Show-Requirements +│ ├── function Measure-Status +│ ├── param( +│ │ [Parameter(Mandatory = $true)] +│ │ [string]$Name) +│ ├── function Find-Profile +│ ├── function Edit-Profile +│ ├── function Set-Profile +│ └── function Show-Profile +├── 📄 python_complex_class.py (10 tokens, 2 lines) +│ └── class Box(Space[NDArray[Any]]) +├── 📄 ramda__cloneRegExp.js (173 tokens, 9 lines) +│ └── export default function _cloneRegExp(pattern) +├── 📄 ramda_prop.js (646 tokens, 85 lines) +│ ├── /** +│ │ * Returns a function that when supplied an object returns the indicated +│ │ * property of that object, if it exists. 
+│ │ * @category Object +│ │ * @typedefn Idx = String | Int | Symbol +│ │ * @sig Idx -> {s: a} -> a | Undefined +│ │ * @param {String|Number} p The property name or array index +│ │ * @param {Object} obj The object to query +│ │ * @return {*} The value at `obj.p`. +│ │ */ +│ │ var prop = _curry2(function prop(p, obj) +│ ├── /** +│ │ * Solves equations of the form a * x = b +│ │ * @param {{ +│ │ * z: number +│ │ * }} x +│ │ */ +│ │ function foo(x) +│ ├── /** +│ │ * Deconstructs an array field from the input documents to output a +│ │ document for each element. +│ │ * Each output document is the input document with the value of the +│ │ array field replaced by the element. +│ │ * @category Object +│ │ * @sig String -> {k: [v]} -> [{k: v}] +│ │ * @param {String} key The key to determine which property of the object +│ │ should be unwound. +│ │ * @param {Object} object The object containing the list to unwind at +│ │ the property named by the key. +│ │ * @return {List} A list of new objects, each having the given key +│ │ associated to an item from the unwound list. +│ │ */ +│ │ var unwind = _curry2(function(key, object) +│ └── return _map(function(item) +├── 📄 tensorflow_flags.h (7,628 tokens, 668 lines) +│ ├── TF_DECLARE_FLAG('test_only_experiment_1') +│ ├── TF_DECLARE_FLAG('test_only_experiment_2') +│ ├── TF_DECLARE_FLAG('enable_nested_function_shape_inference'): +│ │ Allow ops such as tf.cond to invoke the ShapeRefiner on their +│ │ nested functions. +│ ├── TF_DECLARE_FLAG('enable_quantized_dtypes_training'): +│ │ Set quantized dtypes, like tf.qint8, to be trainable. +│ ├── TF_DECLARE_FLAG('graph_building_optimization'): +│ │ Optimize graph building for faster tf.function tracing. +│ ├── TF_DECLARE_FLAG('saved_model_fingerprinting'): +│ │ Add fingerprint to SavedModels. +│ ├── TF_DECLARE_FLAG('more_stack_traces'): +│ │ Enable experimental code that preserves and propagates graph +│ │ node stack traces in C++. 
+│ ├── TF_DECLARE_FLAG('publish_function_graphs'): +│ │ Enables the publication of partitioned function graphs via +│ │ StatsPublisherInterface. Disabling this flag can reduce memory +│ │ consumption. +│ ├── TF_DECLARE_FLAG('enable_aggressive_constant_replication'): +│ │ Replicate constants across CPU devices and even for local CPUs +│ │ within the same task if available. +│ ├── TF_DECLARE_FLAG('enable_colocation_key_propagation_in_while_op_lowering' +│ │ ): +│ │ If true, colocation key attributes for the ops will be +│ │ propagated during while op lowering to switch/merge ops. +│ ├── Flag('tf_xla_auto_jit'): +│ │ Control compilation of operators into XLA computations on CPU +│ │ and GPU devices. 0 = use ConfigProto setting; -1 = off; 1 = on for +│ │ things very likely to be improved; 2 = on for everything; (experimental) +│ │ fusible = only for Tensorflow operations that XLA knows how to fuse. If +│ │ set to single-gpu() then this resolves to for single-GPU graphs +│ │ (graphs that have at least one node placed on a GPU and no more than one +│ │ GPU is in use through the entire graph) and 0 otherwise. Experimental. +│ ├── Flag('tf_xla_min_cluster_size'): +│ │ Minimum number of operators in an XLA compilation. Ignored for +│ │ operators placed on an XLA device or operators explicitly marked for +│ │ compilation. +│ ├── Flag('tf_xla_max_cluster_size'): +│ │ Maximum number of operators in an XLA compilation. +│ ├── Flag('tf_xla_cluster_exclude_ops'): +│ │ (experimental) Exclude the operations from auto-clustering. If +│ │ multiple, separate them with commas. Where, Some_other_ops. +│ ├── Flag('tf_xla_clustering_debug'): +│ │ Dump graphs during XLA compilation. +│ ├── Flag('tf_xla_cpu_global_jit'): +│ │ Enables global JIT compilation for CPU via SessionOptions. +│ ├── Flag('tf_xla_clustering_fuel'): +│ │ Places an artificial limit on the number of ops marked as +│ │ eligible for clustering. 
+│ ├── Flag('tf_xla_disable_deadness_safety_checks_for_debugging'): +│ │ Disable deadness related safety checks when clustering (this is +│ │ unsound). +│ ├── Flag('tf_xla_disable_resource_variable_safety_checks_for_debugging'): +│ │ Disable resource variables related safety checks when clustering +│ │ (this is unsound). +│ ├── Flag('tf_xla_deterministic_cluster_names'): +│ │ Causes the function names assigned by auto clustering to be +│ │ deterministic from run to run. +│ ├── Flag('tf_xla_persistent_cache_directory'): +│ │ If non-empty, JIT-compiled executables are saved to and loaded +│ │ from the specified file system directory path. Empty by default. +│ ├── Flag('tf_xla_persistent_cache_device_types'): +│ │ If non-empty, the persistent cache will only be used for the +│ │ specified devices (comma separated). Each device type should be able to +│ │ be converted to. +│ ├── Flag('tf_xla_persistent_cache_read_only'): +│ │ If true, the persistent cache will be read-only. +│ ├── Flag('tf_xla_disable_strict_signature_checks'): +│ │ If true, entires loaded into the XLA compile cache will not have +│ │ their signatures checked strictly. Defaults to false. +│ ├── Flag('tf_xla_persistent_cache_prefix'): +│ │ Specifies the persistance cache prefix. Default is. +│ ├── Flag('tf_xla_sparse_core_disable_table_stacking'): +│ │ Disable table stacking for all the tables passed to the +│ │ SparseCore mid level API. +│ ├── Flag('tf_xla_sparse_core_minibatch_max_division_level'): +│ │ Max level of division to split input data into minibatches. +│ ├── Flag('tf_xla_sparse_core_stacking_mem_limit_bytes'): +│ │ If non-zero, limits the size of the activations for a given +│ │ table to be below these many bytes. +│ ├── Flag('tf_xla_sparse_core_stacking_table_shard_limit_bytes'): +│ │ If non-zero, limits the size of any table shard to be below +│ │ these many bytes. 
+│ ├── Flag('always_specialize') +│ ├── Flag('cost_driven_async_parallel_for') +│ ├── Flag('enable_crash_reproducer') +│ ├── Flag('log_query_of_death') +│ ├── Flag('vectorize') +│ ├── Flag('tf_xla_enable_lazy_compilation') +│ ├── Flag('tf_xla_print_cluster_outputs'): +│ │ If true then insert Print nodes to print out values produced by +│ │ XLA clusters. +│ ├── Flag('tf_xla_check_cluster_input_numerics'): +│ │ If true then insert CheckNumerics nodes to check all cluster +│ │ inputs. +│ ├── Flag('tf_xla_check_cluster_output_numerics'): +│ │ If true then insert CheckNumerics nodes to check all cluster +│ │ outputs. +│ ├── Flag('tf_xla_disable_constant_folding'): +│ │ If true then disables constant folding on TF graph before XLA +│ │ compilation. +│ ├── Flag('tf_xla_disable_full_embedding_pipelining'): +│ │ If true then disables full embedding pipelining and instead use +│ │ strict SparseCore / TensorCore sequencing. +│ ├── Flag('tf_xla_embedding_parallel_iterations'): +│ │ If >0 then use this many parallel iterations in +│ │ embedding_pipelining and embedding_sequency. By default, use the +│ │ parallel_iterations on the original model WhileOp. +│ ├── Flag('tf_xla_compile_on_demand'): +│ │ Switch a device into 'on-demand' mode, where instead of +│ │ autoclustering ops are compiled one by one just-in-time. +│ ├── Flag('tf_xla_enable_xla_devices'): +│ │ Generate XLA_* devices, where placing a computation on such a +│ │ device forces compilation by XLA. Deprecated. +│ ├── Flag('tf_xla_always_defer_compilation') +│ ├── Flag('tf_xla_async_compilation'): +│ │ When lazy compilation is enabled, asynchronous compilation +│ │ starts the cluster compilation in the background, and the fallback path +│ │ is executed until the compilation has finished. +│ ├── Flag('tf_xla_use_device_api_for_xla_launch'): +│ │ If true, uses Device API (PjRt) for single device compilation +│ │ and execution of functions marked for JIT compilation i.e. +│ │ jit_compile=True. Defaults to false. 
+│ ├── Flag('tf_xla_use_device_api_for_compile_on_demand'): +│ │ If true, uses Device API (PjRt) for compiling and executing ops +│ │ one by one in 'on-demand' mode. Defaults to false. +│ ├── Flag('tf_xla_use_device_api_for_auto_jit'): +│ │ If true, uses Device API (PjRt) for compilation and execution +│ │ when auto-clustering is enabled. Defaults to false. +│ ├── Flag('tf_xla_use_device_api'): +│ │ If true, uses Device API (PjRt) for compilation and execution of +│ │ ops one-by-one in 'on-demand' mode, for functions marked for JIT +│ │ compilation, or when auto-clustering is enabled. Defaults to false. +│ ├── Flag('tf_xla_enable_device_api_for_gpu'): +│ │ If true, uses Device API (PjRt) for TF GPU device. This is a +│ │ helper flag so that individual tests can turn on PjRt for GPU +│ │ specifically. +│ ├── Flag('tf_xla_call_module_disabled_checks'): +│ │ A comma-sepated list of directives specifying the safety checks +│ │ to be skipped when compiling XlaCallModuleOp. See the op documentation +│ │ for the recognized values. +│ ├── Flag('tf_mlir_enable_mlir_bridge'): +│ │ Enables experimental MLIR-Based TensorFlow Compiler Bridge. +│ ├── Flag('tf_mlir_enable_merge_control_flow_pass'): +│ │ Enables MergeControlFlow pass for MLIR-Based TensorFlow Compiler +│ │ Bridge. +│ ├── Flag('tf_mlir_enable_convert_control_to_data_outputs_pass'): +│ │ Enables MLIR-Based TensorFlow Compiler Bridge. +│ ├── Flag('tf_mlir_enable_strict_clusters'): +│ │ Do not allow clusters that have cyclic control dependencies. +│ ├── Flag('tf_mlir_enable_multiple_local_cpu_devices'): +│ │ Enable multiple local CPU devices. CPU ops which are outside +│ │ compiled inside the tpu cluster will also be replicated across multiple +│ │ cpu devices. +│ ├── Flag('tf_dump_graphs_in_tfg'): +│ │ When tf_dump_graphs_in_tfg is true, graphs after transformations +│ │ are dumped in MLIR TFG dialect and not in GraphDef. 
+│ ├── Flag('tf_mlir_enable_generic_outside_compilation'): +│ │ Enables OutsideCompilation passes for MLIR-Based TensorFlow +│ │ Generic Compiler Bridge. +│ ├── Flag('tf_mlir_enable_tpu_variable_runtime_reformatting_pass'): +│ │ Enables TPUVariableRuntimeReformatting pass for MLIR-Based +│ │ TensorFlow Compiler Bridge. This enables weight update sharding and +│ │ creates TPUReshardVariables ops. +│ ├── TF_PY_DECLARE_FLAG('test_only_experiment_1') +│ ├── TF_PY_DECLARE_FLAG('test_only_experiment_2') +│ ├── TF_PY_DECLARE_FLAG('enable_nested_function_shape_inference') +│ ├── TF_PY_DECLARE_FLAG('enable_quantized_dtypes_training') +│ ├── TF_PY_DECLARE_FLAG('graph_building_optimization') +│ ├── TF_PY_DECLARE_FLAG('op_building_optimization') +│ ├── TF_PY_DECLARE_FLAG('saved_model_fingerprinting') +│ ├── TF_PY_DECLARE_FLAG('tf_shape_default_int64') +│ ├── TF_PY_DECLARE_FLAG('more_stack_traces') +│ ├── TF_PY_DECLARE_FLAG('publish_function_graphs') +│ ├── TF_PY_DECLARE_FLAG('enable_aggressive_constant_replication') +│ ├── TF_PY_DECLARE_FLAG('enable_colocation_key_propagation_in_while_op_loweri +│ │ ng') +│ ├── #define TENSORFLOW_CORE_CONFIG_FLAG_DEFS_H_ +│ ├── class Flags +│ ├── public: +│ ├── bool SetterForXlaAutoJitFlag(const string& value) +│ ├── bool SetterForXlaCallModuleDisabledChecks(const string& value) +│ ├── void AppendMarkForCompilationPassFlagsInternal(std::vector* +│ │ flag_list) +│ ├── void AllocateAndParseJitRtFlags() +│ ├── void AllocateAndParseFlags() +│ ├── void ResetFlags() +│ ├── bool SetXlaAutoJitFlagFromFlagString(const string& value) +│ ├── BuildXlaOpsPassFlags* GetBuildXlaOpsPassFlags() +│ ├── MarkForCompilationPassFlags* GetMarkForCompilationPassFlags() +│ ├── XlaSparseCoreFlags* GetXlaSparseCoreFlags() +│ ├── XlaDeviceFlags* GetXlaDeviceFlags() +│ ├── XlaOpsCommonFlags* GetXlaOpsCommonFlags() +│ ├── XlaCallModuleFlags* GetXlaCallModuleFlags() +│ ├── MlirCommonFlags* GetMlirCommonFlags() +│ ├── void ResetJitCompilerFlags() +│ ├── const JitRtFlags& 
GetJitRtFlags() +│ ├── ConfigProto::Experimental::MlirBridgeRollout GetMlirBridgeRolloutState( +│ │ std::optional config_proto) +│ ├── void AppendMarkForCompilationPassFlags(std::vector* flag_list) +│ ├── void DisableXlaCompilation() +│ ├── void EnableXlaCompilation() +│ ├── bool FailOnXlaCompilation() +│ ├── #define TF_PY_DECLARE_FLAG(flag_name) +│ └── PYBIND11_MODULE(flags_pybind, m) +├── 📄 test.f (181 tokens, 30 lines) +│ ├── MODULE basic_mod +│ ├── TYPE :: person +│ │ CHARACTER(LEN=50) :: name +│ │ INTEGER :: age +│ │ END TYPE person +│ ├── SUBROUTINE short_hello(happy, path) +│ │ END SUBROUTINE short_hello +│ ├── SUBROUTINE long_hello( +│ │ p, +│ │ message +│ │ ) +│ │ END SUBROUTINE long_hello +│ ├── END MODULE basic_mod +│ └── PROGRAM HelloFortran +│ END PROGRAM HelloFortran +├── 📄 torch.rst (60 tokens, 8 lines) +│ ├── # libtorch (C++-only) +│ └── - Building libtorch using Python +└── 📄 yc.html (9,063 tokens, 169 lines) diff --git a/tests/golden/legacy/trees/more_languages_group7.txt b/tests/golden/legacy/trees/more_languages_group7.txt new file mode 100644 index 0000000..1de0ae4 --- /dev/null +++ b/tests/golden/legacy/trees/more_languages_group7.txt @@ -0,0 +1,126 @@ +📁 group7 (1 folder, 5 files) +├── 📄 absurdly_huge.jsonl (8,347 tokens, 126 lines) +│ ├── SMILES: str +│ ├── Yield: float +│ ├── Temperature: int +│ ├── Pressure: float +│ ├── Solvent: str +│ ├── Success: bool +│ ├── Reaction_Conditions: dict +│ ├── Products: list +│ └── EdgeCasesMissed: None +├── 📄 angular_crud.ts (1,192 tokens, 148 lines) +│ ├── interface DBCommand +│ ├── export class IndexedDbService +│ ├── constructor() +│ ├── async create_connection({ db_name = 'client_db', table_name }: +│ │ DBCommand) +│ ├── upgrade(db) +│ ├── async create_model({ db_name, table_name, model }: DBCommand) +│ ├── verify_matching({ table_name, model }) +│ ├── async read_key({ db_name, table_name, key }: DBCommand) +│ ├── async update_model({ db_name, table_name, model }: DBCommand) +│ ├── verify_matching({ 
table_name, model }) +│ ├── async delete_key({ db_name, table_name, key }: DBCommand) +│ ├── async list_table({ +│ │ db_name, +│ │ table_name, +│ │ where, +│ │ }: DBCommand & { where?: { [key: string]: string | number } }) +│ └── async search_table(criteria: SearchCriteria) +├── 📄 structure.py (400 tokens, 92 lines) +│ ├── @runtime_checkable +│ │ class DataClass(Protocol) +│ ├── __dataclass_fields__: dict +│ ├── class MyInteger(Enum) +│ ├── ONE = 1 +│ ├── TWO = 2 +│ ├── THREE = 42 +│ ├── class MyString(Enum) +│ ├── AAA1 = "aaa" +│ ├── BB_B = """edge +│ │ case""" +│ ├── @dataclass(frozen=True, slots=True, kw_only=True) +│ │ class Tool +│ ├── name: str +│ ├── description: str +│ ├── input_model: DataClass +│ ├── output_model: DataClass +│ ├── def execute(self, *args, **kwargs) +│ ├── @property +│ │ def edge_case(self) -> str +│ ├── def should_still_see_me(self, x: bool = True) -> "Tool" +│ ├── @dataclass +│ │ class MyInput[T] +│ ├── name: str +│ ├── rank: MyInteger +│ ├── serial_n: int +│ ├── @dataclass +│ │ class Thingy +│ ├── is_edge_case: bool +│ ├── @dataclass +│ │ class MyOutput +│ ├── orders: str +│ ├── class MyTools(Enum) +│ ├── TOOL_A = Tool( +│ │ name="complicated", +│ │ description="edge case!", +│ │ input_model=MyInput[Thingy], +│ │ output_model=MyOutput, +│ │ ) +│ ├── TOOL_B = Tool( +│ │ name="""super +│ │ complicated +│ │ """, +│ │ description="edge case!", +│ │ input_model=MyInput, +│ │ output_model=MyOutput, +│ │ ) +│ ├── @final +│ │ class dtype(Generic[_DTypeScalar_co]) +│ └── names: None | tuple[builtins.str, ...] 
+├── 📄 test.wgsl (528 tokens, 87 lines) +│ ├── alias MyVec = vec4 +│ ├── alias AnotherVec = vec2 +│ ├── struct VertexInput +│ ├── struct VertexOutput +│ ├── struct MyUniforms +│ ├── @group(0) @binding(0) var u_mvp: mat4x4 +│ ├── @group(0) @binding(1) var u_color: MyVec +│ ├── @group(1) @binding(0) var my_texture: texture_2d +│ ├── @group(1) @binding(1) var my_sampler: sampler +│ ├── @vertex +│ │ fn vs_main(in: VertexInput) -> VertexOutput +│ ├── @fragment +│ │ fn fs_main(in: VertexOutput) -> @location(0) vec4 +│ ├── @compute @workgroup_size(8, 8, 1) +│ │ fn cs_main(@builtin(global_invocation_id) global_id: vec3) +│ ├── fn helper_function(val: f32) -> f32 +│ ├── struct AnotherStruct +│ └── @compute +│ @workgroup_size(8, 8, 1) +│ fn multi_line_edge_case( +│ @builtin(global_invocation_id) +│ globalId : vec3, +│ @group(1) +│ @binding(0) +│ srcTexture : texture_2d, +│ @group(1) +│ @binding(1) +│ srcSampler : sampler, +│ @group(0) +│ @binding(0) +│ uniformsPtr : ptr, +│ storageBuffer : ptr, 64>, read_write>, +│ ) +└── 📄 test.metal (272 tokens, 34 lines) + ├── struct MyData + ├── kernel void myKernel(device MyData* data [[buffer(0)]], + │ uint id [[thread_position_in_grid]]) + ├── float myHelperFunction(float x, float y) + ├── vertex float4 vertexShader(const device packed_float3* vertex_array + │ [[buffer(0)]], + │ unsigned int vid [[vertex_id]]) + ├── fragment half4 fragmentShader(float4 P [[position]]) + └── float3 computeNormalMap(ColorInOut in, texture2d + normalMapTexture) diff --git a/tests/golden/legacy/trees/more_languages_group_lisp.txt b/tests/golden/legacy/trees/more_languages_group_lisp.txt new file mode 100644 index 0000000..3aed61a --- /dev/null +++ b/tests/golden/legacy/trees/more_languages_group_lisp.txt @@ -0,0 +1,22 @@ +📁 group_lisp (1 folder, 4 files) +├── 📄 clojure_test.clj (682 tokens, 85 lines) +│ ├── defprotocol P +│ ├── defrecord Person +│ ├── defn -main +│ ├── ns bion.likes_trees +│ ├── def repo-url +│ ├── defn config +│ ├── defmacro with-os +│ 
└── defrecord SetFullElement +├── 📄 LispTest.lisp (25 tokens, 6 lines) +│ ├── defstruct person +│ └── defun greet +├── 📄 racket_struct.rkt (14 tokens, 1 line) +│ └── struct point +└── 📄 test_scheme.scm (360 tokens, 44 lines) + ├── define topological-sort + ├── define table + ├── define queue + ├── define result + ├── define set-up + └── define traverse diff --git a/tests/golden/legacy/trees/more_languages_group_todo.txt b/tests/golden/legacy/trees/more_languages_group_todo.txt new file mode 100644 index 0000000..b2d6fe7 --- /dev/null +++ b/tests/golden/legacy/trees/more_languages_group_todo.txt @@ -0,0 +1,111 @@ +📁 group_todo (1 folder, 12 files) +├── 📄 AAPLShaders.metal (5,780 tokens, 566 lines) +│ ├── struct LightingParameters +│ ├── float Geometry(float Ndotv, float alphaG) +│ ├── float3 computeNormalMap(ColorInOut in, texture2d +│ │ normalMapTexture) +│ ├── float3 computeDiffuse(LightingParameters parameters) +│ ├── float Distribution(float NdotH, float roughness) +│ ├── float3 computeSpecular(LightingParameters parameters) +│ ├── float4 equirectangularSample(float3 direction, sampler s, +│ │ texture2d image) +│ ├── LightingParameters calculateParameters(ColorInOut in, +│ │ AAPLCameraData cameraData, +│ │ constant AAPLLightData& +│ │ lightData, +│ │ texture2d baseColorMap, +│ │ texture2d normalMap, +│ │ texture2d metallicMap, +│ │ texture2d roughnessMap, +│ │ texture2d +│ │ ambientOcclusionMap, +│ │ texture2d skydomeMap) +│ ├── struct SkyboxVertex +│ ├── struct SkyboxV2F +│ ├── vertex SkyboxV2F skyboxVertex(SkyboxVertex in [[stage_in]], +│ │ constant AAPLCameraData& cameraData +│ │ [[buffer(BufferIndexCameraData)]]) +│ ├── fragment float4 skyboxFragment(SkyboxV2F v [[stage_in]], +│ │ texture2d skytexture [[texture(0)]]) +│ ├── vertex ColorInOut vertexShader(Vertex in [[stage_in]], +│ │ constant AAPLInstanceTransform& +│ │ instanceTransform [[ buffer(BufferIndexInstanceTransforms) ]], +│ │ constant AAPLCameraData& cameraData [[ +│ │ buffer(BufferIndexCameraData) 
]]) +│ ├── float2 calculateScreenCoord( float3 ndcpos ) +│ ├── fragment float4 fragmentShader( +│ │ ColorInOut in +│ │ [[stage_in]], +│ │ constant AAPLCameraData& cameraData [[ +│ │ buffer(BufferIndexCameraData) ]], +│ │ constant AAPLLightData& lightData [[ +│ │ buffer(BufferIndexLightData) ]], +│ │ constant AAPLSubmeshKeypath&submeshKeypath [[ +│ │ buffer(BufferIndexSubmeshKeypath)]], +│ │ constant Scene* pScene [[ +│ │ buffer(SceneIndex)]], +│ │ texture2d skydomeMap [[ +│ │ texture(AAPLSkyDomeTexture) ]], +│ │ texture2d rtReflections [[ +│ │ texture(AAPLTextureIndexReflections), +│ │ function_constant(is_raytracing_enabled)]]) +│ ├── fragment float4 reflectionShader(ColorInOut in [[stage_in]], +│ │ texture2d rtReflections +│ │ [[texture(AAPLTextureIndexReflections)]]) +│ ├── struct ThinGBufferOut +│ ├── fragment ThinGBufferOut gBufferFragmentShader(ColorInOut in +│ │ [[stage_in]]) +│ ├── kernel void rtReflection( +│ │ texture2d< float, access::write > outImage +│ │ [[texture(OutImageIndex)]], +│ │ texture2d< float > positions +│ │ [[texture(ThinGBufferPositionIndex)]], +│ │ texture2d< float > directions +│ │ [[texture(ThinGBufferDirectionIndex)]], +│ │ texture2d< float > skydomeMap +│ │ [[texture(AAPLSkyDomeTexture)]], +│ │ constant AAPLInstanceTransform* instanceTransforms +│ │ [[buffer(BufferIndexInstanceTransforms)]], +│ │ constant AAPLCameraData& cameraData +│ │ [[buffer(BufferIndexCameraData)]], +│ │ constant AAPLLightData& lightData +│ │ [[buffer(BufferIndexLightData)]], +│ │ constant Scene* pScene +│ │ [[buffer(SceneIndex)]], +│ │ instance_acceleration_structure +│ │ accelerationStructure [[buffer(AccelerationStructureIndex)]], +│ │ uint2 tid [[thread_position_in_grid]]) +│ ├── else if ( intersection.type == raytracing::intersection_type::none ) +│ ├── struct VertexInOut +│ ├── vertex VertexInOut vertexPassthrough( uint vid [[vertex_id]] ) +│ ├── fragment float4 fragmentPassthrough( VertexInOut in [[stage_in]], +│ │ texture2d< float > tin ) +│ ├── fragment 
float4 fragmentBloomThreshold( VertexInOut in [[stage_in]], +│ │ texture2d< float > tin +│ │ [[texture(0)]], +│ │ constant float* threshold +│ │ [[buffer(0)]] ) +│ └── fragment float4 fragmentPostprocessMerge( VertexInOut in [[stage_in]], +│ constant float& exposure +│ [[buffer(0)]], +│ texture2d< float > texture0 +│ [[texture(0)]], +│ texture2d< float > texture1 +│ [[texture(1)]]) +├── 📄 crystal_test.cr (48 tokens, 15 lines) +├── 📄 dart_test.dart (108 tokens, 24 lines) +├── 📄 elixir_test.exs (39 tokens, 10 lines) +├── 📄 forward.frag (739 tokens, 87 lines) +├── 📄 forward.vert (359 tokens, 48 lines) +├── 📄 nodemon.json (118 tokens, 20 lines) +├── 📄 sas_test.sas (97 tokens, 22 lines) +├── 📄 test_setup_py.test (133 tokens, 24 lines) +├── 📄 testTypings.d.ts (158 tokens, 23 lines) +├── 📄 vba_test.bas (67 tokens, 16 lines) +└── 📄 wgsl_test.wgsl (94 tokens, 17 lines) + ├── @binding(0) @group(0) var frame : u32 + ├── @vertex + │ fn vtx_main(@builtin(vertex_index) vertex_index : u32) -> + │ @builtin(position) vec4f + └── @fragment + fn frag_main() -> @location(0) vec4f diff --git a/tests/golden/legacy/trees/multi_seed.txt b/tests/golden/legacy/trees/multi_seed.txt new file mode 100644 index 0000000..e1fb21c --- /dev/null +++ b/tests/golden/legacy/trees/multi_seed.txt @@ -0,0 +1,427 @@ +🌵 Root (2 folders, 17 files) +├── 📁 group1 (1 folder, 11 files) +│ ├── 📄 addamt.cobol (441 tokens, 40 lines) +│ │ ├── IDENTIFICATION DIVISION. +│ │ ├── PROGRAM-ID. +│ │ │ ADDAMT. +│ │ ├── DATA DIVISION. +│ │ ├── WORKING-STORAGE SECTION. +│ │ ├── 01 KEYED-INPUT. +│ │ ├── 05 CUST-NO-IN. +│ │ ├── 05 AMT1-IN. +│ │ ├── 05 AMT2-IN. +│ │ ├── 05 AMT3-IN. +│ │ ├── 01 DISPLAYED-OUTPUT. +│ │ ├── 05 CUST-NO-OUT. +│ │ ├── 05 TOTAL-OUT. +│ │ ├── 01 MORE-DATA. +│ │ ├── PROCEDURE DIVISION. +│ │ └── 100-MAIN. +│ ├── 📄 CUSTOMER-INVOICE.CBL (412 tokens, 60 lines) +│ │ ├── IDENTIFICATION DIVISION. +│ │ ├── PROGRAM-ID. CUSTOMER-INVOICE. +│ │ ├── AUTHOR. JANE DOE. +│ │ ├── DATE. 2023-12-30. +│ │ ├── DATE-COMPILED. 
06/30/10. +│ │ ├── DATE-WRITTEN. 12/34/56. +│ │ ├── ENVIRONMENT DIVISION. +│ │ ├── INPUT-OUTPUT SECTION. +│ │ ├── FILE-CONTROL. +│ │ ├── SELECT CUSTOMER-FILE. +│ │ ├── SELECT INVOICE-FILE. +│ │ ├── SELECT REPORT-FILE. +│ │ ├── DATA DIVISION. +│ │ ├── FILE SECTION. +│ │ ├── FD CUSTOMER-FILE. +│ │ ├── 01 CUSTOMER-RECORD. +│ │ ├── 05 CUSTOMER-ID. +│ │ ├── 05 CUSTOMER-NAME. +│ │ ├── 05 CUSTOMER-BALANCE. +│ │ ├── FD INVOICE-FILE. +│ │ ├── 01 INVOICE-RECORD. +│ │ ├── 05 INVOICE-ID. +│ │ ├── 05 CUSTOMER-ID. +│ │ ├── 05 INVOICE-AMOUNT. +│ │ ├── FD REPORT-FILE. +│ │ ├── 01 REPORT-RECORD. +│ │ ├── WORKING-STORAGE SECTION. +│ │ ├── 01 WS-CUSTOMER-FOUND. +│ │ ├── 01 WS-END-OF-FILE. +│ │ ├── 01 WS-TOTAL-BALANCE. +│ │ ├── PROCEDURE DIVISION. +│ │ ├── 0000-MAIN-ROUTINE. +│ │ ├── 1000-PROCESS-RECORDS. +│ │ ├── 1100-UPDATE-CUSTOMER-BALANCE. +│ │ └── END PROGRAM CUSTOMER-INVOICE. +│ ├── 📄 JavaTest.java (578 tokens, 86 lines) +│ │ ├── abstract class LivingBeing +│ │ ├── abstract void breathe() +│ │ ├── interface Communicator +│ │ ├── String communicate() +│ │ ├── @Log +│ │ ├── @Getter +│ │ ├── @Setter +│ │ ├── class Person extends LivingBeing implements Communicator +│ │ ├── Person(String name, int age) +│ │ ├── @Override +│ │ ├── void breathe() +│ │ ├── @Override +│ │ ├── public String communicate() +│ │ ├── void greet() +│ │ ├── String personalizedGreeting(String greeting, Optional +│ │ │ includeAge) +│ │ ├── @Singleton +│ │ ├── @RestController +│ │ ├── @SpringBootApplication +│ │ ├── public class Example +│ │ ├── @Inject +│ │ ├── public Example(Person person) +│ │ ├── @RequestMapping("/greet") +│ │ ├── String home(@RequestParam(value = "name", defaultValue = +│ │ │ "World") String name, +│ │ │ @RequestParam(value = "age", defaultValue = "30") +│ │ │ int age) +│ │ └── public static void main(String[] args) +│ ├── 📄 JuliaTest.jl (381 tokens, 63 lines) +│ │ ├── module JuliaTest_EdgeCase +│ │ ├── struct Location +│ │ │ name::String +│ │ │ lat::Float32 +│ │ │ lon::Float32 +│ │ │ end +│ 
│ ├── mutable struct mPerson +│ │ │ name::String +│ │ │ age::Int +│ │ │ end +│ │ ├── Base.@kwdef mutable struct Param +│ │ │ Δt::Float64 = 0.1 +│ │ │ n::Int64 +│ │ │ m::Int64 +│ │ │ end +│ │ ├── sic(x,y) +│ │ ├── welcome(l::Location) +│ │ ├── ∑(α, Ω) +│ │ ├── function noob() +│ │ │ end +│ │ ├── function ye_olde(hello::String, world::Location) +│ │ │ end +│ │ ├── function multiline_greet( +│ │ │ p::mPerson, +│ │ │ greeting::String +│ │ │ ) +│ │ │ end +│ │ ├── function julia_is_awesome(prob::DiffEqBase.AbstractDAEProblem{uType, +│ │ │ duType, tType, +│ │ │ isinplace}; +│ │ │ kwargs...) where {uType, duType, tType, isinplace} +│ │ │ end +│ │ └── end +│ ├── 📄 KotlinTest.kt (974 tokens, 171 lines) +│ │ ├── data class Person(val name: String) +│ │ ├── fun greet(person: Person) +│ │ ├── fun processItems(items: List, processor: (T) -> Unit) +│ │ ├── interface Source +│ │ ├── fun nextT(): T +│ │ ├── fun MutableList.swap(index1: Int, index2: Int) +│ │ ├── fun Any?.toString(): String +│ │ ├── tailrec fun findFixPoint(x: Double = 1.0): Double +│ │ ├── class GenericRepository +│ │ ├── fun getItem(id: Int): T? 
+│ │ ├── sealed interface Error +│ │ ├── sealed class IOError(): Error +│ │ ├── object Runner +│ │ ├── inline fun , T> run() : T +│ │ ├── infix fun Int.shl(x: Int): Int +│ │ ├── class MyStringCollection +│ │ ├── infix fun add(s: String) +│ │ ├── fun build() +│ │ ├── open class Base(p: Int) +│ │ ├── class Derived(p: Int) : Base(p) +│ │ ├── open class Shape +│ │ ├── open fun draw() +│ │ ├── fun fill() +│ │ ├── open fun edge(case: Int) +│ │ ├── interface Thingy +│ │ ├── fun edge() +│ │ ├── class Circle() : Shape(), Thingy +│ │ ├── override fun draw() +│ │ ├── final override fun edge(case: Int) +│ │ ├── interface Base +│ │ ├── fun print() +│ │ ├── class BaseImpl(val x: Int) : Base +│ │ ├── override fun print() +│ │ ├── internal class Derived(b: Base) : Base by b +│ │ ├── class Person constructor(firstName: String) +│ │ ├── class People( +│ │ │ firstNames: Array, +│ │ │ ages: Array(42), +│ │ │ ) +│ │ ├── fun edgeCases(): Boolean +│ │ ├── class Alien public @Inject constructor( +│ │ │ val firstName: String, +│ │ │ val lastName: String, +│ │ │ var age: Int, +│ │ │ val pets: MutableList = mutableListOf(), +│ │ │ ) +│ │ ├── fun objectOriented(): String +│ │ ├── enum class IntArithmetics : BinaryOperator, IntBinaryOperator +│ │ ├── PLUS { +│ │ │ override fun apply(t: Int, u: Int): Int +│ │ ├── TIMES { +│ │ │ override fun apply(t: Int, u: Int): Int +│ │ ├── override fun applyAsInt(t: Int, u: Int) +│ │ ├── fun reformat( +│ │ │ str: String, +│ │ │ normalizeCase: Boolean = true, +│ │ │ upperCaseFirstLetter: Boolean = true, +│ │ │ divideByCamelHumps: Boolean = false, +│ │ │ wordSeparator: Char = ' ', +│ │ │ ) +│ │ ├── operator fun Point.unaryMinus() +│ │ ├── abstract class Polygon +│ │ └── abstract fun draw() +│ ├── 📄 lesson.cbl (635 tokens, 78 lines) +│ │ ├── IDENTIFICATION DIVISION. +│ │ ├── PROGRAM-ID. CBL0002. +│ │ ├── AUTHOR. Otto B. Fun. +│ │ ├── ENVIRONMENT DIVISION. +│ │ ├── INPUT-OUTPUT SECTION. +│ │ ├── FILE-CONTROL. +│ │ ├── SELECT PRINT-LINE. +│ │ ├── SELECT ACCT-REC. 
+│ │ ├── DATA DIVISION. +│ │ ├── FILE SECTION. +│ │ ├── FD PRINT-LINE. +│ │ ├── 01 PRINT-REC. +│ │ ├── 05 ACCT-NO-O. +│ │ ├── 05 ACCT-LIMIT-O. +│ │ ├── 05 ACCT-BALANCE-O. +│ │ ├── 05 LAST-NAME-O. +│ │ ├── 05 FIRST-NAME-O. +│ │ ├── 05 COMMENTS-O. +│ │ ├── FD ACCT-REC. +│ │ ├── 01 ACCT-FIELDS. +│ │ ├── 05 ACCT-NO. +│ │ ├── 05 ACCT-LIMIT. +│ │ ├── 05 ACCT-BALANCE. +│ │ ├── 05 LAST-NAME. +│ │ ├── 05 FIRST-NAME. +│ │ ├── 05 CLIENT-ADDR. +│ │ ├── 10 STREET-ADDR. +│ │ ├── 10 CITY-COUNTY. +│ │ ├── 10 USA-STATE. +│ │ ├── 05 RESERVED. +│ │ ├── 05 COMMENTS. +│ │ ├── WORKING-STORAGE SECTION. +│ │ ├── 01 FLAGS. +│ │ ├── 05 LASTREC. +│ │ ├── PROCEDURE DIVISION. +│ │ ├── OPEN-FILES. +│ │ ├── READ-NEXT-RECORD. +│ │ ├── CLOSE-STOP. +│ │ ├── READ-RECORD. +│ │ └── WRITE-RECORD. +│ ├── 📄 LuaTest.lua (83 tokens, 16 lines) +│ │ ├── function HelloWorld.new +│ │ ├── function HelloWorld.greet +│ │ └── function say_hello +│ ├── 📄 ObjectiveCTest.m (62 tokens, 16 lines) +│ │ ├── @interface HelloWorld +│ │ ├── @interface HelloWorld -> (void) sayHello +│ │ ├── @implementation HelloWorld +│ │ ├── @implementation HelloWorld -> (void) sayHello +│ │ └── void sayHelloWorld() +│ ├── 📄 OcamlTest.ml (49 tokens, 12 lines) +│ │ ├── type color +│ │ ├── class hello +│ │ ├── class hello -> method say_hello +│ │ └── let main () +│ ├── 📄 test.js (757 tokens, 154 lines) +│ │ ├── class MyClass +│ │ ├── myMethod() +│ │ ├── async asyncMethod(a, b) +│ │ ├── methodWithDefaultParameters(a = 5, b = 10) +│ │ ├── multilineMethod( +│ │ │ c, +│ │ │ d +│ │ │ ) +│ │ ├── multilineMethodWithDefaults( +│ │ │ t = "tree", +│ │ │ p = "plus" +│ │ │ ) +│ │ ├── function myFunction(param1, param2) +│ │ ├── function multilineFunction( +│ │ │ param1, +│ │ │ param2 +│ │ │ ) +│ │ ├── const arrowFunction = () => +│ │ ├── const parametricArrow = (a, b) => +│ │ ├── function () +│ │ ├── function outerFunction(outerParam) +│ │ ├── function innerFunction(innerParam) +│ │ ├── innerFunction("inner") +│ │ ├── const myObject = { +│ │ ├── 
myMethod: function (stuff) +│ │ ├── let myArrowObject = { +│ │ ├── myArrow: ({ +│ │ │ a, +│ │ │ b, +│ │ │ c, +│ │ │ }) => +│ │ ├── const myAsyncArrowFunction = async () => +│ │ ├── function functionWithRestParameters(...args) +│ │ ├── const namedFunctionExpression = function myNamedFunction() +│ │ ├── const multilineArrowFunction = ( +│ │ │ a, +│ │ │ b +│ │ │ ) => +│ │ ├── function functionReturningFunction() +│ │ ├── return function () +│ │ ├── function destructuringOnMultipleLines({ +│ │ │ a, +│ │ │ b, +│ │ │ }) +│ │ ├── const arrowFunctionWithDestructuring = ({ a, b }) => +│ │ ├── const multilineDestructuringArrow = ({ +│ │ │ a, +│ │ │ b, +│ │ │ }) => +│ │ ├── async function asyncFunctionWithErrorHandling() +│ │ ├── class Car +│ │ ├── constructor(brand) +│ │ ├── present() +│ │ ├── class Model extends Car +│ │ ├── constructor(brand, mod) +│ │ ├── super(brand) +│ │ └── show() +│ └── 📄 test.ts (832 tokens, 165 lines) +│ ├── type MyType +│ ├── interface MyInterface +│ ├── class TsClass +│ ├── myMethod() +│ ├── myMethodWithArgs(param1: string, param2: number): void +│ ├── static myStaticMethod(param: T): T +│ ├── multilineMethod( +│ │ c: number, +│ │ d: number +│ │ ): number +│ ├── multilineMethodWithDefaults( +│ │ t: string = "tree", +│ │ p: string = "plus" +│ │ ): string +│ ├── export class AdvancedComponent implements MyInterface +│ ├── async myAsyncMethod( +│ │ a: string, +│ │ b: number, +│ │ c: string +│ │ ): Promise +│ ├── genericMethod( +│ │ arg1: T, +│ │ arg2: U +│ │ ): [T, U] +│ ├── export class TicketsComponent implements MyInterface +│ ├── async myAsyncMethod({ a, b, c }: { a: String; b: Number; c: String +│ │ }) +│ ├── function tsFunction() +│ ├── function tsFunctionSigned( +│ │ param1: number, +│ │ param2: number +│ │ ): void +│ ├── export default async function tsFunctionComplicated({ +│ │ a = 1 | 2, +│ │ b = "bob", +│ │ c = async () => "charlie", +│ │ }: { +│ │ a: number; +│ │ b: string; +│ │ c: () => Promise; +│ │ }): Promise +│ ├── return("Standalone 
function with parameters") +│ ├── const tsArrowFunctionSigned = ({ +│ │ a, +│ │ b, +│ │ }: { +│ │ a: number; +│ │ b: string; +│ │ }) => +│ ├── export const tsComplicatedArrow = async ({ +│ │ a = 1 | 2, +│ │ b = "bob", +│ │ c = async () => "charlie", +│ │ }: { +│ │ a: number; +│ │ b: string; +│ │ c: () => Promise; +│ │ }): Promise => +│ ├── const arrowFunction = () => +│ ├── const arrow = (a: String, b: Number) => +│ ├── const asyncArrowFunction = async () => +│ ├── const asyncArrow = async (a: String, b: Number) => +│ ├── let weirdArrow = () => +│ ├── const asyncPromiseArrow = async (): Promise => +│ ├── let myWeirdArrowSigned = (x: number): number => +│ ├── class Person +│ ├── constructor(private firstName: string, private lastName: string) +│ ├── getFullName(): string +│ ├── describe(): string +│ ├── class Employee extends Person +│ ├── constructor( +│ │ firstName: string, +│ │ lastName: string, +│ │ private jobTitle: string +│ │ ) +│ ├── super(firstName, lastName) +│ ├── describe(): string +│ ├── interface Shape +│ └── interface Square extends Shape +└── 📁 path_to_test (1 folder, 6 files) + ├── 📄 class_method_type.py (525 tokens, 101 lines) + │ ├── T = TypeVar("T") + │ ├── def parse_py(contents: str) -> List[str] + │ ├── class MyClass + │ ├── @staticmethod + │ │ def physical_element_aval(dtype) -> core.ShapedArray + │ ├── def my_method(self) + │ ├── @staticmethod + │ │ def my_typed_method(obj: dict) -> int + │ ├── def my_multiline_signature_method( + │ │ self, + │ │ alice: str = None, + │ │ bob: int = None, + │ │ ) -> tuple + │ ├── @lru_cache(maxsize=None) + │ │ def my_multiline_signature_function( + │ │ tree: tuple = (), + │ │ plus: str = "+", + │ │ ) -> tuple + │ ├── class LogLevelEnum(str, Enum) + │ ├── CRITICAL = "CRITICAL" + │ ├── GREETING = "GREETING" + │ ├── WARNING = "WARNING" + │ ├── ERROR = "ERROR" + │ ├── DEBUG = "DEBUG" + │ ├── INFO = "INFO" + │ ├── OFF = "OFF" + │ ├── class Thingy(BaseModel) + │ ├── metric: float + │ ├── @dataclass + │ │ class 
TestDataclass + │ ├── tree: str + │ ├── A = TypeVar("A", str, bytes) + │ ├── def omega_yikes(file: str, expected: List[str]) -> bool + │ ├── def ice[T](args: Iterable[T] = ()) + │ ├── class list[T] + │ ├── def __getitem__(self, index: int, /) -> T + │ ├── @classmethod + │ │ def from_code(cls, toolbox, code: bytes, score=None) -> "Thingy" + │ ├── @classmethod + │ │ def from_str(cls, toolbox, string: str, score=None) -> "Thingy" + │ └── class Router(hk.Module) + ├── 📄 empty.py (0 tokens, 0 lines) + ├── 📄 file.md (11 tokens, 2 lines) + │ └── # Hello, world! + ├── 📄 file.py (18 tokens, 3 lines) + │ └── def hello_world() + ├── 📄 file.txt (10 tokens, 2 lines) + └── 📄 version.py (13 tokens, 2 lines) + └── __version__ = "1.2.3" diff --git a/tests/golden/legacy/trees/path_to_test.txt b/tests/golden/legacy/trees/path_to_test.txt new file mode 100644 index 0000000..c1f5bc7 --- /dev/null +++ b/tests/golden/legacy/trees/path_to_test.txt @@ -0,0 +1,51 @@ +📁 path_to_test (1 folder, 6 files) +├── 📄 class_method_type.py (525 tokens, 101 lines) +│ ├── T = TypeVar("T") +│ ├── def parse_py(contents: str) -> List[str] +│ ├── class MyClass +│ ├── @staticmethod +│ │ def physical_element_aval(dtype) -> core.ShapedArray +│ ├── def my_method(self) +│ ├── @staticmethod +│ │ def my_typed_method(obj: dict) -> int +│ ├── def my_multiline_signature_method( +│ │ self, +│ │ alice: str = None, +│ │ bob: int = None, +│ │ ) -> tuple +│ ├── @lru_cache(maxsize=None) +│ │ def my_multiline_signature_function( +│ │ tree: tuple = (), +│ │ plus: str = "+", +│ │ ) -> tuple +│ ├── class LogLevelEnum(str, Enum) +│ ├── CRITICAL = "CRITICAL" +│ ├── GREETING = "GREETING" +│ ├── WARNING = "WARNING" +│ ├── ERROR = "ERROR" +│ ├── DEBUG = "DEBUG" +│ ├── INFO = "INFO" +│ ├── OFF = "OFF" +│ ├── class Thingy(BaseModel) +│ ├── metric: float +│ ├── @dataclass +│ │ class TestDataclass +│ ├── tree: str +│ ├── A = TypeVar("A", str, bytes) +│ ├── def omega_yikes(file: str, expected: List[str]) -> bool +│ ├── def ice[T](args: 
Iterable[T] = ()) +│ ├── class list[T] +│ ├── def __getitem__(self, index: int, /) -> T +│ ├── @classmethod +│ │ def from_code(cls, toolbox, code: bytes, score=None) -> "Thingy" +│ ├── @classmethod +│ │ def from_str(cls, toolbox, string: str, score=None) -> "Thingy" +│ └── class Router(hk.Module) +├── 📄 empty.py (0 tokens, 0 lines) +├── 📄 file.md (11 tokens, 2 lines) +│ └── # Hello, world! +├── 📄 file.py (18 tokens, 3 lines) +│ └── def hello_world() +├── 📄 file.txt (10 tokens, 2 lines) +└── 📄 version.py (13 tokens, 2 lines) + └── __version__ = "1.2.3" diff --git a/tests/golden/legacy/trees/repo_concise.txt b/tests/golden/legacy/trees/repo_concise.txt new file mode 100644 index 0000000..17182cb --- /dev/null +++ b/tests/golden/legacy/trees/repo_concise.txt @@ -0,0 +1,699 @@ +📁 tree_plus (44 folders, 439 files) +├── 📄 .env.test (4 tokens, 0 lines) +├── 📁 .github (2 folders, 3 files) +│ ├── 📄 dependabot.yml (128 tokens, 11 lines) +│ └── 📁 workflows (1 folder, 2 files) +│ ├── 📄 microsoft.yml (284 tokens, 40 lines) +│ └── 📄 unix.yml (713 tokens, 92 lines) +├── 📄 .gitignore (219 tokens, 57 lines) +├── 📄 .mcp_server.pid (2 tokens, 1 line) +├── 📄 Cargo.toml (206 tokens, 29 lines) +├── 📄 claude-fable-5-rust-rewrite-goal.md (3,394 tokens, 434 lines) +├── 📁 coverage (1 folder, 1 file) +│ └── 📄 lcov.info (17,359 tokens, 2,180 lines) +├── 📁 crates (11 folders, 27 files) +│ ├── 📁 tree_plus_cli (3 folders, 3 files) +│ │ ├── 📄 Cargo.toml (86 tokens, 15 lines) +│ │ ├── 📁 src (1 folder, 1 file) +│ │ │ └── 📄 main.rs (1,332 tokens, 174 lines) +│ │ └── 📁 tests (1 folder, 1 file) +│ │ └── 📄 cli.rs (701 tokens, 92 lines) +│ └── 📁 tree_plus_core (7 folders, 24 files) +│ ├── 📁 benches (1 folder, 1 file) +│ │ └── 📄 tree_plus_bench.rs (608 tokens, 78 lines) +│ ├── 📄 Cargo.toml (228 tokens, 36 lines) +│ ├── 📁 examples (1 folder, 2 files) +│ │ ├── 📄 dump_ast.rs (516 tokens, 55 lines) +│ │ └── 📄 extract.rs (129 tokens, 16 lines) +│ ├── 📁 src (3 folders, 18 files) +│ │ ├── 📄 config.rs (304 
tokens, 39 lines) +│ │ ├── 📄 count.rs (1,346 tokens, 203 lines) +│ │ ├── 📁 extract (2 folders, 10 files) +│ │ │ ├── 📄 data.rs (5,115 tokens, 582 lines) +│ │ │ ├── 📄 markdown.rs (1,531 tokens, 180 lines) +│ │ │ ├── 📄 markers.rs (438 tokens, 60 lines) +│ │ │ ├── 📄 mod.rs (2,520 tokens, 277 lines) +│ │ │ ├── 📄 simple.rs (1,629 tokens, 216 lines) +│ │ │ └── 📁 treesitter (1 folder, 5 files) +│ │ │ ├── 📄 c_cpp.rs (5,979 tokens, 591 lines) +│ │ │ ├── 📄 mod.rs (488 tokens, 66 lines) +│ │ │ ├── 📄 python.rs (3,487 tokens, 346 lines) +│ │ │ ├── 📄 rust.rs (2,785 tokens, 312 lines) +│ │ │ └── 📄 typescript.rs (3,897 tokens, 420 lines) +│ │ ├── 📄 ignore.rs (2,144 tokens, 307 lines) +│ │ ├── 📄 lib.rs (222 tokens, 29 lines) +│ │ ├── 📄 model.rs (928 tokens, 125 lines) +│ │ ├── 📄 render.rs (2,741 tokens, 347 lines) +│ │ ├── 📄 sort.rs (1,693 tokens, 214 lines) +│ │ └── 📄 walk.rs (2,245 tokens, 260 lines) +│ └── 📁 tests (1 folder, 2 files) +│ ├── 📄 golden_parity.rs (1,809 tokens, 243 lines) +│ └── 📄 robustness.rs (743 tokens, 86 lines) +├── 📁 docs (1 folder, 4 files) +│ ├── 📄 architecture.md (1,392 tokens, 113 lines) +│ ├── 📄 language-roadmap.md (1,262 tokens, 64 lines) +│ ├── 📄 performance.md (690 tokens, 64 lines) +│ └── 📄 rust-port-differences.md (922 tokens, 67 lines) +├── 📄 LICENSE (2,744 tokens, 81 lines) +├── 📄 Makefile (801 tokens, 121 lines) +├── 📄 nodemon.json (112 tokens, 24 lines) +├── 📄 pyproject.toml (366 tokens, 51 lines) +├── 📄 pytest.ini (20 tokens, 4 lines) +├── 📄 README.md (99,851 tokens, 3,708 lines) +├── 📁 tests (25 folders, 378 files) +│ ├── 📄 .env.test (4 tokens, 0 lines) +│ ├── 📄 build_absurdly_huge_jsonl.py (506 tokens, 65 lines) +│ ├── 📁 dot_dot (2 folders, 4 files) +│ │ ├── 📄 my_test_file.py (7 tokens, 2 lines) +│ │ └── 📁 nested_dir (1 folder, 3 files) +│ │ ├── 📄 .env.test (4 tokens, 0 lines) +│ │ ├── 📄 pytest.ini (20 tokens, 4 lines) +│ │ └── 📄 test_tp_dotdot.py (362 tokens, 52 lines) +│ ├── 📁 empty_folder (2 folders, 0 files) +│ │ └── 📁 is_empty (1 folder, 
0 files) +│ ├── 📁 folder_with_evil_logging (1 folder, 1 file) +│ │ └── 📄 logging.py (11 tokens, 0 lines) +│ ├── 📁 golden (6 folders, 246 files) +│ │ ├── 📄 diff_components.py (417 tokens, 59 lines) +│ │ ├── 📄 generate_legacy_goldens.py (1,499 tokens, 156 lines) +│ │ └── 📁 legacy (5 folders, 244 files) +│ │ ├── 📁 components (1 folder, 109 files) +│ │ │ ├── 📄 tests__dot_dot__my_test_file.py.json (21 tokens, 5 lines) +│ │ │ ├── 📄 tests__dot_dot__nested_dir__.env.test.json (22 tokens, 5 +│ │ │ │ lines) +│ │ │ ├── 📄 tests__dot_dot__nested_dir__pytest.ini.json (17 tokens, 3 +│ │ │ │ lines) +│ │ │ ├── 📄 tests__dot_dot__nested_dir__test_tp_dotdot.py.json (40 +│ │ │ │ tokens, 6 lines) +│ │ │ ├── 📄 tests__more_languages__group1__addamt.cobol.json (110 +│ │ │ │ tokens, 19 lines) +│ │ │ ├── 📄 tests__more_languages__group1__CUSTOMER-INVOICE.CBL.json +│ │ │ │ (251 tokens, 39 lines) +│ │ │ ├── 📄 tests__more_languages__group1__JavaTest.java.json (236 +│ │ │ │ tokens, 28 lines) +│ │ │ ├── 📄 tests__more_languages__group1__JuliaTest.jl.json (189 +│ │ │ │ tokens, 16 lines) +│ │ │ ├── 📄 tests__more_languages__group1__KotlinTest.kt.json (533 +│ │ │ │ tokens, 51 lines) +│ │ │ ├── 📄 tests__more_languages__group1__lesson.cbl.json (259 +│ │ │ │ tokens, 44 lines) +│ │ │ ├── 📄 tests__more_languages__group1__LuaTest.lua.json (39 +│ │ │ │ tokens, 7 lines) +│ │ │ ├── 📄 tests__more_languages__group1__ObjectiveCTest.m.json (65 +│ │ │ │ tokens, 9 lines) +│ │ │ ├── 📄 tests__more_languages__group1__OcamlTest.ml.json (40 +│ │ │ │ tokens, 8 lines) +│ │ │ ├── 📄 tests__more_languages__group1__test.js.json (355 tokens, +│ │ │ │ 39 lines) +│ │ │ ├── 📄 tests__more_languages__group1__test.ts.json (541 tokens, +│ │ │ │ 40 lines) +│ │ │ ├── 📄 tests__more_languages__group2__apl_test.apl.json (49 +│ │ │ │ tokens, 7 lines) +│ │ │ ├── 📄 tests__more_languages__group2__c_test.c.json (290 tokens, +│ │ │ │ 38 lines) +│ │ │ ├── 📄 tests__more_languages__group2__go_test.go.json (111 +│ │ │ │ tokens, 11 lines) +│ │ │ ├── 📄 
tests__more_languages__group2__PerlTest.pl.json (50 +│ │ │ │ tokens, 8 lines) +│ │ │ ├── 📄 tests__more_languages__group2__PhpTest.php.json (54 +│ │ │ │ tokens, 9 lines) +│ │ │ ├── 📄 tests__more_languages__group2__PowershellTest.ps1.json +│ │ │ │ (315 tokens, 26 lines) +│ │ │ ├── 📄 tests__more_languages__group2__ScalaTest.scala.json (134 +│ │ │ │ tokens, 15 lines) +│ │ │ ├── 📄 tests__more_languages__group2__test.csv.json (31 tokens, +│ │ │ │ 9 lines) +│ │ │ ├── 📄 tests__more_languages__group3__bash_test.sh.json (55 +│ │ │ │ tokens, 10 lines) +│ │ │ ├── 📄 tests__more_languages__group3__cpp_test.cpp.json (591 +│ │ │ │ tokens, 73 lines) +│ │ │ ├── 📄 tests__more_languages__group3__csharp_test.cs.json (592 +│ │ │ │ tokens, 47 lines) +│ │ │ ├── 📄 tests__more_languages__group3__hallucination.tex.json +│ │ │ │ (299 tokens, 29 lines) +│ │ │ ├── 📄 tests__more_languages__group3__ruby_test.rb.json (96 +│ │ │ │ tokens, 13 lines) +│ │ │ ├── 📄 tests__more_languages__group3__swift_test.swift.json (200 +│ │ │ │ tokens, 22 lines) +│ │ │ ├── 📄 tests__more_languages__group3__test.lean.json (151 +│ │ │ │ tokens, 14 lines) +│ │ │ ├── 📄 tests__more_languages__group3__test.capnp.json (124 +│ │ │ │ tokens, 22 lines) +│ │ │ ├── 📄 tests__more_languages__group3__test.graphql.json (102 +│ │ │ │ tokens, 18 lines) +│ │ │ ├── 📄 tests__more_languages__group3__test.proto.json (159 +│ │ │ │ tokens, 22 lines) +│ │ │ ├── 📄 tests__more_languages__group3__test.sqlite.json (73 +│ │ │ │ tokens, 12 lines) +│ │ │ ├── 📄 tests__more_languages__group3__test_Cargo.toml.json (69 +│ │ │ │ tokens, 11 lines) +│ │ │ ├── 📄 +│ │ │ │ tests__more_languages__group3__test_json_rpc_2_0.json.json +│ │ │ │ (48 tokens, 10 lines) +│ │ │ ├── 📄 tests__more_languages__group3__test_openapi.yaml.json +│ │ │ │ (120 tokens, 15 lines) +│ │ │ ├── 📄 tests__more_languages__group3__test_openrpc.json.json (78 +│ │ │ │ tokens, 13 lines) +│ │ │ ├── 📄 tests__more_languages__group3__test_pyproject.toml.json +│ │ │ │ (99 tokens, 15 lines) +│ │ │ 
├── 📄 tests__more_languages__group4__erl_test.erl.json (212 +│ │ │ │ tokens, 19 lines) +│ │ │ ├── 📄 tests__more_languages__group4__haskell_test.hs.json (121 +│ │ │ │ tokens, 7 lines) +│ │ │ ├── 📄 tests__more_languages__group4__mathematica_test.nb.json +│ │ │ │ (35 tokens, 7 lines) +│ │ │ ├── 📄 tests__more_languages__group4__matlab_test.m.json (35 +│ │ │ │ tokens, 6 lines) +│ │ │ ├── 📄 tests__more_languages__group4__RTest.R.json (50 tokens, 8 +│ │ │ │ lines) +│ │ │ ├── 📄 tests__more_languages__group4__rust_test.rs.json (775 +│ │ │ │ tokens, 49 lines) +│ │ │ ├── 📄 tests__more_languages__group4__test.zig.json (139 tokens, +│ │ │ │ 10 lines) +│ │ │ ├── 📄 tests__more_languages__group4__test_fsharp.fs.json (80 +│ │ │ │ tokens, 10 lines) +│ │ │ ├── 📄 tests__more_languages__group4__test_tcl_tk.tcl.json (41 +│ │ │ │ tokens, 7 lines) +│ │ │ ├── 📄 tests__more_languages__group4__tf_test.tf.json (67 +│ │ │ │ tokens, 11 lines) +│ │ │ ├── 📄 tests__more_languages__group5__ansible_test.yml.json (34 +│ │ │ │ tokens, 7 lines) +│ │ │ ├── 📄 tests__more_languages__group5__app-routing.module.ts.json +│ │ │ │ (157 tokens, 6 lines) +│ │ │ ├── 📄 tests__more_languages__group5__app.component.ts.json (104 +│ │ │ │ tokens, 10 lines) +│ │ │ ├── 📄 tests__more_languages__group5__app.component.spec.ts.json +│ │ │ │ (67 tokens, 9 lines) +│ │ │ ├── 📄 tests__more_languages__group5__app.module.ts.json (87 +│ │ │ │ tokens, 6 lines) +│ │ │ ├── 📄 tests__more_languages__group5__checkbox_test.md.json (146 +│ │ │ │ tokens, 22 lines) +│ │ │ ├── 📄 tests__more_languages__group5__checkbox_test.txt.json +│ │ │ │ (104 tokens, 12 lines) +│ │ │ ├── 📄 tests__more_languages__group5__environment.test.ts.json +│ │ │ │ (46 tokens, 9 lines) +│ │ │ ├── 📄 tests__more_languages__group5__hello_world.pyi.json (44 +│ │ │ │ tokens, 6 lines) +│ │ │ ├── 📄 tests__more_languages__group5__k8s_test.yaml.json (42 +│ │ │ │ tokens, 7 lines) +│ │ │ ├── 📄 tests__more_languages__group5__Makefile.json (62 tokens, +│ │ │ │ 13 lines) +│ │ │ 
├── 📄 tests__more_languages__group5__requirements_test.txt.json +│ │ │ │ (59 tokens, 13 lines) +│ │ │ ├── 📄 tests__more_languages__group5__rust_todo_test.rs.json (75 +│ │ │ │ tokens, 12 lines) +│ │ │ ├── 📄 tests__more_languages__group5__sql_test.sql.json (190 +│ │ │ │ tokens, 26 lines) +│ │ │ ├── 📄 +│ │ │ │ tests__more_languages__group5__standard-app-routing.module.t +│ │ │ │ s.json (106 tokens, 5 lines) +│ │ │ ├── 📄 tests__more_languages__group5__test.env.json (98 tokens, +│ │ │ │ 24 lines) +│ │ │ ├── 📄 tests__more_languages__group5__testJsonSchema.json.json +│ │ │ │ (59 tokens, 8 lines) +│ │ │ ├── 📄 tests__more_languages__group5__testPackage.json.json (74 +│ │ │ │ tokens, 12 lines) +│ │ │ ├── 📄 tests__more_languages__group5__tickets.component.ts.json +│ │ │ │ (647 tokens, 53 lines) +│ │ │ ├── 📄 tests__more_languages__group6__catastrophic.c.json (1,677 +│ │ │ │ tokens, 195 lines) +│ │ │ ├── 📄 tests__more_languages__group6__cpp_examples_impl.cc.json +│ │ │ │ (50 tokens, 6 lines) +│ │ │ ├── 📄 tests__more_languages__group6__cpp_examples_impl.cu.json +│ │ │ │ (43 tokens, 6 lines) +│ │ │ ├── 📄 tests__more_languages__group6__cpp_examples_impl.h.json +│ │ │ │ (41 tokens, 6 lines) +│ │ │ ├── 📄 tests__more_languages__group6__edge_case.hpp.json (18 +│ │ │ │ tokens, 3 lines) +│ │ │ ├── 📄 tests__more_languages__group6__fractal.thy.json (816 +│ │ │ │ tokens, 29 lines) +│ │ │ ├── 📄 +│ │ │ │ tests__more_languages__group6__Microsoft.PowerShell_profile. 
+│ │ │ │ ps1.json (354 tokens, 34 lines) +│ │ │ ├── 📄 +│ │ │ │ tests__more_languages__group6__python_complex_class.py.json +│ │ │ │ (30 tokens, 5 lines) +│ │ │ ├── 📄 tests__more_languages__group6__ramda__cloneRegExp.js.json +│ │ │ │ (33 tokens, 5 lines) +│ │ │ ├── 📄 tests__more_languages__group6__ramda_prop.js.json (317 +│ │ │ │ tokens, 8 lines) +│ │ │ ├── 📄 tests__more_languages__group6__tensorflow_flags.h.json +│ │ │ │ (2,584 tokens, 101 lines) +│ │ │ ├── 📄 tests__more_languages__group6__test.f.json (112 tokens, +│ │ │ │ 10 lines) +│ │ │ ├── 📄 tests__more_languages__group6__torch.rst.json (34 tokens, +│ │ │ │ 6 lines) +│ │ │ ├── 📄 tests__more_languages__group6__yc.html.json (17 tokens, 3 +│ │ │ │ lines) +│ │ │ ├── 📄 tests__more_languages__group7__absurdly_huge.jsonl.json +│ │ │ │ (68 tokens, 13 lines) +│ │ │ ├── 📄 tests__more_languages__group7__angular_crud.ts.json (217 +│ │ │ │ tokens, 17 lines) +│ │ │ ├── 📄 tests__more_languages__group7__structure.py.json (338 +│ │ │ │ tokens, 34 lines) +│ │ │ ├── 📄 tests__more_languages__group7__test.wgsl.json (288 +│ │ │ │ tokens, 19 lines) +│ │ │ ├── 📄 tests__more_languages__group7__test.metal.json (135 +│ │ │ │ tokens, 10 lines) +│ │ │ ├── 📄 tests__more_languages__group_lisp__clojure_test.clj.json +│ │ │ │ (63 tokens, 12 lines) +│ │ │ ├── 📄 tests__more_languages__group_lisp__LispTest.lisp.json (29 +│ │ │ │ tokens, 6 lines) +│ │ │ ├── 📄 tests__more_languages__group_lisp__racket_struct.rkt.json +│ │ │ │ (25 tokens, 5 lines) +│ │ │ ├── 📄 tests__more_languages__group_lisp__test_scheme.scm.json +│ │ │ │ (54 tokens, 10 lines) +│ │ │ ├── 📄 tests__more_languages__group_todo__AAPLShaders.metal.json +│ │ │ │ (1,202 tokens, 29 lines) +│ │ │ ├── 📄 tests__more_languages__group_todo__crystal_test.cr.json +│ │ │ │ (20 tokens, 3 lines) +│ │ │ ├── 📄 tests__more_languages__group_todo__dart_test.dart.json +│ │ │ │ (20 tokens, 3 lines) +│ │ │ ├── 📄 tests__more_languages__group_todo__elixir_test.exs.json +│ │ │ │ (20 tokens, 3 lines) +│ │ │ ├── 
📄 tests__more_languages__group_todo__forward.frag.json (19 +│ │ │ │ tokens, 3 lines) +│ │ │ ├── 📄 tests__more_languages__group_todo__forward.vert.json (19 +│ │ │ │ tokens, 3 lines) +│ │ │ ├── 📄 tests__more_languages__group_todo__nodemon.json.json (19 +│ │ │ │ tokens, 3 lines) +│ │ │ ├── 📄 tests__more_languages__group_todo__sas_test.sas.json (19 +│ │ │ │ tokens, 3 lines) +│ │ │ ├── 📄 +│ │ │ │ tests__more_languages__group_todo__test_setup_py.test.json +│ │ │ │ (21 tokens, 3 lines) +│ │ │ ├── 📄 tests__more_languages__group_todo__testTypings.d.ts.json +│ │ │ │ (20 tokens, 3 lines) +│ │ │ ├── 📄 tests__more_languages__group_todo__vba_test.bas.json (19 +│ │ │ │ tokens, 3 lines) +│ │ │ ├── 📄 tests__more_languages__group_todo__wgsl_test.wgsl.json +│ │ │ │ (70 tokens, 7 lines) +│ │ │ ├── 📄 tests__path_to_test__class_method_type.py.json (340 +│ │ │ │ tokens, 32 lines) +│ │ │ ├── 📄 tests__path_to_test__empty.py.json (15 tokens, 3 lines) +│ │ │ ├── 📄 tests__path_to_test__file.md.json (20 tokens, 5 lines) +│ │ │ ├── 📄 tests__path_to_test__file.py.json (21 tokens, 5 lines) +│ │ │ ├── 📄 tests__path_to_test__file.txt.json (15 tokens, 3 lines) +│ │ │ └── 📄 tests__path_to_test__version.py.json (23 tokens, 5 lines) +│ │ ├── 📁 counts (1 folder, 109 files) +│ │ │ ├── 📄 tests__dot_dot__my_test_file.py.json (20 tokens, 0 lines) +│ │ │ ├── 📄 tests__dot_dot__nested_dir__.env.test.json (21 tokens, 0 +│ │ │ │ lines) +│ │ │ ├── 📄 tests__dot_dot__nested_dir__pytest.ini.json (22 tokens, 0 +│ │ │ │ lines) +│ │ │ ├── 📄 tests__dot_dot__nested_dir__test_tp_dotdot.py.json (24 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group1__addamt.cobol.json (23 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group1__CUSTOMER-INVOICE.CBL.json +│ │ │ │ (25 tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group1__JavaTest.java.json (24 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group1__JuliaTest.jl.json (23 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 
tests__more_languages__group1__KotlinTest.kt.json (24 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group1__lesson.cbl.json (23 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group1__LuaTest.lua.json (23 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group1__ObjectiveCTest.m.json (24 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group1__OcamlTest.ml.json (23 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group1__test.js.json (22 tokens, 0 +│ │ │ │ lines) +│ │ │ ├── 📄 tests__more_languages__group1__test.ts.json (22 tokens, 0 +│ │ │ │ lines) +│ │ │ ├── 📄 tests__more_languages__group2__apl_test.apl.json (23 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group2__c_test.c.json (23 tokens, +│ │ │ │ 0 lines) +│ │ │ ├── 📄 tests__more_languages__group2__go_test.go.json (23 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group2__PerlTest.pl.json (23 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group2__PhpTest.php.json (23 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group2__PowershellTest.ps1.json +│ │ │ │ (25 tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group2__ScalaTest.scala.json (24 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group2__test.csv.json (15 tokens, +│ │ │ │ 0 lines) +│ │ │ ├── 📄 tests__more_languages__group3__bash_test.sh.json (23 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group3__cpp_test.cpp.json (24 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group3__csharp_test.cs.json (24 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group3__hallucination.tex.json (25 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group3__ruby_test.rb.json (23 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group3__swift_test.swift.json (25 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group3__test.lean.json (23 tokens, +│ │ │ │ 
0 lines) +│ │ │ ├── 📄 tests__more_languages__group3__test.capnp.json (23 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group3__test.graphql.json (23 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group3__test.proto.json (23 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group3__test.sqlite.json (16 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group3__test_Cargo.toml.json (24 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 +│ │ │ │ tests__more_languages__group3__test_json_rpc_2_0.json.json +│ │ │ │ (25 tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group3__test_openapi.yaml.json (25 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group3__test_openrpc.json.json (25 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group3__test_pyproject.toml.json +│ │ │ │ (25 tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group4__erl_test.erl.json (23 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group4__haskell_test.hs.json (24 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group4__mathematica_test.nb.json +│ │ │ │ (25 tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group4__matlab_test.m.json (23 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group4__RTest.R.json (22 tokens, 0 +│ │ │ │ lines) +│ │ │ ├── 📄 tests__more_languages__group4__rust_test.rs.json (24 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group4__test.zig.json (22 tokens, +│ │ │ │ 0 lines) +│ │ │ ├── 📄 tests__more_languages__group4__test_fsharp.fs.json (24 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group4__test_tcl_tk.tcl.json (24 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group4__tf_test.tf.json (23 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group5__ansible_test.yml.json (24 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group5__app-routing.module.ts.json +│ │ │ │ (26 tokens, 0 lines) +│ │ │ 
├── 📄 tests__more_languages__group5__app.component.ts.json (24 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group5__app.component.spec.ts.json +│ │ │ │ (26 tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group5__app.module.ts.json (24 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group5__checkbox_test.md.json (24 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group5__checkbox_test.txt.json (25 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group5__environment.test.ts.json +│ │ │ │ (25 tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group5__hello_world.pyi.json (24 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group5__k8s_test.yaml.json (24 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group5__Makefile.json (22 tokens, +│ │ │ │ 0 lines) +│ │ │ ├── 📄 tests__more_languages__group5__requirements_test.txt.json +│ │ │ │ (25 tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group5__rust_todo_test.rs.json (24 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group5__sql_test.sql.json (23 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 +│ │ │ │ tests__more_languages__group5__standard-app-routing.module.t +│ │ │ │ s.json (28 tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group5__test.env.json (22 tokens, +│ │ │ │ 0 lines) +│ │ │ ├── 📄 tests__more_languages__group5__testJsonSchema.json.json +│ │ │ │ (25 tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group5__testPackage.json.json (24 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group5__tickets.component.ts.json +│ │ │ │ (26 tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group6__catastrophic.c.json (24 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group6__cpp_examples_impl.cc.json +│ │ │ │ (25 tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group6__cpp_examples_impl.cu.json +│ │ │ │ (25 tokens, 0 lines) +│ │ │ ├── 📄 
tests__more_languages__group6__cpp_examples_impl.h.json +│ │ │ │ (25 tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group6__edge_case.hpp.json (24 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group6__fractal.thy.json (24 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 +│ │ │ │ tests__more_languages__group6__Microsoft.PowerShell_profile. +│ │ │ │ ps1.json (29 tokens, 0 lines) +│ │ │ ├── 📄 +│ │ │ │ tests__more_languages__group6__python_complex_class.py.json +│ │ │ │ (26 tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group6__ramda__cloneRegExp.js.json +│ │ │ │ (25 tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group6__ramda_prop.js.json (24 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group6__tensorflow_flags.h.json +│ │ │ │ (25 tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group6__test.f.json (22 tokens, 0 +│ │ │ │ lines) +│ │ │ ├── 📄 tests__more_languages__group6__torch.rst.json (22 tokens, +│ │ │ │ 0 lines) +│ │ │ ├── 📄 tests__more_languages__group6__yc.html.json (23 tokens, 0 +│ │ │ │ lines) +│ │ │ ├── 📄 tests__more_languages__group7__absurdly_huge.jsonl.json +│ │ │ │ (26 tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group7__angular_crud.ts.json (25 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group7__structure.py.json (23 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group7__test.wgsl.json (23 tokens, +│ │ │ │ 0 lines) +│ │ │ ├── 📄 tests__more_languages__group7__test.metal.json (23 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group_lisp__clojure_test.clj.json +│ │ │ │ (25 tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group_lisp__LispTest.lisp.json (24 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group_lisp__racket_struct.rkt.json +│ │ │ │ (25 tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group_lisp__test_scheme.scm.json +│ │ │ │ (25 tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group_todo__AAPLShaders.metal.json +│ │ │ │ 
(26 tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group_todo__crystal_test.cr.json +│ │ │ │ (25 tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group_todo__dart_test.dart.json +│ │ │ │ (25 tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group_todo__elixir_test.exs.json +│ │ │ │ (25 tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group_todo__forward.frag.json (24 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group_todo__forward.vert.json (24 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group_todo__nodemon.json.json (24 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group_todo__sas_test.sas.json (24 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 +│ │ │ │ tests__more_languages__group_todo__test_setup_py.test.json +│ │ │ │ (26 tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group_todo__testTypings.d.ts.json +│ │ │ │ (25 tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group_todo__vba_test.bas.json (24 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__more_languages__group_todo__wgsl_test.wgsl.json +│ │ │ │ (25 tokens, 0 lines) +│ │ │ ├── 📄 tests__path_to_test__class_method_type.py.json (23 +│ │ │ │ tokens, 0 lines) +│ │ │ ├── 📄 tests__path_to_test__empty.py.json (19 tokens, 0 lines) +│ │ │ ├── 📄 tests__path_to_test__file.md.json (19 tokens, 0 lines) +│ │ │ ├── 📄 tests__path_to_test__file.py.json (19 tokens, 0 lines) +│ │ │ ├── 📄 tests__path_to_test__file.txt.json (20 tokens, 0 lines) +│ │ │ └── 📄 tests__path_to_test__version.py.json (20 tokens, 0 lines) +│ │ ├── 📁 trees (1 folder, 13 files) +│ │ │ ├── 📄 dot_dot.txt (99 tokens, 10 lines) +│ │ │ ├── 📄 more_languages.txt (22,977 tokens, 2,225 lines) +│ │ │ ├── 📄 more_languages_group1.txt (2,909 tokens, 374 lines) +│ │ │ ├── 📄 more_languages_group2.txt (1,111 tokens, 135 lines) +│ │ │ ├── 📄 more_languages_group3.txt (3,035 tokens, 346 lines) +│ │ │ ├── 📄 more_languages_group4.txt (1,737 tokens, 214 lines) +│ │ │ ├── 📄 more_languages_group5.txt (2,288 
tokens, 257 lines) +│ │ │ ├── 📄 more_languages_group6.txt (6,838 tokens, 615 lines) +│ │ │ ├── 📄 more_languages_group7.txt (1,148 tokens, 126 lines) +│ │ │ ├── 📄 more_languages_group_lisp.txt (155 tokens, 22 lines) +│ │ │ ├── 📄 more_languages_group_todo.txt (1,491 tokens, 111 lines) +│ │ │ ├── 📄 multi_seed.txt (3,789 tokens, 427 lines) +│ │ │ └── 📄 path_to_test.txt (445 tokens, 51 lines) +│ │ └── 📁 trees_v1 (1 folder, 13 files) +│ │ ├── 📄 dot_dot.txt (99 tokens, 10 lines) +│ │ ├── 📄 more_languages.txt (10,782 tokens, 1,122 lines) +│ │ ├── 📄 more_languages_group1.txt (1,130 tokens, 146 lines) +│ │ ├── 📄 more_languages_group2.txt (415 tokens, 54 lines) +│ │ ├── 📄 more_languages_group3.txt (1,240 tokens, 149 lines) +│ │ ├── 📄 more_languages_group4.txt (1,020 tokens, 123 lines) +│ │ ├── 📄 more_languages_group5.txt (2,099 tokens, 235 lines) +│ │ ├── 📄 more_languages_group6.txt (2,821 tokens, 300 lines) +│ │ ├── 📄 more_languages_group7.txt (705 tokens, 83 lines) +│ │ ├── 📄 more_languages_group_lisp.txt (52 tokens, 5 lines) +│ │ ├── 📄 more_languages_group_todo.txt (140 tokens, 13 lines) +│ │ ├── 📄 multi_seed.txt (1,783 tokens, 199 lines) +│ │ └── 📄 path_to_test.txt (445 tokens, 51 lines) +│ ├── 📁 more_languages (10 folders, 99 files) +│ │ ├── 📁 group1 (1 folder, 11 files) +│ │ │ ├── 📄 addamt.cobol (441 tokens, 40 lines) +│ │ │ ├── 📄 CUSTOMER-INVOICE.CBL (412 tokens, 60 lines) +│ │ │ ├── 📄 JavaTest.java (578 tokens, 86 lines) +│ │ │ ├── 📄 JuliaTest.jl (381 tokens, 63 lines) +│ │ │ ├── 📄 KotlinTest.kt (974 tokens, 171 lines) +│ │ │ ├── 📄 lesson.cbl (635 tokens, 78 lines) +│ │ │ ├── 📄 LuaTest.lua (83 tokens, 16 lines) +│ │ │ ├── 📄 ObjectiveCTest.m (62 tokens, 16 lines) +│ │ │ ├── 📄 OcamlTest.ml (49 tokens, 12 lines) +│ │ │ ├── 📄 test.js (757 tokens, 154 lines) +│ │ │ └── 📄 test.ts (832 tokens, 165 lines) +│ │ ├── 📁 group2 (1 folder, 8 files) +│ │ │ ├── 📄 apl_test.apl (28 tokens, 5 lines) +│ │ │ ├── 📄 c_test.c (837 tokens, 142 lines) +│ │ │ ├── 📄 go_test.go (179 tokens, 46 
lines) +│ │ │ ├── 📄 PerlTest.pl (63 tokens, 20 lines) +│ │ │ ├── 📄 PhpTest.php (70 tokens, 19 lines) +│ │ │ ├── 📄 PowershellTest.ps1 (459 tokens, 89 lines) +│ │ │ ├── 📄 ScalaTest.scala (171 tokens, 40 lines) +│ │ │ └── 📄 test.csv (0 tokens, 0 lines) +│ │ ├── 📁 group3 (1 folder, 16 files) +│ │ │ ├── 📄 bash_test.sh (127 tokens, 22 lines) +│ │ │ ├── 📄 cpp_test.cpp (1,670 tokens, 259 lines) +│ │ │ ├── 📄 csharp_test.cs (957 tokens, 146 lines) +│ │ │ ├── 📄 hallucination.tex (1,633 tokens, 126 lines) +│ │ │ ├── 📄 ruby_test.rb (138 tokens, 37 lines) +│ │ │ ├── 📄 swift_test.swift (469 tokens, 110 lines) +│ │ │ ├── 📄 test.lean (289 tokens, 42 lines) +│ │ │ ├── 📄 test.capnp (117 tokens, 30 lines) +│ │ │ ├── 📄 test.graphql (66 tokens, 21 lines) +│ │ │ ├── 📄 test.proto (142 tokens, 34 lines) +│ │ │ ├── 📄 test.sqlite (0 tokens, 0 lines) +│ │ │ ├── 📄 test_Cargo.toml (119 tokens, 18 lines) +│ │ │ ├── 📄 test_json_rpc_2_0.json (26 tokens, 6 lines) +│ │ │ ├── 📄 test_openapi.yaml (753 tokens, 92 lines) +│ │ │ ├── 📄 test_openrpc.json (225 tokens, 44 lines) +│ │ │ └── 📄 test_pyproject.toml (304 tokens, 39 lines) +│ │ ├── 📁 group4 (1 folder, 10 files) +│ │ │ ├── 📄 erl_test.erl (480 tokens, 68 lines) +│ │ │ ├── 📄 haskell_test.hs (414 tokens, 41 lines) +│ │ │ ├── 📄 mathematica_test.nb (133 tokens, 21 lines) +│ │ │ ├── 📄 matlab_test.m (48 tokens, 12 lines) +│ │ │ ├── 📄 RTest.R (367 tokens, 46 lines) +│ │ │ ├── 📄 rust_test.rs (1,368 tokens, 259 lines) +│ │ │ ├── 📄 test.zig (397 tokens, 60 lines) +│ │ │ ├── 📄 test_fsharp.fs (92 tokens, 27 lines) +│ │ │ ├── 📄 test_tcl_tk.tcl (54 tokens, 16 lines) +│ │ │ └── 📄 tf_test.tf (202 tokens, 38 lines) +│ │ ├── 📁 group5 (1 folder, 19 files) +│ │ │ ├── 📄 ansible_test.yml (55 tokens, 14 lines) +│ │ │ ├── 📄 app-routing.module.ts (287 tokens, 28 lines) +│ │ │ ├── 📄 app.component.spec.ts (410 tokens, 47 lines) +│ │ │ ├── 📄 app.component.ts (271 tokens, 45 lines) +│ │ │ ├── 📄 app.module.ts (374 tokens, 43 lines) +│ │ │ ├── 📄 checkbox_test.md (191 tokens, 29 
lines) +│ │ │ ├── 📄 checkbox_test.txt (257 tokens, 33 lines) +│ │ │ ├── 📄 environment.test.ts (197 tokens, 19 lines) +│ │ │ ├── 📄 hello_world.pyi (22 tokens, 3 lines) +│ │ │ ├── 📄 k8s_test.yaml (140 tokens, 37 lines) +│ │ │ ├── 📄 Makefile (714 tokens, 84 lines) +│ │ │ ├── 📄 requirements_test.txt (29 tokens, 10 lines) +│ │ │ ├── 📄 rust_todo_test.rs (92 tokens, 26 lines) +│ │ │ ├── 📄 sql_test.sql (270 tokens, 51 lines) +│ │ │ ├── 📄 standard-app-routing.module.ts (100 tokens, 16 lines) +│ │ │ ├── 📄 test.env (190 tokens, 25 lines) +│ │ │ ├── 📄 testJsonSchema.json (421 tokens, 48 lines) +│ │ │ ├── 📄 testPackage.json (349 tokens, 43 lines) +│ │ │ └── 📄 tickets.component.ts (7,160 tokens, 903 lines) +│ │ ├── 📁 group6 (1 folder, 14 files) +│ │ │ ├── 📄 catastrophic.c (5,339 tokens, 754 lines) +│ │ │ ├── 📄 cpp_examples_impl.cc (60 tokens, 10 lines) +│ │ │ ├── 📄 cpp_examples_impl.cu (37 tokens, 10 lines) +│ │ │ ├── 📄 cpp_examples_impl.h (22 tokens, 6 lines) +│ │ │ ├── 📄 edge_case.hpp (426 tokens, 28 lines) +│ │ │ ├── 📄 fractal.thy (1,712 tokens, 147 lines) +│ │ │ ├── 📄 Microsoft.PowerShell_profile.ps1 (3,346 tokens, 497 lines) +│ │ │ ├── 📄 python_complex_class.py (10 tokens, 2 lines) +│ │ │ ├── 📄 ramda__cloneRegExp.js (173 tokens, 9 lines) +│ │ │ ├── 📄 ramda_prop.js (646 tokens, 85 lines) +│ │ │ ├── 📄 tensorflow_flags.h (7,628 tokens, 668 lines) +│ │ │ ├── 📄 test.f (181 tokens, 30 lines) +│ │ │ ├── 📄 torch.rst (60 tokens, 8 lines) +│ │ │ └── 📄 yc.html (9,063 tokens, 169 lines) +│ │ ├── 📁 group7 (1 folder, 5 files) +│ │ │ ├── 📄 absurdly_huge.jsonl (8,347 tokens, 126 lines) +│ │ │ ├── 📄 angular_crud.ts (1,192 tokens, 148 lines) +│ │ │ ├── 📄 structure.py (400 tokens, 92 lines) +│ │ │ ├── 📄 test.wgsl (528 tokens, 87 lines) +│ │ │ └── 📄 test.metal (272 tokens, 34 lines) +│ │ ├── 📁 group_lisp (1 folder, 4 files) +│ │ │ ├── 📄 clojure_test.clj (682 tokens, 85 lines) +│ │ │ ├── 📄 LispTest.lisp (25 tokens, 6 lines) +│ │ │ ├── 📄 racket_struct.rkt (14 tokens, 1 line) +│ │ │ └── 📄 
test_scheme.scm (360 tokens, 44 lines) +│ │ └── 📁 group_todo (1 folder, 12 files) +│ │ ├── 📄 AAPLShaders.metal (5,780 tokens, 566 lines) +│ │ ├── 📄 crystal_test.cr (48 tokens, 15 lines) +│ │ ├── 📄 dart_test.dart (108 tokens, 24 lines) +│ │ ├── 📄 elixir_test.exs (39 tokens, 10 lines) +│ │ ├── 📄 forward.frag (739 tokens, 87 lines) +│ │ ├── 📄 forward.vert (359 tokens, 48 lines) +│ │ ├── 📄 nodemon.json (118 tokens, 20 lines) +│ │ ├── 📄 sas_test.sas (97 tokens, 22 lines) +│ │ ├── 📄 test_setup_py.test (133 tokens, 24 lines) +│ │ ├── 📄 testTypings.d.ts (158 tokens, 23 lines) +│ │ ├── 📄 vba_test.bas (67 tokens, 16 lines) +│ │ └── 📄 wgsl_test.wgsl (94 tokens, 17 lines) +│ ├── 📁 path_to_test (1 folder, 6 files) +│ │ ├── 📄 class_method_type.py (525 tokens, 101 lines) +│ │ ├── 📄 empty.py (0 tokens, 0 lines) +│ │ ├── 📄 file.md (11 tokens, 2 lines) +│ │ ├── 📄 file.py (18 tokens, 3 lines) +│ │ ├── 📄 file.txt (10 tokens, 2 lines) +│ │ └── 📄 version.py (13 tokens, 2 lines) +│ ├── 📄 pytest.ini (20 tokens, 4 lines) +│ ├── 📁 readme_updates (1 folder, 6 files) +│ │ ├── 📄 .gitkeep (0 tokens, 0 lines) +│ │ ├── 📄 medium_README_sink.md (37,636 tokens, 3,733 lines) +│ │ ├── 📄 medium_README_source.md (149 tokens, 48 lines) +│ │ ├── 📄 mini_README_sink.md (7,833 tokens, 785 lines) +│ │ ├── 📄 mini_README_source.md (27 tokens, 8 lines) +│ │ └── 📄 renamed_dry_run_README.md (38,153 tokens, 3,705 lines) +│ ├── 📄 tensorflow_expectation.py (2,554 tokens, 97 lines) +│ ├── 📄 test_cli.py (2,218 tokens, 252 lines) +│ ├── 📄 test_deploy.py (2,045 tokens, 225 lines) +│ ├── 📄 test_dotenv.py (45 tokens, 10 lines) +│ ├── 📄 test_e2e.py (3,630 tokens, 429 lines) +│ ├── 📄 test_engine.py (3,869 tokens, 424 lines) +│ ├── 📄 test_more_language_units.py (24,414 tokens, 2,559 lines) +│ ├── 📄 test_programs.py (509 tokens, 88 lines) +│ ├── 📄 test_units.py (1,914 tokens, 261 lines) +│ ├── 📄 test_web.py (42 tokens, 6 lines) +│ └── 📁 version_increments (1 folder, 3 files) +│ ├── 📄 renamed_dry_run_version.py (12 tokens, 1 
line) +│ ├── 📄 test_sink_version.py (12 tokens, 1 line) +│ └── 📄 test_source_version.py (12 tokens, 2 lines) +├── 📄 tree_plus_cli.py (2,283 tokens, 331 lines) +├── 📁 tree_plus_programs (1 folder, 4 files) +│ ├── 📄 hello_tree_plus.py (545 tokens, 80 lines) +│ ├── 📄 rewrite.py (4,017 tokens, 471 lines) +│ ├── 📄 stub_tests.py (1,348 tokens, 180 lines) +│ └── 📄 test_stub_tests.py (79 tokens, 20 lines) +└── 📁 tree_plus_src (2 folders, 10 files) + ├── 📄 count_tokens_lines.py (1,323 tokens, 209 lines) + ├── 📄 debug.py (186 tokens, 39 lines) + ├── 📄 deploy.py (2,058 tokens, 230 lines) + ├── 📄 engine.py (12,042 tokens, 1,438 lines) + ├── 📄 ignore.py (2,342 tokens, 332 lines) + ├── 📄 isabelle_symbols.py (2,146 tokens, 462 lines) + ├── 📄 parse_file.py (27,040 tokens, 2,959 lines) + ├── 📁 scripts (1 folder, 1 file) + │ └── 📄 alias_tree_plus.sh (241 tokens, 30 lines) + ├── 📄 version.py (12 tokens, 1 line) + └── 📄 web.py (2,409 tokens, 321 lines) diff --git a/tests/golden/legacy/trees_v1/dot_dot.txt b/tests/golden/legacy/trees_v1/dot_dot.txt new file mode 100644 index 0000000..d6db06d --- /dev/null +++ b/tests/golden/legacy/trees_v1/dot_dot.txt @@ -0,0 +1,10 @@ +📁 dot_dot (2 folders, 4 files) +├── 📄 my_test_file.py (7 tokens, 2 lines) +│ └── def dot_dot_dot() +└── 📁 nested_dir (1 folder, 3 files) + ├── 📄 .env.test (4 tokens, 0 lines) + │ └── DEBUG_TREE_PLUS + ├── 📄 pytest.ini (20 tokens, 4 lines) + └── 📄 test_tp_dotdot.py (362 tokens, 52 lines) + ├── def ignore_tokens_lines_test(text: str) -> str + └── def test_tree_plus_dotdot() diff --git a/tests/golden/legacy/trees_v1/more_languages.txt b/tests/golden/legacy/trees_v1/more_languages.txt new file mode 100644 index 0000000..a1de0a9 --- /dev/null +++ b/tests/golden/legacy/trees_v1/more_languages.txt @@ -0,0 +1,1122 @@ +📁 more_languages (10 folders, 99 files) +├── 📁 group1 (1 folder, 11 files) +│ ├── 📄 addamt.cobol (441 tokens, 40 lines) +│ ├── 📄 CUSTOMER-INVOICE.CBL (412 tokens, 60 lines) +│ ├── 📄 JavaTest.java (578 tokens, 86 
lines) +│ ├── 📄 JuliaTest.jl (381 tokens, 63 lines) +│ ├── 📄 KotlinTest.kt (974 tokens, 171 lines) +│ ├── 📄 lesson.cbl (635 tokens, 78 lines) +│ ├── 📄 LuaTest.lua (83 tokens, 16 lines) +│ ├── 📄 ObjectiveCTest.m (62 tokens, 16 lines) +│ ├── 📄 OcamlTest.ml (49 tokens, 12 lines) +│ ├── 📄 test.js (757 tokens, 154 lines) +│ │ ├── class MyClass +│ │ ├── myMethod() +│ │ ├── async asyncMethod(a, b) +│ │ ├── methodWithDefaultParameters(a = 5, b = 10) +│ │ ├── multilineMethod( +│ │ │ c, +│ │ │ d +│ │ │ ) +│ │ ├── multilineMethodWithDefaults( +│ │ │ t = "tree", +│ │ │ p = "plus" +│ │ │ ) +│ │ ├── function myFunction(param1, param2) +│ │ ├── function multilineFunction( +│ │ │ param1, +│ │ │ param2 +│ │ │ ) +│ │ ├── const arrowFunction = () => +│ │ ├── const parametricArrow = (a, b) => +│ │ ├── function () +│ │ ├── function outerFunction(outerParam) +│ │ ├── function innerFunction(innerParam) +│ │ ├── innerFunction("inner") +│ │ ├── const myObject = { +│ │ ├── myMethod: function (stuff) +│ │ ├── let myArrowObject = { +│ │ ├── myArrow: ({ +│ │ │ a, +│ │ │ b, +│ │ │ c, +│ │ │ }) => +│ │ ├── const myAsyncArrowFunction = async () => +│ │ ├── function functionWithRestParameters(...args) +│ │ ├── const namedFunctionExpression = function myNamedFunction() +│ │ ├── const multilineArrowFunction = ( +│ │ │ a, +│ │ │ b +│ │ │ ) => +│ │ ├── function functionReturningFunction() +│ │ ├── return function () +│ │ ├── function destructuringOnMultipleLines({ +│ │ │ a, +│ │ │ b, +│ │ │ }) +│ │ ├── const arrowFunctionWithDestructuring = ({ a, b }) => +│ │ ├── const multilineDestructuringArrow = ({ +│ │ │ a, +│ │ │ b, +│ │ │ }) => +│ │ ├── async function asyncFunctionWithErrorHandling() +│ │ ├── class Car +│ │ ├── constructor(brand) +│ │ ├── present() +│ │ ├── class Model extends Car +│ │ ├── constructor(brand, mod) +│ │ ├── super(brand) +│ │ └── show() +│ └── 📄 test.ts (832 tokens, 165 lines) +│ ├── type MyType +│ ├── interface MyInterface +│ ├── class TsClass +│ ├── myMethod() +│ ├── 
myMethodWithArgs(param1: string, param2: number): void +│ ├── static myStaticMethod(param: T): T +│ ├── multilineMethod( +│ │ c: number, +│ │ d: number +│ │ ): number +│ ├── multilineMethodWithDefaults( +│ │ t: string = "tree", +│ │ p: string = "plus" +│ │ ): string +│ ├── export class AdvancedComponent implements MyInterface +│ ├── async myAsyncMethod( +│ │ a: string, +│ │ b: number, +│ │ c: string +│ │ ): Promise +│ ├── genericMethod( +│ │ arg1: T, +│ │ arg2: U +│ │ ): [T, U] +│ ├── export class TicketsComponent implements MyInterface +│ ├── async myAsyncMethod({ a, b, c }: { a: String; b: Number; c: String +│ │ }) +│ ├── function tsFunction() +│ ├── function tsFunctionSigned( +│ │ param1: number, +│ │ param2: number +│ │ ): void +│ ├── export default async function tsFunctionComplicated({ +│ │ a = 1 | 2, +│ │ b = "bob", +│ │ c = async () => "charlie", +│ │ }: { +│ │ a: number; +│ │ b: string; +│ │ c: () => Promise; +│ │ }): Promise +│ ├── const tsArrowFunctionSigned = ({ +│ │ a, +│ │ b, +│ │ }: { +│ │ a: number; +│ │ b: string; +│ │ }) => +│ ├── export const tsComplicatedArrow = async ({ +│ │ a = 1 | 2, +│ │ b = "bob", +│ │ c = async () => "charlie", +│ │ }: { +│ │ a: number; +│ │ b: string; +│ │ c: () => Promise; +│ │ }): Promise => +│ ├── const arrowFunction = () => +│ ├── const arrow = (a: String, b: Number) => +│ ├── const asyncArrowFunction = async () => +│ ├── const asyncArrow = async (a: String, b: Number) => +│ ├── let weirdArrow = () => +│ ├── const asyncPromiseArrow = async (): Promise => +│ ├── let myWeirdArrowSigned = (x: number): number => +│ ├── class Person +│ ├── constructor(private firstName: string, private lastName: string) +│ ├── getFullName(): string +│ ├── describe(): string +│ ├── class Employee extends Person +│ ├── constructor( +│ │ firstName: string, +│ │ lastName: string, +│ │ private jobTitle: string +│ │ ) +│ ├── super(firstName, lastName) +│ ├── describe(): string +│ ├── interface Shape +│ └── interface Square extends Shape +├── 📁 
group2 (1 folder, 8 files) +│ ├── 📄 apl_test.apl (28 tokens, 5 lines) +│ ├── 📄 c_test.c (837 tokens, 142 lines) +│ │ ├── struct Point +│ │ ├── int x; +│ │ ├── int y; +│ │ ├── struct Point getOrigin() +│ │ ├── float mul_two_floats(float x1, float x2) +│ │ ├── enum days +│ │ ├── SUN, +│ │ ├── MON, +│ │ ├── TUE, +│ │ ├── WED, +│ │ ├── THU, +│ │ ├── FRI, +│ │ ├── SAT +│ │ ├── long add_two_longs(long x1, long x2) +│ │ ├── double multiplyByTwo(double num) +│ │ ├── char getFirstCharacter(char *str) +│ │ ├── void greet(Person p) +│ │ ├── typedef struct +│ │ ├── char name[50]; +│ │ ├── } Person; +│ │ ├── int main() +│ │ ├── int* getArrayStart(int arr[], int size) +│ │ ├── long complexFunctionWithMultipleArguments( +│ │ │ int param1, +│ │ │ double param2, +│ │ │ char *param3, +│ │ │ struct Point point +│ │ │ ) +│ │ ├── keyPattern *ACLKeyPatternCreate(sds pattern, int flags) +│ │ ├── sds sdsCatPatternString(sds base, keyPattern *pat) +│ │ ├── static int ACLCheckChannelAgainstList(list *reference, const char +│ │ │ *channel, int channellen, int is_pattern) +│ │ ├── while((ln = listNext(&li))) +│ │ ├── static struct config +│ │ ├── aeEventLoop *el; +│ │ ├── cliConnInfo conn_info; +│ │ ├── const char *hostsocket; +│ │ ├── int tls; +│ │ ├── struct cliSSLconfig sslconfig; +│ │ └── } config; +│ ├── 📄 go_test.go (179 tokens, 46 lines) +│ ├── 📄 PerlTest.pl (63 tokens, 20 lines) +│ ├── 📄 PhpTest.php (70 tokens, 19 lines) +│ ├── 📄 PowershellTest.ps1 (459 tokens, 89 lines) +│ ├── 📄 ScalaTest.scala (171 tokens, 40 lines) +│ └── 📄 test.csv (0 tokens, 0 lines) +│ ├── Name +│ ├── Age +│ ├── Country +│ ├── City +│ └── Email +├── 📁 group3 (1 folder, 16 files) +│ ├── 📄 bash_test.sh (127 tokens, 22 lines) +│ ├── 📄 cpp_test.cpp (1,670 tokens, 259 lines) +│ │ ├── class Person +│ │ ├── std::string name; +│ │ ├── public: +│ │ ├── Person(std::string n) : name(n) +│ │ ├── void greet() +│ │ ├── void globalGreet() +│ │ ├── int main() +│ │ ├── void printMessage(const std::string &message) +│ │ ├── 
template +│ │ │ void printVector(const std::vector& vec) +│ │ ├── struct Point +│ │ ├── int x, y; +│ │ ├── Point(int x, int y) : x(x), y(y) +│ │ ├── class Animal +│ │ ├── public: +│ │ ├── Animal(const std::string &name) : name(name) +│ │ ├── virtual void speak() const +│ │ ├── virtual ~Animal() +│ │ ├── protected: +│ │ ├── std::string name; +│ │ ├── class Dog : public Animal +│ │ ├── public: +│ │ ├── Dog(const std::string &name) : Animal(name) +│ │ ├── void speak() const override +│ │ ├── class Cat : public Animal +│ │ ├── public: +│ │ ├── Cat(const std::string &name) : Animal(name) +│ │ ├── void speak() const override +│ │ ├── nb::bytes BuildRnnDescriptor(int input_size, int hidden_size, int +│ │ │ num_layers, +│ │ │ int batch_size, int max_seq_length, +│ │ │ float dropout, +│ │ │ bool bidirectional, bool +│ │ │ cudnn_allow_tf32, +│ │ │ int workspace_size, int +│ │ │ reserve_space_size) +│ │ ├── int main() +│ │ ├── enum ECarTypes +│ │ ├── Sedan, +│ │ ├── Hatchback, +│ │ ├── SUV, +│ │ ├── Wagon +│ │ ├── ECarTypes GetPreferredCarType() +│ │ ├── enum ECarTypes : uint8_t +│ │ ├── Sedan, +│ │ ├── Hatchback, +│ │ ├── SUV = 254, +│ │ ├── Hybrid +│ │ ├── enum class ECarTypes : uint8_t +│ │ ├── Sedan, +│ │ ├── Hatchback, +│ │ ├── SUV = 254, +│ │ ├── Hybrid +│ │ ├── void myFunction(string fname, int age) +│ │ ├── template T cos(T) +│ │ ├── template T sin(T) +│ │ ├── template T sqrt(T) +│ │ ├── template struct VLEN +│ │ ├── template class arr +│ │ ├── private: +│ │ ├── static T *ralloc(size_t num) +│ │ ├── static void dealloc(T *ptr) +│ │ ├── static T *ralloc(size_t num) +│ │ ├── static void dealloc(T *ptr) +│ │ ├── public: +│ │ ├── arr() : p(0), sz(0) +│ │ ├── arr(size_t n) : p(ralloc(n)), sz(n) +│ │ ├── arr(arr &&other) +│ │ │ : p(other.p), sz(other.sz) +│ │ ├── ~arr() +│ │ ├── void resize(size_t n) +│ │ ├── T &operator[](size_t idx) +│ │ ├── T *data() +│ │ ├── size_t size() const +│ │ ├── class Buffer +│ │ ├── private: +│ │ ├── void* ptr_; +│ │ └── std::tuple quantize( +│ 
│ const array& w, +│ │ int group_size, +│ │ int bits, +│ │ StreamOrDevice s) +│ ├── 📄 csharp_test.cs (957 tokens, 146 lines) +│ ├── 📄 hallucination.tex (1,633 tokens, 126 lines) +│ ├── 📄 ruby_test.rb (138 tokens, 37 lines) +│ ├── 📄 swift_test.swift (469 tokens, 110 lines) +│ ├── 📄 test.lean (289 tokens, 42 lines) +│ ├── 📄 test.capnp (117 tokens, 30 lines) +│ ├── 📄 test.graphql (66 tokens, 21 lines) +│ ├── 📄 test.proto (142 tokens, 34 lines) +│ ├── 📄 test.sqlite (0 tokens, 0 lines) +│ │ ├── students table: +│ │ ├── id integer primary key +│ │ ├── name text not null +│ │ ├── age integer not null +│ │ ├── courses table: +│ │ ├── id integer primary key +│ │ ├── title text not null +│ │ └── credits integer not null +│ ├── 📄 test_Cargo.toml (119 tokens, 18 lines) +│ │ ├── name: test_cargo +│ │ ├── version: 0.1.0 +│ │ ├── description: A test Cargo.toml +│ │ ├── license: MIT OR Apache-2.0 +│ │ ├── dependencies: +│ │ ├── clap 4.4 +│ │ └── sqlx 0.7 (features: runtime-tokio, tls-rustls) +│ ├── 📄 test_json_rpc_2_0.json (26 tokens, 6 lines) +│ │ ├── jsonrpc: 2.0 +│ │ ├── method: subtract +│ │ ├── params: +│ │ ├── minuend: 42 +│ │ ├── subtrahend: 23 +│ │ └── id: 1 +│ ├── 📄 test_openapi.yaml (753 tokens, 92 lines) +│ │ ├── openapi: 3.0.1 +│ │ ├── title: TODO Plugin +│ │ ├── description: A plugin to create and manage TODO lists using +│ │ │ ChatGPT. 
+│ │ ├── version: v1 +│ │ ├── servers: +│ │ ├── - url: PLUGIN_HOSTNAME +│ │ ├── paths: +│ │ ├── '/todos/{username}': +│ │ ├── GET (getTodos): Get the list of todos +│ │ ├── POST (addTodo): Add a todo to the list +│ │ └── DELETE (deleteTodo): Delete a todo from the list +│ ├── 📄 test_openrpc.json (225 tokens, 44 lines) +│ │ ├── openrpc: 1.2.1 +│ │ ├── info: +│ │ ├── title: Demo Petstore +│ │ ├── version: 1.0.0 +│ │ ├── methods: +│ │ ├── listPets: List all pets +│ │ ├── params: +│ │ ├── - limit: integer +│ │ └── result: pets = An array of pets +│ └── 📄 test_pyproject.toml (304 tokens, 39 lines) +│ ├── name: tree_plus +│ ├── version: 1.0.8 +│ ├── description: A `tree` util enhanced with tokens, lines, and +│ │ components. +│ ├── License :: OSI Approved :: Apache Software License +│ ├── License :: OSI Approved :: MIT License +│ ├── dependencies: +│ ├── tiktoken +│ ├── PyYAML +│ ├── click +│ ├── rich +│ └── tomli +├── 📁 group4 (1 folder, 10 files) +│ ├── 📄 erl_test.erl (480 tokens, 68 lines) +│ ├── 📄 haskell_test.hs (414 tokens, 41 lines) +│ ├── 📄 mathematica_test.nb (133 tokens, 21 lines) +│ ├── 📄 matlab_test.m (48 tokens, 12 lines) +│ ├── 📄 RTest.R (367 tokens, 46 lines) +│ ├── 📄 rust_test.rs (1,368 tokens, 259 lines) +│ │ ├── fn at_beginning<'a>(&'a str) +│ │ ├── pub enum Days { +│ │ │ #\[default] +│ │ │ Sun, +│ │ │ Mon, +│ │ │ #\[error("edge case {idx}, expected at least {} and at most {}", +│ │ │ .limits.lo, .limits.hi)] +│ │ │ Tue, +│ │ │ Wed, +│ │ │ Thu(i16, bool), +│ │ │ Fri { day: u8 }, +│ │ │ Sat { +│ │ │ urday: String, +│ │ │ edge_case: E, +│ │ │ }, +│ │ │ } +│ │ ├── struct Point +│ │ ├── impl Point +│ │ ├── fn get_origin() -> Point +│ │ ├── struct Person +│ │ ├── impl Person +│ │ ├── fn greet(&self) +│ │ ├── fn add_two_longs(x1: i64, x2: i64) -> i64 +│ │ ├── fn add_two_longs_longer( +│ │ │ x1: i64, +│ │ │ x2: i64, +│ │ │ ) -> i64 +│ │ ├── const fn multiply_by_two(num: f64) -> f64 +│ │ ├── fn get_first_character(s: &str) -> Option +│ │ ├── trait Drawable +│ │ 
├── fn draw(&self) +│ │ ├── impl Drawable for Point +│ │ ├── fn draw(&self) +│ │ ├── fn with_generic(d: D) +│ │ ├── fn with_generic(d: D) +│ │ │ where +│ │ │ D: Drawable +│ │ ├── fn main() +│ │ ├── pub struct VisibleStruct +│ │ ├── mod my_module +│ │ ├── pub struct AlsoVisibleStruct(T, T) +│ │ ├── macro_rules! say_hello +│ │ ├── #[macro_export] +│ │ │ macro_rules! hello_tree_plus +│ │ ├── pub mod lib +│ │ ├── pub mod interfaces +│ │ ├── mod engine +│ │ ├── pub fn flow( +│ │ │ source: S1, +│ │ │ extractor: E, +│ │ │ inbox: S2, +│ │ │ transformer: T, +│ │ │ outbox: S3, +│ │ │ loader: L, +│ │ │ sink: &mut S4, +│ │ │ ) -> Result<(), Box> +│ │ │ where +│ │ │ S1: Extractable, +│ │ │ S2: Extractable + Loadable, +│ │ │ S3: Extractable + Loadable, +│ │ │ S4: Loadable, +│ │ │ E: Extractor, +│ │ │ T: Transformer, +│ │ │ L: Loader +│ │ ├── trait Container +│ │ ├── fn items(&self) -> impl Iterator +│ │ ├── trait HttpService +│ │ ├── async fn fetch(&self, url: Url) -> HtmlBody +│ │ ├── struct Pair +│ │ ├── trait Transformer +│ │ ├── fn transform(&self, input: T) -> T +│ │ ├── impl + Copy> Transformer for Pair +│ │ ├── fn transform(&self, input: T) -> T +│ │ ├── fn main() +│ │ ├── async fn handle_get(State(pool): State) -> +│ │ │ Result, (StatusCode, String)> +│ │ │ where +│ │ │ Bion: Cool +│ │ ├── #[macro_export] +│ │ │ macro_rules! 
unit +│ │ ├── fn insert( +│ │ │ &mut self, +│ │ │ key: (), +│ │ │ value: $unit_dtype, +│ │ │ ) -> Result, ETLError> +│ │ ├── pub async fn handle_get_axum_route( +│ │ │ Session { maybe_claims }: Session, +│ │ │ Path(RouteParams { +│ │ │ alpha, +│ │ │ bravo, +│ │ │ charlie, +│ │ │ edge_case +│ │ │ }): Path, +│ │ │ ) -> ServerResult +│ │ ├── fn encode_pipeline(cmds: &[Cmd], atomic: bool) -> Vec +│ │ ├── pub async fn handle_post_yeet( +│ │ │ State(auth_backend): State, +│ │ │ Session { maybe_claims }: Session, +│ │ │ Form(yeet_form): Form, +│ │ │ ) -> Result +│ │ └── pub async fn handle_get_thingy( +│ │ session: Session, +│ │ State(ApiBackend { +│ │ page_cache, +│ │ auth_backend, +│ │ library_sql, +│ │ some_data_cache, +│ │ metadata_cache, +│ │ thingy_client, +│ │ .. +│ │ }): State, +│ │ ) -> ServerResult +│ ├── 📄 test.zig (397 tokens, 60 lines) +│ ├── 📄 test_fsharp.fs (92 tokens, 27 lines) +│ ├── 📄 test_tcl_tk.tcl (54 tokens, 16 lines) +│ └── 📄 tf_test.tf (202 tokens, 38 lines) +├── 📁 group5 (1 folder, 19 files) +│ ├── 📄 ansible_test.yml (55 tokens, 14 lines) +│ │ ├── Install package +│ │ ├── Start service +│ │ └── Create user +│ ├── 📄 app-routing.module.ts (287 tokens, 28 lines) +│ │ ├── const routes: Routes = [ +│ │ │ { path: '', redirectTo: 'login', pathMatch: 'full' }, +│ │ │ { path: '*', redirectTo: 'login' }, +│ │ │ { path: 'home', component: HomeComponent }, +│ │ │ { path: 'login', component: LoginComponent }, +│ │ │ { path: 'register', component: RegisterComponent }, +│ │ │ { path: 'events', component: EventsComponent }, +│ │ │ { path: 'invites', component: InvitesComponent }, +│ │ │ { path: 'rewards', component: RewardsComponent }, +│ │ │ { path: 'profile', component: ProfileComponent }, +│ │ │ ]; +│ │ └── export class AppRoutingModule +│ ├── 📄 app.component.spec.ts (410 tokens, 47 lines) +│ │ ├── describe 'AppComponent' +│ │ ├── it should create the app +│ │ ├── it should welcome the user +│ │ ├── it should welcome 'Jimbo' +│ │ └── it should request login if 
not logged in +│ ├── 📄 app.component.ts (271 tokens, 45 lines) +│ │ ├── export class AppComponent +│ │ ├── constructor( +│ │ │ private http: HttpClient, +│ │ │ private loginService: LoginService, +│ │ │ private stripeService: StripeService +│ │ │ ) +│ │ ├── constructor(private loginService: LoginService) +│ │ ├── checkSession() +│ │ ├── async goToEvent(event_id: string) +│ │ └── valInvitedBy(event: any, event_id: string) +│ ├── 📄 app.module.ts (374 tokens, 43 lines) +│ │ ├── @NgModule({ +│ │ │ declarations: [ +│ │ │ AppComponent, +│ │ │ HomeComponent, +│ │ │ LoginComponent, +│ │ │ RegisterComponent, +│ │ │ EventsComponent, +│ │ │ InvitesComponent, +│ │ │ RewardsComponent, +│ │ │ ProfileComponent +│ │ └── export class AppModule +│ ├── 📄 checkbox_test.md (191 tokens, 29 lines) +│ │ ├── # My Checkbox Test +│ │ ├── ## My No Parens Test +│ │ ├── ## My Empty href Test +│ │ ├── ## My other url Test [Q&A] +│ │ ├── ## My other other url Test [Q&A] +│ │ ├── ## My 2nd other url Test [Q&A] +│ │ ├── ## My 3rd other url Test [Q&A] +│ │ ├── - [ ] Task 1 +│ │ ├── - [ ] No Space Task 1.1 +│ │ ├── - [ ] Two Spaces Task 1.2 +│ │ ├── - [ ] Subtask 1.2.1 +│ │ ├── - [ ] Task 2 +│ │ ├── - [x] Task 3 +│ │ ├── - [ ] Subtask 3.1 +│ │ ├── - [x] Task 6 +│ │ ├── - [x] Subtask 6.1 +│ │ ├── - [ ] Handle edge cases +│ │ └── # My Codeblock Test +│ ├── 📄 checkbox_test.txt (257 tokens, 33 lines) +│ │ ├── - [ ] fix phone number format +1 +│ │ ├── - [ ] add forgot password +│ │ ├── - [ ] ? 
add email verification +│ │ ├── - [ ] store token the right way +│ │ ├── - [ ] test nesting of checkboxes +│ │ ├── - [ ] user can use option to buy ticket at 2-referred price +│ │ ├── - [ ] CTA refer 2 people to get instant lower price +│ │ └── - [ ] form to send referrals +│ ├── 📄 environment.test.ts (197 tokens, 19 lines) +│ │ ├── environment: +│ │ ├── production +│ │ ├── cognitoUserPoolId +│ │ ├── cognitoAppClientId +│ │ └── apiurl +│ ├── 📄 hello_world.pyi (22 tokens, 3 lines) +│ │ ├── @final +│ │ │ class dtype(Generic[_DTypeScalar_co]) +│ │ └── names: None | tuple[builtins.str, ...] +│ ├── 📄 k8s_test.yaml (140 tokens, 37 lines) +│ │ ├── apps/v1.Deployment -> my-app +│ │ ├── v1.Service -> my-service +│ │ └── v1.ConfigMap -> my-config +│ ├── 📄 Makefile (714 tokens, 84 lines) +│ │ ├── include dotenv/dev.env +│ │ ├── .PHONY: dev +│ │ ├── dev +│ │ ├── services-down +│ │ ├── services-stop: services-down +│ │ ├── define CHECK_POSTGRES +│ │ ├── damage-report +│ │ ├── tail-logs +│ │ └── cloud +│ ├── 📄 requirements_test.txt (29 tokens, 10 lines) +│ │ ├── psycopg2-binary +│ │ ├── pytest +│ │ ├── coverage +│ │ ├── flask[async] +│ │ ├── flask_cors +│ │ ├── stripe +│ │ ├── pyjwt[crypto] +│ │ ├── cognitojwt[async] +│ │ └── flask-lambda +│ ├── 📄 rust_todo_test.rs (92 tokens, 26 lines) +│ │ ├── TODO: This todo tests parse_todo +│ │ ├── enum Color { +│ │ │ Red, +│ │ │ Blue, +│ │ │ Green, +│ │ │ } +│ │ ├── struct Point +│ │ ├── trait Drawable +│ │ ├── fn draw(&self) +│ │ ├── impl Drawable for Point +│ │ ├── fn draw(&self) +│ │ └── fn main() +│ ├── 📄 sql_test.sql (270 tokens, 51 lines) +│ ├── 📄 standard-app-routing.module.ts (100 tokens, 16 lines) +│ │ └── const routes: Routes = [ +│ │ { path: '', component: HomeComponent }, +│ │ { +│ │ path: 'heroes', +│ │ component: HeroesListComponent, +│ │ children: [ +│ │ { path: ':id', component: HeroDetailComponent }, +│ │ { path: 'new', component: HeroFormComponent }, +│ │ ], +│ │ }, +│ │ { path: '**', component: PageNotFoundComponent }, 
+│ │ ]; +│ ├── 📄 test.env (190 tokens, 25 lines) +│ │ ├── PROMO_PATH +│ │ ├── PRODUCTION +│ │ ├── SQL_SCHEMA_PATH +│ │ ├── DB_LOGS +│ │ ├── DB_LOG +│ │ ├── PGPASSWORD +│ │ ├── PGDATABASE +│ │ ├── PGHOST +│ │ ├── PGPORT +│ │ ├── PGUSER +│ │ ├── SERVER_LOG +│ │ ├── SERVER_LOGS +│ │ ├── API_URL +│ │ ├── APP_LOGS +│ │ ├── APP_LOG +│ │ ├── APP_URL +│ │ ├── COGNITO_USER_POOL_ID +│ │ ├── COGNITO_APP_CLIENT_ID +│ │ ├── AWS_REGION +│ │ └── STRIPE_SECRET_KEY +│ ├── 📄 testJsonSchema.json (421 tokens, 48 lines) +│ │ ├── $schema: http://json-schema.org/draft-07/schema# +│ │ ├── type: object +│ │ ├── title: random_test +│ │ └── description: A promoter's activites related to events +│ ├── 📄 testPackage.json (349 tokens, 43 lines) +│ │ ├── name: 'promo-app' +│ │ ├── version: 0.0.0 +│ │ ├── scripts: +│ │ ├── ng: 'ng' +│ │ ├── start: 'ng serve' +│ │ ├── build: 'ng build' +│ │ ├── watch: 'ng build --watch --configuration development' +│ │ └── test: 'ng test' +│ └── 📄 tickets.component.ts (7,160 tokens, 903 lines) +│ ├── interface EnrichedTicket extends Ticket +│ ├── interface SpinConfig +│ ├── interface RotationState +│ ├── interface SpeakInput +│ ├── const formatSpeakInput = (input: SpeakInput): string => +│ ├── function hourToSpeech(hour: number, minute: number, period: string): +│ │ string +│ ├── export class TicketsComponent implements AfterViewInit +│ ├── speak(input: SpeakInput) +│ ├── speakEvent(ticket: EnrichedTicket): void +│ ├── formatEvent(ticket: EnrichedTicket): string +│ ├── speakVenue(ticket: EnrichedTicket): void +│ ├── formatDate(date: Date, oneLiner: boolean = false): string +│ ├── formatDateForSpeech(date: Date): string +│ ├── async spinQRCode( +│ │ event: PointerEvent, +│ │ config: SpinConfig = DEFAULT_SPIN_CONFIG +│ │ ) +│ ├── private animateRotation( +│ │ imgElement: HTMLElement, +│ │ targetRotation: number, +│ │ config: SpinConfig, +│ │ cleanup: () => void +│ │ ) +│ ├── const animate = (currentTime: number) => +│ ├── requestAnimationFrame(animate) +│ ├── 
cleanup() +│ ├── requestAnimationFrame(animate) +│ ├── private getNext90Degree(currentRotation: number): number +│ ├── private getCurrentRotation(matrix: string): number +│ ├── ngAfterViewInit() +│ ├── const mouseEnterListener = () => +│ ├── const mouseLeaveListener = () => +│ ├── ngOnDestroy() +│ ├── toggleColumn(event: MatOptionSelectionChange, column: string) +│ ├── adjustColumns(event?: Event) +│ ├── onResize(event: Event) +│ ├── async ngOnInit() +│ ├── async loadTickets(): Promise +│ ├── onDateRangeChange( +│ │ type: "start" | "end", +│ │ event: MatDatepickerInputEvent +│ │ ) +│ ├── applyFilter(column: string): void +│ ├── formatDateForComparison(date: Date): string +│ ├── constructor(private renderer: Renderer2) +│ ├── onFilterChange(event: Event, column: string) +│ ├── onLatitudeChange(event: Event) +│ ├── onLongitudeChange(event: Event) +│ ├── onRadiusChange(event: Event) +│ ├── sortData(sort: Sort): void +│ ├── onRowClick(event: Event, row: any) +│ ├── function isDate(value: Date | undefined | null): value is Date +│ ├── function isNonNullNumber(value: number | null): value is number +│ ├── function hasLocation( +│ │ ticket: any +│ │ ): ticket is +│ ├── const create_faker_ticket = async () => +│ ├── function compare(a: number | string, b: number | string, isAsc: +│ │ boolean) +│ ├── function compare_dates(a: Date, b: Date, isAsc: boolean) +│ ├── async function mockMoreTickets(): Promise +│ ├── const mockTickets = async () => +│ └── const renderQRCode = async (text: String): Promise => +├── 📁 group6 (1 folder, 14 files) +│ ├── 📄 catastrophic.c (5,339 tokens, 754 lines) +│ │ ├── TODO: technically we should use a proper parser +│ │ ├── struct Point +│ │ ├── int x; +│ │ ├── int y; +│ │ ├── struct Point getOrigin() +│ │ ├── float mul_two_floats(float x1, float x2) +│ │ ├── enum days +│ │ ├── SUN, +│ │ ├── MON, +│ │ ├── TUE, +│ │ ├── WED, +│ │ ├── THU, +│ │ ├── FRI, +│ │ ├── SAT +│ │ ├── enum worker_pool_flags +│ │ ├── POOL_BH = 1 << 0, +│ │ ├── 
POOL_MANAGER_ACTIVE = 1 << 1, +│ │ ├── POOL_DISASSOCIATED = 1 << 2, +│ │ ├── POOL_BH_DRAINING = 1 << 3, +│ │ ├── enum worker_flags +│ │ ├── WORKER_DIE = 1 << 1, +│ │ ├── WORKER_IDLE = 1 << 2, +│ │ ├── WORKER_PREP = 1 << 3, +│ │ ├── WORKER_CPU_INTENSIVE = 1 << 6, +│ │ ├── WORKER_UNBOUND = 1 << 7, +│ │ ├── WORKER_REBOUND = 1 << 8, +│ │ ├── WORKER_NOT_RUNNING = WORKER_PREP | WORKER_CPU_INTENSIVE +│ │ │ | +│ │ │ WORKER_UNBOUND | WORKER_REBOUND, +│ │ ├── struct worker_pool +│ │ ├── raw_spinlock_t lock; +│ │ ├── int cpu; +│ │ ├── int node; +│ │ ├── int id; +│ │ ├── unsigned int flags; +│ │ ├── unsigned long watchdog_ts; +│ │ ├── bool cpu_stall; +│ │ ├── int nr_running; +│ │ ├── struct list_head worklist; +│ │ ├── int nr_workers; +│ │ ├── int nr_idle; +│ │ ├── struct list_head idle_list; +│ │ ├── struct timer_list idle_timer; +│ │ ├── struct work_struct idle_cull_work; +│ │ ├── struct timer_list mayday_timer; +│ │ ├── struct worker *manager; +│ │ ├── struct list_head workers; +│ │ ├── struct ida worker_ida; +│ │ ├── struct workqueue_attrs *attrs; +│ │ ├── struct hlist_node hash_node; +│ │ ├── int refcnt; +│ │ ├── struct rcu_head rcu; +│ │ ├── long add_two_longs(long x1, long x2) +│ │ ├── double multiplyByTwo(double num) +│ │ ├── char getFirstCharacter(char *str) +│ │ ├── void greet(Person p) +│ │ ├── typedef struct +│ │ ├── char name[50]; +│ │ ├── } Person; +│ │ ├── typedef struct PersonA +│ │ ├── char name[50]; +│ │ ├── } PersonB; +│ │ ├── int main() +│ │ ├── int* getArrayStart(int arr[], int size) +│ │ ├── long complexFunctionWithMultipleArguments( +│ │ │ int param1, +│ │ │ double param2, +│ │ │ char *param3, +│ │ │ struct Point point +│ │ │ ) +│ │ ├── keyPattern *ACLKeyPatternCreate(sds pattern, int flags) +│ │ ├── sds sdsCatPatternString(sds base, keyPattern *pat) +│ │ ├── static int ACLCheckChannelAgainstList(list *reference, const char +│ │ │ *channel, int channellen, int is_pattern) +│ │ ├── while((ln = listNext(&li))) +│ │ ├── static struct config +│ │ ├── 
aeEventLoop *el; +│ │ ├── cliConnInfo conn_info; +│ │ ├── const char *hostsocket; +│ │ ├── int tls; +│ │ ├── struct cliSSLconfig sslconfig; +│ │ ├── } config; +│ │ ├── class Person +│ │ ├── std::string name; +│ │ ├── public: +│ │ ├── Person(std::string n) : name(n) +│ │ ├── void greet() +│ │ ├── void globalGreet() +│ │ ├── int main() +│ │ ├── void printMessage(const std::string &message) +│ │ ├── template +│ │ │ void printVector(const std::vector& vec) +│ │ ├── struct foo +│ │ ├── char x; +│ │ ├── struct foo_in +│ │ ├── char* y; +│ │ ├── short z; +│ │ ├── } inner; +│ │ ├── struct Point +│ │ ├── int x, y; +│ │ ├── Point(int x, int y) : x(x), y(y) +│ │ ├── class Animal +│ │ ├── public: +│ │ ├── Animal(const std::string &name) : name(name) +│ │ ├── virtual void speak() const +│ │ ├── virtual ~Animal() +│ │ ├── protected: +│ │ ├── std::string name; +│ │ ├── class Dog : public Animal +│ │ ├── public: +│ │ ├── Dog(const std::string &name) : Animal(name) +│ │ ├── void speak() const override +│ │ ├── class Cat : public Animal +│ │ ├── public: +│ │ ├── Cat(const std::string &name) : Animal(name) +│ │ ├── void speak() const override +│ │ ├── class CatDog: public Animal, public Cat, public Dog +│ │ ├── public: +│ │ ├── CatDog(const std::string &name) : Animal(name) +│ │ ├── int meow_bark() +│ │ ├── nb::bytes BuildRnnDescriptor(int input_size, int hidden_size, int +│ │ │ num_layers, +│ │ │ int batch_size, int max_seq_length, +│ │ │ float dropout, +│ │ │ bool bidirectional, bool +│ │ │ cudnn_allow_tf32, +│ │ │ int workspace_size, int +│ │ │ reserve_space_size) +│ │ ├── int main() +│ │ ├── enum ECarTypes +│ │ ├── Sedan, +│ │ ├── Hatchback, +│ │ ├── SUV, +│ │ ├── Wagon +│ │ ├── ECarTypes GetPreferredCarType() +│ │ ├── enum ECarTypes : uint8_t +│ │ ├── Sedan, +│ │ ├── Hatchback, +│ │ ├── SUV = 254, +│ │ ├── Hybrid +│ │ ├── enum class ECarTypes : uint8_t +│ │ ├── Sedan, +│ │ ├── Hatchback, +│ │ ├── SUV = 254, +│ │ ├── Hybrid +│ │ ├── void myFunction(string fname, int age) +│ │ ├── 
template T cos(T) +│ │ ├── template T sin(T) +│ │ ├── template T sqrt(T) +│ │ ├── template struct VLEN +│ │ ├── template class arr +│ │ ├── private: +│ │ ├── static T *ralloc(size_t num) +│ │ ├── static void dealloc(T *ptr) +│ │ ├── static T *ralloc(size_t num) +│ │ ├── static void dealloc(T *ptr) +│ │ ├── public: +│ │ ├── arr() : p(0), sz(0) +│ │ ├── arr(size_t n) : p(ralloc(n)), sz(n) +│ │ ├── arr(arr &&other) +│ │ │ : p(other.p), sz(other.sz) +│ │ ├── ~arr() +│ │ ├── void resize(size_t n) +│ │ ├── T &operator[](size_t idx) +│ │ ├── T *data() +│ │ ├── size_t size() const +│ │ ├── class Buffer +│ │ ├── private: +│ │ ├── void* ptr_; +│ │ ├── std::tuple quantize( +│ │ │ const array& w, +│ │ │ int group_size, +│ │ │ int bits, +│ │ │ StreamOrDevice s) +│ │ ├── #define PY_SSIZE_T_CLEAN +│ │ ├── #define PLATFORM_IS_X86 +│ │ ├── #define PLATFORM_WINDOWS +│ │ ├── #define GETCPUID(a, b, c, d, a_inp, c_inp) +│ │ ├── static int GetXCR0EAX() +│ │ ├── #define GETCPUID(a, b, c, d, a_inp, c_inp) +│ │ ├── static int GetXCR0EAX() +│ │ ├── asm("XGETBV" : "=a"(eax), "=d"(edx) : "c"(0)) +│ │ ├── static void ReportMissingCpuFeature(const char* name) +│ │ ├── static PyObject *CheckCpuFeatures(PyObject *self, PyObject *args) +│ │ ├── static PyObject *CheckCpuFeatures(PyObject *self, PyObject *args) +│ │ ├── static PyMethodDef cpu_feature_guard_methods[] +│ │ ├── static struct PyModuleDef cpu_feature_guard_module +│ │ ├── #define EXPORT_SYMBOL __declspec(dllexport) +│ │ ├── #define EXPORT_SYMBOL __attribute__ ((visibility("default"))) +│ │ ├── EXPORT_SYMBOL PyMODINIT_FUNC PyInit_cpu_feature_guard(void) +│ │ ├── typedef struct +│ │ ├── GPT2Config config; +│ │ ├── ParameterTensors params; +│ │ ├── size_t param_sizes[NUM_PARAMETER_TENSORS]; +│ │ ├── float* params_memory; +│ │ ├── size_t num_parameters; +│ │ ├── ParameterTensors grads; +│ │ ├── float* grads_memory; +│ │ ├── float* m_memory; +│ │ ├── float* v_memory; +│ │ ├── ActivationTensors acts; +│ │ ├── size_t 
act_sizes[NUM_ACTIVATION_TENSORS]; +│ │ ├── float* acts_memory; +│ │ ├── size_t num_activations; +│ │ ├── ActivationTensors grads_acts; +│ │ ├── float* grads_acts_memory; +│ │ ├── int batch_size; +│ │ ├── int seq_len; +│ │ ├── int* inputs; +│ │ ├── int* targets; +│ │ ├── float mean_loss; +│ │ └── } GPT2; +│ ├── 📄 cpp_examples_impl.cc (60 tokens, 10 lines) +│ │ ├── PYBIND11_MODULE(cpp_examples, m) +│ │ └── m.def("add", &add, "An example function to add two +│ │ numbers.") +│ ├── 📄 cpp_examples_impl.cu (37 tokens, 10 lines) +│ │ ├── template +│ │ │ T add(T a, T b) +│ │ └── template <> +│ │ int add(int a, int b) +│ ├── 📄 cpp_examples_impl.h (22 tokens, 6 lines) +│ │ ├── template +│ │ │ T add(T a, T b) +│ │ └── template <> +│ │ int add(int, int) +│ ├── 📄 edge_case.hpp (426 tokens, 28 lines) +│ ├── 📄 fractal.thy (1,712 tokens, 147 lines) +│ ├── 📄 Microsoft.PowerShell_profile.ps1 (3,346 tokens, 497 lines) +│ ├── 📄 python_complex_class.py (10 tokens, 2 lines) +│ │ └── class Box(Space[NDArray[Any]]) +│ ├── 📄 ramda__cloneRegExp.js (173 tokens, 9 lines) +│ │ └── export default function _cloneRegExp(pattern) +│ ├── 📄 ramda_prop.js (646 tokens, 85 lines) +│ │ ├── /** +│ │ │ * Returns a function that when supplied an object returns the +│ │ │ indicated +│ │ │ * property of that object, if it exists. +│ │ │ * @category Object +│ │ │ * @typedefn Idx = String | Int | Symbol +│ │ │ * @sig Idx -> {s: a} -> a | Undefined +│ │ │ * @param {String|Number} p The property name or array index +│ │ │ * @param {Object} obj The object to query +│ │ │ * @return {*} The value at `obj.p`. +│ │ │ */ +│ │ │ var prop = _curry2(function prop(p, obj) +│ │ ├── /** +│ │ │ * Solves equations of the form a * x = b +│ │ │ * @param {{ +│ │ │ * z: number +│ │ │ * }} x +│ │ │ */ +│ │ │ function foo(x) +│ │ ├── /** +│ │ │ * Deconstructs an array field from the input documents to output a +│ │ │ document for each element. 
+│ │ │ * Each output document is the input document with the value of the +│ │ │ array field replaced by the element. +│ │ │ * @category Object +│ │ │ * @sig String -> {k: [v]} -> [{k: v}] +│ │ │ * @param {String} key The key to determine which property of the +│ │ │ object should be unwound. +│ │ │ * @param {Object} object The object containing the list to unwind +│ │ │ at the property named by the key. +│ │ │ * @return {List} A list of new objects, each having the given key +│ │ │ associated to an item from the unwound list. +│ │ │ */ +│ │ │ var unwind = _curry2(function(key, object) +│ │ └── return _map(function(item) +│ ├── 📄 tensorflow_flags.h (7,628 tokens, 668 lines) +│ │ ├── #define TENSORFLOW_CORE_CONFIG_FLAG_DEFS_H_ +│ │ ├── class Flags +│ │ ├── public: +│ │ ├── bool SetterForXlaAutoJitFlag(const string& value) +│ │ ├── bool SetterForXlaCallModuleDisabledChecks(const string& value) +│ │ ├── void AppendMarkForCompilationPassFlagsInternal(std::vector* +│ │ │ flag_list) +│ │ ├── void AllocateAndParseJitRtFlags() +│ │ ├── void AllocateAndParseFlags() +│ │ ├── void ResetFlags() +│ │ ├── bool SetXlaAutoJitFlagFromFlagString(const string& value) +│ │ ├── BuildXlaOpsPassFlags* GetBuildXlaOpsPassFlags() +│ │ ├── MarkForCompilationPassFlags* GetMarkForCompilationPassFlags() +│ │ ├── XlaSparseCoreFlags* GetXlaSparseCoreFlags() +│ │ ├── XlaDeviceFlags* GetXlaDeviceFlags() +│ │ ├── XlaOpsCommonFlags* GetXlaOpsCommonFlags() +│ │ ├── XlaCallModuleFlags* GetXlaCallModuleFlags() +│ │ ├── MlirCommonFlags* GetMlirCommonFlags() +│ │ ├── void ResetJitCompilerFlags() +│ │ ├── const JitRtFlags& GetJitRtFlags() +│ │ ├── ConfigProto::Experimental::MlirBridgeRollout +│ │ │ GetMlirBridgeRolloutState( +│ │ │ std::optional config_proto) +│ │ ├── void AppendMarkForCompilationPassFlags(std::vector* flag_list) +│ │ ├── void DisableXlaCompilation() +│ │ ├── void EnableXlaCompilation() +│ │ ├── bool FailOnXlaCompilation() +│ │ ├── #define TF_PY_DECLARE_FLAG(flag_name) +│ │ └── 
PYBIND11_MODULE(flags_pybind, m) +│ ├── 📄 test.f (181 tokens, 30 lines) +│ ├── 📄 torch.rst (60 tokens, 8 lines) +│ │ ├── # libtorch (C++-only) +│ │ └── - Building libtorch using Python +│ └── 📄 yc.html (9,063 tokens, 169 lines) +├── 📁 group7 (1 folder, 5 files) +│ ├── 📄 absurdly_huge.jsonl (8,347 tokens, 126 lines) +│ │ ├── SMILES: str +│ │ ├── Yield: float +│ │ ├── Temperature: int +│ │ ├── Pressure: float +│ │ ├── Solvent: str +│ │ ├── Success: bool +│ │ ├── Reaction_Conditions: dict +│ │ ├── Products: list +│ │ └── EdgeCasesMissed: None +│ ├── 📄 angular_crud.ts (1,192 tokens, 148 lines) +│ │ ├── interface DBCommand +│ │ ├── export class IndexedDbService +│ │ ├── constructor() +│ │ ├── async create_connection({ db_name = 'client_db', table_name }: +│ │ │ DBCommand) +│ │ ├── upgrade(db) +│ │ ├── async create_model({ db_name, table_name, model }: DBCommand) +│ │ ├── verify_matching({ table_name, model }) +│ │ ├── async read_key({ db_name, table_name, key }: DBCommand) +│ │ ├── async update_model({ db_name, table_name, model }: DBCommand) +│ │ ├── verify_matching({ table_name, model }) +│ │ ├── async delete_key({ db_name, table_name, key }: DBCommand) +│ │ ├── async list_table({ +│ │ │ db_name, +│ │ │ table_name, +│ │ │ where, +│ │ │ }: DBCommand & { where?: { [key: string]: string | number } }) +│ │ └── async search_table(criteria: SearchCriteria) +│ ├── 📄 structure.py (400 tokens, 92 lines) +│ │ ├── @runtime_checkable +│ │ │ class DataClass(Protocol) +│ │ ├── __dataclass_fields__: dict +│ │ ├── class MyInteger(Enum) +│ │ ├── ONE = 1 +│ │ ├── TWO = 2 +│ │ ├── THREE = 42 +│ │ ├── class MyString(Enum) +│ │ ├── AAA1 = "aaa" +│ │ ├── BB_B = """edge +│ │ │ case""" +│ │ ├── @dataclass(frozen=True, slots=True, kw_only=True) +│ │ │ class Tool +│ │ ├── name: str +│ │ ├── description: str +│ │ ├── input_model: DataClass +│ │ ├── output_model: DataClass +│ │ ├── def execute(self, *args, **kwargs) +│ │ ├── @property +│ │ │ def edge_case(self) -> str +│ │ ├── def 
should_still_see_me(self, x: bool = True) -> "Tool" +│ │ ├── @dataclass +│ │ │ class MyInput[T] +│ │ ├── name: str +│ │ ├── rank: MyInteger +│ │ ├── serial_n: int +│ │ ├── @dataclass +│ │ │ class Thingy +│ │ ├── is_edge_case: bool +│ │ ├── @dataclass +│ │ │ class MyOutput +│ │ ├── orders: str +│ │ ├── class MyTools(Enum) +│ │ ├── TOOL_A = Tool( +│ │ │ name="complicated", +│ │ │ description="edge case!", +│ │ │ input_model=MyInput[Thingy], +│ │ │ output_model=MyOutput, +│ │ │ ) +│ │ ├── TOOL_B = Tool( +│ │ │ name="""super +│ │ │ complicated +│ │ │ """, +│ │ │ description="edge case!", +│ │ │ input_model=MyInput, +│ │ │ output_model=MyOutput, +│ │ │ ) +│ │ ├── @final +│ │ │ class dtype(Generic[_DTypeScalar_co]) +│ │ └── names: None | tuple[builtins.str, ...] +│ ├── 📄 test.wgsl (528 tokens, 87 lines) +│ └── 📄 test.metal (272 tokens, 34 lines) +├── 📁 group_lisp (1 folder, 4 files) +│ ├── 📄 clojure_test.clj (682 tokens, 85 lines) +│ ├── 📄 LispTest.lisp (25 tokens, 6 lines) +│ ├── 📄 racket_struct.rkt (14 tokens, 1 line) +│ └── 📄 test_scheme.scm (360 tokens, 44 lines) +└── 📁 group_todo (1 folder, 12 files) + ├── 📄 AAPLShaders.metal (5,780 tokens, 566 lines) + ├── 📄 crystal_test.cr (48 tokens, 15 lines) + ├── 📄 dart_test.dart (108 tokens, 24 lines) + ├── 📄 elixir_test.exs (39 tokens, 10 lines) + ├── 📄 forward.frag (739 tokens, 87 lines) + ├── 📄 forward.vert (359 tokens, 48 lines) + ├── 📄 nodemon.json (118 tokens, 20 lines) + ├── 📄 sas_test.sas (97 tokens, 22 lines) + ├── 📄 test_setup_py.test (133 tokens, 24 lines) + ├── 📄 testTypings.d.ts (158 tokens, 23 lines) + ├── 📄 vba_test.bas (67 tokens, 16 lines) + └── 📄 wgsl_test.wgsl (94 tokens, 17 lines) diff --git a/tests/golden/legacy/trees_v1/more_languages_group1.txt b/tests/golden/legacy/trees_v1/more_languages_group1.txt new file mode 100644 index 0000000..1d32827 --- /dev/null +++ b/tests/golden/legacy/trees_v1/more_languages_group1.txt @@ -0,0 +1,146 @@ +📁 group1 (1 folder, 11 files) +├── 📄 addamt.cobol (441 tokens, 40 
lines) +├── 📄 CUSTOMER-INVOICE.CBL (412 tokens, 60 lines) +├── 📄 JavaTest.java (578 tokens, 86 lines) +├── 📄 JuliaTest.jl (381 tokens, 63 lines) +├── 📄 KotlinTest.kt (974 tokens, 171 lines) +├── 📄 lesson.cbl (635 tokens, 78 lines) +├── 📄 LuaTest.lua (83 tokens, 16 lines) +├── 📄 ObjectiveCTest.m (62 tokens, 16 lines) +├── 📄 OcamlTest.ml (49 tokens, 12 lines) +├── 📄 test.js (757 tokens, 154 lines) +│ ├── class MyClass +│ ├── myMethod() +│ ├── async asyncMethod(a, b) +│ ├── methodWithDefaultParameters(a = 5, b = 10) +│ ├── multilineMethod( +│ │ c, +│ │ d +│ │ ) +│ ├── multilineMethodWithDefaults( +│ │ t = "tree", +│ │ p = "plus" +│ │ ) +│ ├── function myFunction(param1, param2) +│ ├── function multilineFunction( +│ │ param1, +│ │ param2 +│ │ ) +│ ├── const arrowFunction = () => +│ ├── const parametricArrow = (a, b) => +│ ├── function () +│ ├── function outerFunction(outerParam) +│ ├── function innerFunction(innerParam) +│ ├── innerFunction("inner") +│ ├── const myObject = { +│ ├── myMethod: function (stuff) +│ ├── let myArrowObject = { +│ ├── myArrow: ({ +│ │ a, +│ │ b, +│ │ c, +│ │ }) => +│ ├── const myAsyncArrowFunction = async () => +│ ├── function functionWithRestParameters(...args) +│ ├── const namedFunctionExpression = function myNamedFunction() +│ ├── const multilineArrowFunction = ( +│ │ a, +│ │ b +│ │ ) => +│ ├── function functionReturningFunction() +│ ├── return function () +│ ├── function destructuringOnMultipleLines({ +│ │ a, +│ │ b, +│ │ }) +│ ├── const arrowFunctionWithDestructuring = ({ a, b }) => +│ ├── const multilineDestructuringArrow = ({ +│ │ a, +│ │ b, +│ │ }) => +│ ├── async function asyncFunctionWithErrorHandling() +│ ├── class Car +│ ├── constructor(brand) +│ ├── present() +│ ├── class Model extends Car +│ ├── constructor(brand, mod) +│ ├── super(brand) +│ └── show() +└── 📄 test.ts (832 tokens, 165 lines) + ├── type MyType + ├── interface MyInterface + ├── class TsClass + ├── myMethod() + ├── myMethodWithArgs(param1: string, param2: number): 
void + ├── static myStaticMethod(param: T): T + ├── multilineMethod( + │ c: number, + │ d: number + │ ): number + ├── multilineMethodWithDefaults( + │ t: string = "tree", + │ p: string = "plus" + │ ): string + ├── export class AdvancedComponent implements MyInterface + ├── async myAsyncMethod( + │ a: string, + │ b: number, + │ c: string + │ ): Promise + ├── genericMethod( + │ arg1: T, + │ arg2: U + │ ): [T, U] + ├── export class TicketsComponent implements MyInterface + ├── async myAsyncMethod({ a, b, c }: { a: String; b: Number; c: String }) + ├── function tsFunction() + ├── function tsFunctionSigned( + │ param1: number, + │ param2: number + │ ): void + ├── export default async function tsFunctionComplicated({ + │ a = 1 | 2, + │ b = "bob", + │ c = async () => "charlie", + │ }: { + │ a: number; + │ b: string; + │ c: () => Promise; + │ }): Promise + ├── const tsArrowFunctionSigned = ({ + │ a, + │ b, + │ }: { + │ a: number; + │ b: string; + │ }) => + ├── export const tsComplicatedArrow = async ({ + │ a = 1 | 2, + │ b = "bob", + │ c = async () => "charlie", + │ }: { + │ a: number; + │ b: string; + │ c: () => Promise; + │ }): Promise => + ├── const arrowFunction = () => + ├── const arrow = (a: String, b: Number) => + ├── const asyncArrowFunction = async () => + ├── const asyncArrow = async (a: String, b: Number) => + ├── let weirdArrow = () => + ├── const asyncPromiseArrow = async (): Promise => + ├── let myWeirdArrowSigned = (x: number): number => + ├── class Person + ├── constructor(private firstName: string, private lastName: string) + ├── getFullName(): string + ├── describe(): string + ├── class Employee extends Person + ├── constructor( + │ firstName: string, + │ lastName: string, + │ private jobTitle: string + │ ) + ├── super(firstName, lastName) + ├── describe(): string + ├── interface Shape + └── interface Square extends Shape diff --git a/tests/golden/legacy/trees_v1/more_languages_group2.txt b/tests/golden/legacy/trees_v1/more_languages_group2.txt new file 
mode 100644 index 0000000..854ea5d --- /dev/null +++ b/tests/golden/legacy/trees_v1/more_languages_group2.txt @@ -0,0 +1,54 @@ +📁 group2 (1 folder, 8 files) +├── 📄 apl_test.apl (28 tokens, 5 lines) +├── 📄 c_test.c (837 tokens, 142 lines) +│ ├── struct Point +│ ├── int x; +│ ├── int y; +│ ├── struct Point getOrigin() +│ ├── float mul_two_floats(float x1, float x2) +│ ├── enum days +│ ├── SUN, +│ ├── MON, +│ ├── TUE, +│ ├── WED, +│ ├── THU, +│ ├── FRI, +│ ├── SAT +│ ├── long add_two_longs(long x1, long x2) +│ ├── double multiplyByTwo(double num) +│ ├── char getFirstCharacter(char *str) +│ ├── void greet(Person p) +│ ├── typedef struct +│ ├── char name[50]; +│ ├── } Person; +│ ├── int main() +│ ├── int* getArrayStart(int arr[], int size) +│ ├── long complexFunctionWithMultipleArguments( +│ │ int param1, +│ │ double param2, +│ │ char *param3, +│ │ struct Point point +│ │ ) +│ ├── keyPattern *ACLKeyPatternCreate(sds pattern, int flags) +│ ├── sds sdsCatPatternString(sds base, keyPattern *pat) +│ ├── static int ACLCheckChannelAgainstList(list *reference, const char +│ │ *channel, int channellen, int is_pattern) +│ ├── while((ln = listNext(&li))) +│ ├── static struct config +│ ├── aeEventLoop *el; +│ ├── cliConnInfo conn_info; +│ ├── const char *hostsocket; +│ ├── int tls; +│ ├── struct cliSSLconfig sslconfig; +│ └── } config; +├── 📄 go_test.go (179 tokens, 46 lines) +├── 📄 PerlTest.pl (63 tokens, 20 lines) +├── 📄 PhpTest.php (70 tokens, 19 lines) +├── 📄 PowershellTest.ps1 (459 tokens, 89 lines) +├── 📄 ScalaTest.scala (171 tokens, 40 lines) +└── 📄 test.csv (0 tokens, 0 lines) + ├── Name + ├── Age + ├── Country + ├── City + └── Email diff --git a/tests/golden/legacy/trees_v1/more_languages_group3.txt b/tests/golden/legacy/trees_v1/more_languages_group3.txt new file mode 100644 index 0000000..1b4769e --- /dev/null +++ b/tests/golden/legacy/trees_v1/more_languages_group3.txt @@ -0,0 +1,149 @@ +📁 group3 (1 folder, 16 files) +├── 📄 bash_test.sh (127 tokens, 22 lines) +├── 📄 
cpp_test.cpp (1,670 tokens, 259 lines) +│ ├── class Person +│ ├── std::string name; +│ ├── public: +│ ├── Person(std::string n) : name(n) +│ ├── void greet() +│ ├── void globalGreet() +│ ├── int main() +│ ├── void printMessage(const std::string &message) +│ ├── template +│ │ void printVector(const std::vector& vec) +│ ├── struct Point +│ ├── int x, y; +│ ├── Point(int x, int y) : x(x), y(y) +│ ├── class Animal +│ ├── public: +│ ├── Animal(const std::string &name) : name(name) +│ ├── virtual void speak() const +│ ├── virtual ~Animal() +│ ├── protected: +│ ├── std::string name; +│ ├── class Dog : public Animal +│ ├── public: +│ ├── Dog(const std::string &name) : Animal(name) +│ ├── void speak() const override +│ ├── class Cat : public Animal +│ ├── public: +│ ├── Cat(const std::string &name) : Animal(name) +│ ├── void speak() const override +│ ├── nb::bytes BuildRnnDescriptor(int input_size, int hidden_size, int +│ │ num_layers, +│ │ int batch_size, int max_seq_length, float +│ │ dropout, +│ │ bool bidirectional, bool cudnn_allow_tf32, +│ │ int workspace_size, int reserve_space_size) +│ ├── int main() +│ ├── enum ECarTypes +│ ├── Sedan, +│ ├── Hatchback, +│ ├── SUV, +│ ├── Wagon +│ ├── ECarTypes GetPreferredCarType() +│ ├── enum ECarTypes : uint8_t +│ ├── Sedan, +│ ├── Hatchback, +│ ├── SUV = 254, +│ ├── Hybrid +│ ├── enum class ECarTypes : uint8_t +│ ├── Sedan, +│ ├── Hatchback, +│ ├── SUV = 254, +│ ├── Hybrid +│ ├── void myFunction(string fname, int age) +│ ├── template T cos(T) +│ ├── template T sin(T) +│ ├── template T sqrt(T) +│ ├── template struct VLEN +│ ├── template class arr +│ ├── private: +│ ├── static T *ralloc(size_t num) +│ ├── static void dealloc(T *ptr) +│ ├── static T *ralloc(size_t num) +│ ├── static void dealloc(T *ptr) +│ ├── public: +│ ├── arr() : p(0), sz(0) +│ ├── arr(size_t n) : p(ralloc(n)), sz(n) +│ ├── arr(arr &&other) +│ │ : p(other.p), sz(other.sz) +│ ├── ~arr() +│ ├── void resize(size_t n) +│ ├── T &operator[](size_t idx) +│ ├── T 
*data() +│ ├── size_t size() const +│ ├── class Buffer +│ ├── private: +│ ├── void* ptr_; +│ └── std::tuple quantize( +│ const array& w, +│ int group_size, +│ int bits, +│ StreamOrDevice s) +├── 📄 csharp_test.cs (957 tokens, 146 lines) +├── 📄 hallucination.tex (1,633 tokens, 126 lines) +├── 📄 ruby_test.rb (138 tokens, 37 lines) +├── 📄 swift_test.swift (469 tokens, 110 lines) +├── 📄 test.lean (289 tokens, 42 lines) +├── 📄 test.capnp (117 tokens, 30 lines) +├── 📄 test.graphql (66 tokens, 21 lines) +├── 📄 test.proto (142 tokens, 34 lines) +├── 📄 test.sqlite (0 tokens, 0 lines) +│ ├── students table: +│ ├── id integer primary key +│ ├── name text not null +│ ├── age integer not null +│ ├── courses table: +│ ├── id integer primary key +│ ├── title text not null +│ └── credits integer not null +├── 📄 test_Cargo.toml (119 tokens, 18 lines) +│ ├── name: test_cargo +│ ├── version: 0.1.0 +│ ├── description: A test Cargo.toml +│ ├── license: MIT OR Apache-2.0 +│ ├── dependencies: +│ ├── clap 4.4 +│ └── sqlx 0.7 (features: runtime-tokio, tls-rustls) +├── 📄 test_json_rpc_2_0.json (26 tokens, 6 lines) +│ ├── jsonrpc: 2.0 +│ ├── method: subtract +│ ├── params: +│ ├── minuend: 42 +│ ├── subtrahend: 23 +│ └── id: 1 +├── 📄 test_openapi.yaml (753 tokens, 92 lines) +│ ├── openapi: 3.0.1 +│ ├── title: TODO Plugin +│ ├── description: A plugin to create and manage TODO lists using ChatGPT. 
+│ ├── version: v1 +│ ├── servers: +│ ├── - url: PLUGIN_HOSTNAME +│ ├── paths: +│ ├── '/todos/{username}': +│ ├── GET (getTodos): Get the list of todos +│ ├── POST (addTodo): Add a todo to the list +│ └── DELETE (deleteTodo): Delete a todo from the list +├── 📄 test_openrpc.json (225 tokens, 44 lines) +│ ├── openrpc: 1.2.1 +│ ├── info: +│ ├── title: Demo Petstore +│ ├── version: 1.0.0 +│ ├── methods: +│ ├── listPets: List all pets +│ ├── params: +│ ├── - limit: integer +│ └── result: pets = An array of pets +└── 📄 test_pyproject.toml (304 tokens, 39 lines) + ├── name: tree_plus + ├── version: 1.0.8 + ├── description: A `tree` util enhanced with tokens, lines, and components. + ├── License :: OSI Approved :: Apache Software License + ├── License :: OSI Approved :: MIT License + ├── dependencies: + ├── tiktoken + ├── PyYAML + ├── click + ├── rich + └── tomli diff --git a/tests/golden/legacy/trees_v1/more_languages_group4.txt b/tests/golden/legacy/trees_v1/more_languages_group4.txt new file mode 100644 index 0000000..4a4641f --- /dev/null +++ b/tests/golden/legacy/trees_v1/more_languages_group4.txt @@ -0,0 +1,123 @@ +📁 group4 (1 folder, 10 files) +├── 📄 erl_test.erl (480 tokens, 68 lines) +├── 📄 haskell_test.hs (414 tokens, 41 lines) +├── 📄 mathematica_test.nb (133 tokens, 21 lines) +├── 📄 matlab_test.m (48 tokens, 12 lines) +├── 📄 RTest.R (367 tokens, 46 lines) +├── 📄 rust_test.rs (1,368 tokens, 259 lines) +│ ├── fn at_beginning<'a>(&'a str) +│ ├── pub enum Days { +│ │ #\[default] +│ │ Sun, +│ │ Mon, +│ │ #\[error("edge case {idx}, expected at least {} and at most {}", +│ │ .limits.lo, .limits.hi)] +│ │ Tue, +│ │ Wed, +│ │ Thu(i16, bool), +│ │ Fri { day: u8 }, +│ │ Sat { +│ │ urday: String, +│ │ edge_case: E, +│ │ }, +│ │ } +│ ├── struct Point +│ ├── impl Point +│ ├── fn get_origin() -> Point +│ ├── struct Person +│ ├── impl Person +│ ├── fn greet(&self) +│ ├── fn add_two_longs(x1: i64, x2: i64) -> i64 +│ ├── fn add_two_longs_longer( +│ │ x1: i64, +│ │ x2: i64, +│ │ ) 
-> i64 +│ ├── const fn multiply_by_two(num: f64) -> f64 +│ ├── fn get_first_character(s: &str) -> Option +│ ├── trait Drawable +│ ├── fn draw(&self) +│ ├── impl Drawable for Point +│ ├── fn draw(&self) +│ ├── fn with_generic(d: D) +│ ├── fn with_generic(d: D) +│ │ where +│ │ D: Drawable +│ ├── fn main() +│ ├── pub struct VisibleStruct +│ ├── mod my_module +│ ├── pub struct AlsoVisibleStruct(T, T) +│ ├── macro_rules! say_hello +│ ├── #[macro_export] +│ │ macro_rules! hello_tree_plus +│ ├── pub mod lib +│ ├── pub mod interfaces +│ ├── mod engine +│ ├── pub fn flow( +│ │ source: S1, +│ │ extractor: E, +│ │ inbox: S2, +│ │ transformer: T, +│ │ outbox: S3, +│ │ loader: L, +│ │ sink: &mut S4, +│ │ ) -> Result<(), Box> +│ │ where +│ │ S1: Extractable, +│ │ S2: Extractable + Loadable, +│ │ S3: Extractable + Loadable, +│ │ S4: Loadable, +│ │ E: Extractor, +│ │ T: Transformer, +│ │ L: Loader +│ ├── trait Container +│ ├── fn items(&self) -> impl Iterator +│ ├── trait HttpService +│ ├── async fn fetch(&self, url: Url) -> HtmlBody +│ ├── struct Pair +│ ├── trait Transformer +│ ├── fn transform(&self, input: T) -> T +│ ├── impl + Copy> Transformer for Pair +│ ├── fn transform(&self, input: T) -> T +│ ├── fn main() +│ ├── async fn handle_get(State(pool): State) -> Result, +│ │ (StatusCode, String)> +│ │ where +│ │ Bion: Cool +│ ├── #[macro_export] +│ │ macro_rules! 
unit +│ ├── fn insert( +│ │ &mut self, +│ │ key: (), +│ │ value: $unit_dtype, +│ │ ) -> Result, ETLError> +│ ├── pub async fn handle_get_axum_route( +│ │ Session { maybe_claims }: Session, +│ │ Path(RouteParams { +│ │ alpha, +│ │ bravo, +│ │ charlie, +│ │ edge_case +│ │ }): Path, +│ │ ) -> ServerResult +│ ├── fn encode_pipeline(cmds: &[Cmd], atomic: bool) -> Vec +│ ├── pub async fn handle_post_yeet( +│ │ State(auth_backend): State, +│ │ Session { maybe_claims }: Session, +│ │ Form(yeet_form): Form, +│ │ ) -> Result +│ └── pub async fn handle_get_thingy( +│ session: Session, +│ State(ApiBackend { +│ page_cache, +│ auth_backend, +│ library_sql, +│ some_data_cache, +│ metadata_cache, +│ thingy_client, +│ .. +│ }): State, +│ ) -> ServerResult +├── 📄 test.zig (397 tokens, 60 lines) +├── 📄 test_fsharp.fs (92 tokens, 27 lines) +├── 📄 test_tcl_tk.tcl (54 tokens, 16 lines) +└── 📄 tf_test.tf (202 tokens, 38 lines) diff --git a/tests/golden/legacy/trees_v1/more_languages_group5.txt b/tests/golden/legacy/trees_v1/more_languages_group5.txt new file mode 100644 index 0000000..47b0a19 --- /dev/null +++ b/tests/golden/legacy/trees_v1/more_languages_group5.txt @@ -0,0 +1,235 @@ +📁 group5 (1 folder, 19 files) +├── 📄 ansible_test.yml (55 tokens, 14 lines) +│ ├── Install package +│ ├── Start service +│ └── Create user +├── 📄 app-routing.module.ts (287 tokens, 28 lines) +│ ├── const routes: Routes = [ +│ │ { path: '', redirectTo: 'login', pathMatch: 'full' }, +│ │ { path: '*', redirectTo: 'login' }, +│ │ { path: 'home', component: HomeComponent }, +│ │ { path: 'login', component: LoginComponent }, +│ │ { path: 'register', component: RegisterComponent }, +│ │ { path: 'events', component: EventsComponent }, +│ │ { path: 'invites', component: InvitesComponent }, +│ │ { path: 'rewards', component: RewardsComponent }, +│ │ { path: 'profile', component: ProfileComponent }, +│ │ ]; +│ └── export class AppRoutingModule +├── 📄 app.component.spec.ts (410 tokens, 47 lines) +│ ├── describe 
'AppComponent' +│ ├── it should create the app +│ ├── it should welcome the user +│ ├── it should welcome 'Jimbo' +│ └── it should request login if not logged in +├── 📄 app.component.ts (271 tokens, 45 lines) +│ ├── export class AppComponent +│ ├── constructor( +│ │ private http: HttpClient, +│ │ private loginService: LoginService, +│ │ private stripeService: StripeService +│ │ ) +│ ├── constructor(private loginService: LoginService) +│ ├── checkSession() +│ ├── async goToEvent(event_id: string) +│ └── valInvitedBy(event: any, event_id: string) +├── 📄 app.module.ts (374 tokens, 43 lines) +│ ├── @NgModule({ +│ │ declarations: [ +│ │ AppComponent, +│ │ HomeComponent, +│ │ LoginComponent, +│ │ RegisterComponent, +│ │ EventsComponent, +│ │ InvitesComponent, +│ │ RewardsComponent, +│ │ ProfileComponent +│ └── export class AppModule +├── 📄 checkbox_test.md (191 tokens, 29 lines) +│ ├── # My Checkbox Test +│ ├── ## My No Parens Test +│ ├── ## My Empty href Test +│ ├── ## My other url Test [Q&A] +│ ├── ## My other other url Test [Q&A] +│ ├── ## My 2nd other url Test [Q&A] +│ ├── ## My 3rd other url Test [Q&A] +│ ├── - [ ] Task 1 +│ ├── - [ ] No Space Task 1.1 +│ ├── - [ ] Two Spaces Task 1.2 +│ ├── - [ ] Subtask 1.2.1 +│ ├── - [ ] Task 2 +│ ├── - [x] Task 3 +│ ├── - [ ] Subtask 3.1 +│ ├── - [x] Task 6 +│ ├── - [x] Subtask 6.1 +│ ├── - [ ] Handle edge cases +│ └── # My Codeblock Test +├── 📄 checkbox_test.txt (257 tokens, 33 lines) +│ ├── - [ ] fix phone number format +1 +│ ├── - [ ] add forgot password +│ ├── - [ ] ? 
add email verification +│ ├── - [ ] store token the right way +│ ├── - [ ] test nesting of checkboxes +│ ├── - [ ] user can use option to buy ticket at 2-referred price +│ ├── - [ ] CTA refer 2 people to get instant lower price +│ └── - [ ] form to send referrals +├── 📄 environment.test.ts (197 tokens, 19 lines) +│ ├── environment: +│ ├── production +│ ├── cognitoUserPoolId +│ ├── cognitoAppClientId +│ └── apiurl +├── 📄 hello_world.pyi (22 tokens, 3 lines) +│ ├── @final +│ │ class dtype(Generic[_DTypeScalar_co]) +│ └── names: None | tuple[builtins.str, ...] +├── 📄 k8s_test.yaml (140 tokens, 37 lines) +│ ├── apps/v1.Deployment -> my-app +│ ├── v1.Service -> my-service +│ └── v1.ConfigMap -> my-config +├── 📄 Makefile (714 tokens, 84 lines) +│ ├── include dotenv/dev.env +│ ├── .PHONY: dev +│ ├── dev +│ ├── services-down +│ ├── services-stop: services-down +│ ├── define CHECK_POSTGRES +│ ├── damage-report +│ ├── tail-logs +│ └── cloud +├── 📄 requirements_test.txt (29 tokens, 10 lines) +│ ├── psycopg2-binary +│ ├── pytest +│ ├── coverage +│ ├── flask[async] +│ ├── flask_cors +│ ├── stripe +│ ├── pyjwt[crypto] +│ ├── cognitojwt[async] +│ └── flask-lambda +├── 📄 rust_todo_test.rs (92 tokens, 26 lines) +│ ├── TODO: This todo tests parse_todo +│ ├── enum Color { +│ │ Red, +│ │ Blue, +│ │ Green, +│ │ } +│ ├── struct Point +│ ├── trait Drawable +│ ├── fn draw(&self) +│ ├── impl Drawable for Point +│ ├── fn draw(&self) +│ └── fn main() +├── 📄 sql_test.sql (270 tokens, 51 lines) +├── 📄 standard-app-routing.module.ts (100 tokens, 16 lines) +│ └── const routes: Routes = [ +│ { path: '', component: HomeComponent }, +│ { +│ path: 'heroes', +│ component: HeroesListComponent, +│ children: [ +│ { path: ':id', component: HeroDetailComponent }, +│ { path: 'new', component: HeroFormComponent }, +│ ], +│ }, +│ { path: '**', component: PageNotFoundComponent }, +│ ]; +├── 📄 test.env (190 tokens, 25 lines) +│ ├── PROMO_PATH +│ ├── PRODUCTION +│ ├── SQL_SCHEMA_PATH +│ ├── DB_LOGS +│ ├── 
DB_LOG +│ ├── PGPASSWORD +│ ├── PGDATABASE +│ ├── PGHOST +│ ├── PGPORT +│ ├── PGUSER +│ ├── SERVER_LOG +│ ├── SERVER_LOGS +│ ├── API_URL +│ ├── APP_LOGS +│ ├── APP_LOG +│ ├── APP_URL +│ ├── COGNITO_USER_POOL_ID +│ ├── COGNITO_APP_CLIENT_ID +│ ├── AWS_REGION +│ └── STRIPE_SECRET_KEY +├── 📄 testJsonSchema.json (421 tokens, 48 lines) +│ ├── $schema: http://json-schema.org/draft-07/schema# +│ ├── type: object +│ ├── title: random_test +│ └── description: A promoter's activites related to events +├── 📄 testPackage.json (349 tokens, 43 lines) +│ ├── name: 'promo-app' +│ ├── version: 0.0.0 +│ ├── scripts: +│ ├── ng: 'ng' +│ ├── start: 'ng serve' +│ ├── build: 'ng build' +│ ├── watch: 'ng build --watch --configuration development' +│ └── test: 'ng test' +└── 📄 tickets.component.ts (7,160 tokens, 903 lines) + ├── interface EnrichedTicket extends Ticket + ├── interface SpinConfig + ├── interface RotationState + ├── interface SpeakInput + ├── const formatSpeakInput = (input: SpeakInput): string => + ├── function hourToSpeech(hour: number, minute: number, period: string): + │ string + ├── export class TicketsComponent implements AfterViewInit + ├── speak(input: SpeakInput) + ├── speakEvent(ticket: EnrichedTicket): void + ├── formatEvent(ticket: EnrichedTicket): string + ├── speakVenue(ticket: EnrichedTicket): void + ├── formatDate(date: Date, oneLiner: boolean = false): string + ├── formatDateForSpeech(date: Date): string + ├── async spinQRCode( + │ event: PointerEvent, + │ config: SpinConfig = DEFAULT_SPIN_CONFIG + │ ) + ├── private animateRotation( + │ imgElement: HTMLElement, + │ targetRotation: number, + │ config: SpinConfig, + │ cleanup: () => void + │ ) + ├── const animate = (currentTime: number) => + ├── requestAnimationFrame(animate) + ├── cleanup() + ├── requestAnimationFrame(animate) + ├── private getNext90Degree(currentRotation: number): number + ├── private getCurrentRotation(matrix: string): number + ├── ngAfterViewInit() + ├── const mouseEnterListener = () => + 
├── const mouseLeaveListener = () => + ├── ngOnDestroy() + ├── toggleColumn(event: MatOptionSelectionChange, column: string) + ├── adjustColumns(event?: Event) + ├── onResize(event: Event) + ├── async ngOnInit() + ├── async loadTickets(): Promise + ├── onDateRangeChange( + │ type: "start" | "end", + │ event: MatDatepickerInputEvent + │ ) + ├── applyFilter(column: string): void + ├── formatDateForComparison(date: Date): string + ├── constructor(private renderer: Renderer2) + ├── onFilterChange(event: Event, column: string) + ├── onLatitudeChange(event: Event) + ├── onLongitudeChange(event: Event) + ├── onRadiusChange(event: Event) + ├── sortData(sort: Sort): void + ├── onRowClick(event: Event, row: any) + ├── function isDate(value: Date | undefined | null): value is Date + ├── function isNonNullNumber(value: number | null): value is number + ├── function hasLocation( + │ ticket: any + │ ): ticket is + ├── const create_faker_ticket = async () => + ├── function compare(a: number | string, b: number | string, isAsc: boolean) + ├── function compare_dates(a: Date, b: Date, isAsc: boolean) + ├── async function mockMoreTickets(): Promise + ├── const mockTickets = async () => + └── const renderQRCode = async (text: String): Promise => diff --git a/tests/golden/legacy/trees_v1/more_languages_group6.txt b/tests/golden/legacy/trees_v1/more_languages_group6.txt new file mode 100644 index 0000000..6aa597b --- /dev/null +++ b/tests/golden/legacy/trees_v1/more_languages_group6.txt @@ -0,0 +1,300 @@ +📁 group6 (1 folder, 14 files) +├── 📄 catastrophic.c (5,339 tokens, 754 lines) +│ ├── TODO: technically we should use a proper parser +│ ├── struct Point +│ ├── int x; +│ ├── int y; +│ ├── struct Point getOrigin() +│ ├── float mul_two_floats(float x1, float x2) +│ ├── enum days +│ ├── SUN, +│ ├── MON, +│ ├── TUE, +│ ├── WED, +│ ├── THU, +│ ├── FRI, +│ ├── SAT +│ ├── enum worker_pool_flags +│ ├── POOL_BH = 1 << 0, +│ ├── POOL_MANAGER_ACTIVE = 1 << 1, +│ ├── POOL_DISASSOCIATED = 1 << 2, 
+│ ├── POOL_BH_DRAINING = 1 << 3, +│ ├── enum worker_flags +│ ├── WORKER_DIE = 1 << 1, +│ ├── WORKER_IDLE = 1 << 2, +│ ├── WORKER_PREP = 1 << 3, +│ ├── WORKER_CPU_INTENSIVE = 1 << 6, +│ ├── WORKER_UNBOUND = 1 << 7, +│ ├── WORKER_REBOUND = 1 << 8, +│ ├── WORKER_NOT_RUNNING = WORKER_PREP | WORKER_CPU_INTENSIVE | +│ │ WORKER_UNBOUND | WORKER_REBOUND, +│ ├── struct worker_pool +│ ├── raw_spinlock_t lock; +│ ├── int cpu; +│ ├── int node; +│ ├── int id; +│ ├── unsigned int flags; +│ ├── unsigned long watchdog_ts; +│ ├── bool cpu_stall; +│ ├── int nr_running; +│ ├── struct list_head worklist; +│ ├── int nr_workers; +│ ├── int nr_idle; +│ ├── struct list_head idle_list; +│ ├── struct timer_list idle_timer; +│ ├── struct work_struct idle_cull_work; +│ ├── struct timer_list mayday_timer; +│ ├── struct worker *manager; +│ ├── struct list_head workers; +│ ├── struct ida worker_ida; +│ ├── struct workqueue_attrs *attrs; +│ ├── struct hlist_node hash_node; +│ ├── int refcnt; +│ ├── struct rcu_head rcu; +│ ├── long add_two_longs(long x1, long x2) +│ ├── double multiplyByTwo(double num) +│ ├── char getFirstCharacter(char *str) +│ ├── void greet(Person p) +│ ├── typedef struct +│ ├── char name[50]; +│ ├── } Person; +│ ├── typedef struct PersonA +│ ├── char name[50]; +│ ├── } PersonB; +│ ├── int main() +│ ├── int* getArrayStart(int arr[], int size) +│ ├── long complexFunctionWithMultipleArguments( +│ │ int param1, +│ │ double param2, +│ │ char *param3, +│ │ struct Point point +│ │ ) +│ ├── keyPattern *ACLKeyPatternCreate(sds pattern, int flags) +│ ├── sds sdsCatPatternString(sds base, keyPattern *pat) +│ ├── static int ACLCheckChannelAgainstList(list *reference, const char +│ │ *channel, int channellen, int is_pattern) +│ ├── while((ln = listNext(&li))) +│ ├── static struct config +│ ├── aeEventLoop *el; +│ ├── cliConnInfo conn_info; +│ ├── const char *hostsocket; +│ ├── int tls; +│ ├── struct cliSSLconfig sslconfig; +│ ├── } config; +│ ├── class Person +│ ├── std::string name; +│ 
├── public: +│ ├── Person(std::string n) : name(n) +│ ├── void greet() +│ ├── void globalGreet() +│ ├── int main() +│ ├── void printMessage(const std::string &message) +│ ├── template +│ │ void printVector(const std::vector& vec) +│ ├── struct foo +│ ├── char x; +│ ├── struct foo_in +│ ├── char* y; +│ ├── short z; +│ ├── } inner; +│ ├── struct Point +│ ├── int x, y; +│ ├── Point(int x, int y) : x(x), y(y) +│ ├── class Animal +│ ├── public: +│ ├── Animal(const std::string &name) : name(name) +│ ├── virtual void speak() const +│ ├── virtual ~Animal() +│ ├── protected: +│ ├── std::string name; +│ ├── class Dog : public Animal +│ ├── public: +│ ├── Dog(const std::string &name) : Animal(name) +│ ├── void speak() const override +│ ├── class Cat : public Animal +│ ├── public: +│ ├── Cat(const std::string &name) : Animal(name) +│ ├── void speak() const override +│ ├── class CatDog: public Animal, public Cat, public Dog +│ ├── public: +│ ├── CatDog(const std::string &name) : Animal(name) +│ ├── int meow_bark() +│ ├── nb::bytes BuildRnnDescriptor(int input_size, int hidden_size, int +│ │ num_layers, +│ │ int batch_size, int max_seq_length, float +│ │ dropout, +│ │ bool bidirectional, bool cudnn_allow_tf32, +│ │ int workspace_size, int reserve_space_size) +│ ├── int main() +│ ├── enum ECarTypes +│ ├── Sedan, +│ ├── Hatchback, +│ ├── SUV, +│ ├── Wagon +│ ├── ECarTypes GetPreferredCarType() +│ ├── enum ECarTypes : uint8_t +│ ├── Sedan, +│ ├── Hatchback, +│ ├── SUV = 254, +│ ├── Hybrid +│ ├── enum class ECarTypes : uint8_t +│ ├── Sedan, +│ ├── Hatchback, +│ ├── SUV = 254, +│ ├── Hybrid +│ ├── void myFunction(string fname, int age) +│ ├── template T cos(T) +│ ├── template T sin(T) +│ ├── template T sqrt(T) +│ ├── template struct VLEN +│ ├── template class arr +│ ├── private: +│ ├── static T *ralloc(size_t num) +│ ├── static void dealloc(T *ptr) +│ ├── static T *ralloc(size_t num) +│ ├── static void dealloc(T *ptr) +│ ├── public: +│ ├── arr() : p(0), sz(0) +│ ├── arr(size_t n) : 
p(ralloc(n)), sz(n) +│ ├── arr(arr &&other) +│ │ : p(other.p), sz(other.sz) +│ ├── ~arr() +│ ├── void resize(size_t n) +│ ├── T &operator[](size_t idx) +│ ├── T *data() +│ ├── size_t size() const +│ ├── class Buffer +│ ├── private: +│ ├── void* ptr_; +│ ├── std::tuple quantize( +│ │ const array& w, +│ │ int group_size, +│ │ int bits, +│ │ StreamOrDevice s) +│ ├── #define PY_SSIZE_T_CLEAN +│ ├── #define PLATFORM_IS_X86 +│ ├── #define PLATFORM_WINDOWS +│ ├── #define GETCPUID(a, b, c, d, a_inp, c_inp) +│ ├── static int GetXCR0EAX() +│ ├── #define GETCPUID(a, b, c, d, a_inp, c_inp) +│ ├── static int GetXCR0EAX() +│ ├── asm("XGETBV" : "=a"(eax), "=d"(edx) : "c"(0)) +│ ├── static void ReportMissingCpuFeature(const char* name) +│ ├── static PyObject *CheckCpuFeatures(PyObject *self, PyObject *args) +│ ├── static PyObject *CheckCpuFeatures(PyObject *self, PyObject *args) +│ ├── static PyMethodDef cpu_feature_guard_methods[] +│ ├── static struct PyModuleDef cpu_feature_guard_module +│ ├── #define EXPORT_SYMBOL __declspec(dllexport) +│ ├── #define EXPORT_SYMBOL __attribute__ ((visibility("default"))) +│ ├── EXPORT_SYMBOL PyMODINIT_FUNC PyInit_cpu_feature_guard(void) +│ ├── typedef struct +│ ├── GPT2Config config; +│ ├── ParameterTensors params; +│ ├── size_t param_sizes[NUM_PARAMETER_TENSORS]; +│ ├── float* params_memory; +│ ├── size_t num_parameters; +│ ├── ParameterTensors grads; +│ ├── float* grads_memory; +│ ├── float* m_memory; +│ ├── float* v_memory; +│ ├── ActivationTensors acts; +│ ├── size_t act_sizes[NUM_ACTIVATION_TENSORS]; +│ ├── float* acts_memory; +│ ├── size_t num_activations; +│ ├── ActivationTensors grads_acts; +│ ├── float* grads_acts_memory; +│ ├── int batch_size; +│ ├── int seq_len; +│ ├── int* inputs; +│ ├── int* targets; +│ ├── float mean_loss; +│ └── } GPT2; +├── 📄 cpp_examples_impl.cc (60 tokens, 10 lines) +│ ├── PYBIND11_MODULE(cpp_examples, m) +│ └── m.def("add", &add, "An example function to add two numbers.") +├── 📄 cpp_examples_impl.cu (37 
tokens, 10 lines) +│ ├── template +│ │ T add(T a, T b) +│ └── template <> +│ int add(int a, int b) +├── 📄 cpp_examples_impl.h (22 tokens, 6 lines) +│ ├── template +│ │ T add(T a, T b) +│ └── template <> +│ int add(int, int) +├── 📄 edge_case.hpp (426 tokens, 28 lines) +├── 📄 fractal.thy (1,712 tokens, 147 lines) +├── 📄 Microsoft.PowerShell_profile.ps1 (3,346 tokens, 497 lines) +├── 📄 python_complex_class.py (10 tokens, 2 lines) +│ └── class Box(Space[NDArray[Any]]) +├── 📄 ramda__cloneRegExp.js (173 tokens, 9 lines) +│ └── export default function _cloneRegExp(pattern) +├── 📄 ramda_prop.js (646 tokens, 85 lines) +│ ├── /** +│ │ * Returns a function that when supplied an object returns the indicated +│ │ * property of that object, if it exists. +│ │ * @category Object +│ │ * @typedefn Idx = String | Int | Symbol +│ │ * @sig Idx -> {s: a} -> a | Undefined +│ │ * @param {String|Number} p The property name or array index +│ │ * @param {Object} obj The object to query +│ │ * @return {*} The value at `obj.p`. +│ │ */ +│ │ var prop = _curry2(function prop(p, obj) +│ ├── /** +│ │ * Solves equations of the form a * x = b +│ │ * @param {{ +│ │ * z: number +│ │ * }} x +│ │ */ +│ │ function foo(x) +│ ├── /** +│ │ * Deconstructs an array field from the input documents to output a +│ │ document for each element. +│ │ * Each output document is the input document with the value of the +│ │ array field replaced by the element. +│ │ * @category Object +│ │ * @sig String -> {k: [v]} -> [{k: v}] +│ │ * @param {String} key The key to determine which property of the object +│ │ should be unwound. +│ │ * @param {Object} object The object containing the list to unwind at +│ │ the property named by the key. +│ │ * @return {List} A list of new objects, each having the given key +│ │ associated to an item from the unwound list. 
+│ │ */ +│ │ var unwind = _curry2(function(key, object) +│ └── return _map(function(item) +├── 📄 tensorflow_flags.h (7,628 tokens, 668 lines) +│ ├── #define TENSORFLOW_CORE_CONFIG_FLAG_DEFS_H_ +│ ├── class Flags +│ ├── public: +│ ├── bool SetterForXlaAutoJitFlag(const string& value) +│ ├── bool SetterForXlaCallModuleDisabledChecks(const string& value) +│ ├── void AppendMarkForCompilationPassFlagsInternal(std::vector* +│ │ flag_list) +│ ├── void AllocateAndParseJitRtFlags() +│ ├── void AllocateAndParseFlags() +│ ├── void ResetFlags() +│ ├── bool SetXlaAutoJitFlagFromFlagString(const string& value) +│ ├── BuildXlaOpsPassFlags* GetBuildXlaOpsPassFlags() +│ ├── MarkForCompilationPassFlags* GetMarkForCompilationPassFlags() +│ ├── XlaSparseCoreFlags* GetXlaSparseCoreFlags() +│ ├── XlaDeviceFlags* GetXlaDeviceFlags() +│ ├── XlaOpsCommonFlags* GetXlaOpsCommonFlags() +│ ├── XlaCallModuleFlags* GetXlaCallModuleFlags() +│ ├── MlirCommonFlags* GetMlirCommonFlags() +│ ├── void ResetJitCompilerFlags() +│ ├── const JitRtFlags& GetJitRtFlags() +│ ├── ConfigProto::Experimental::MlirBridgeRollout GetMlirBridgeRolloutState( +│ │ std::optional config_proto) +│ ├── void AppendMarkForCompilationPassFlags(std::vector* flag_list) +│ ├── void DisableXlaCompilation() +│ ├── void EnableXlaCompilation() +│ ├── bool FailOnXlaCompilation() +│ ├── #define TF_PY_DECLARE_FLAG(flag_name) +│ └── PYBIND11_MODULE(flags_pybind, m) +├── 📄 test.f (181 tokens, 30 lines) +├── 📄 torch.rst (60 tokens, 8 lines) +│ ├── # libtorch (C++-only) +│ └── - Building libtorch using Python +└── 📄 yc.html (9,063 tokens, 169 lines) diff --git a/tests/golden/legacy/trees_v1/more_languages_group7.txt b/tests/golden/legacy/trees_v1/more_languages_group7.txt new file mode 100644 index 0000000..2f1fcc3 --- /dev/null +++ b/tests/golden/legacy/trees_v1/more_languages_group7.txt @@ -0,0 +1,83 @@ +📁 group7 (1 folder, 5 files) +├── 📄 absurdly_huge.jsonl (8,347 tokens, 126 lines) +│ ├── SMILES: str +│ ├── Yield: float +│ ├── 
Temperature: int +│ ├── Pressure: float +│ ├── Solvent: str +│ ├── Success: bool +│ ├── Reaction_Conditions: dict +│ ├── Products: list +│ └── EdgeCasesMissed: None +├── 📄 angular_crud.ts (1,192 tokens, 148 lines) +│ ├── interface DBCommand +│ ├── export class IndexedDbService +│ ├── constructor() +│ ├── async create_connection({ db_name = 'client_db', table_name }: +│ │ DBCommand) +│ ├── upgrade(db) +│ ├── async create_model({ db_name, table_name, model }: DBCommand) +│ ├── verify_matching({ table_name, model }) +│ ├── async read_key({ db_name, table_name, key }: DBCommand) +│ ├── async update_model({ db_name, table_name, model }: DBCommand) +│ ├── verify_matching({ table_name, model }) +│ ├── async delete_key({ db_name, table_name, key }: DBCommand) +│ ├── async list_table({ +│ │ db_name, +│ │ table_name, +│ │ where, +│ │ }: DBCommand & { where?: { [key: string]: string | number } }) +│ └── async search_table(criteria: SearchCriteria) +├── 📄 structure.py (400 tokens, 92 lines) +│ ├── @runtime_checkable +│ │ class DataClass(Protocol) +│ ├── __dataclass_fields__: dict +│ ├── class MyInteger(Enum) +│ ├── ONE = 1 +│ ├── TWO = 2 +│ ├── THREE = 42 +│ ├── class MyString(Enum) +│ ├── AAA1 = "aaa" +│ ├── BB_B = """edge +│ │ case""" +│ ├── @dataclass(frozen=True, slots=True, kw_only=True) +│ │ class Tool +│ ├── name: str +│ ├── description: str +│ ├── input_model: DataClass +│ ├── output_model: DataClass +│ ├── def execute(self, *args, **kwargs) +│ ├── @property +│ │ def edge_case(self) -> str +│ ├── def should_still_see_me(self, x: bool = True) -> "Tool" +│ ├── @dataclass +│ │ class MyInput[T] +│ ├── name: str +│ ├── rank: MyInteger +│ ├── serial_n: int +│ ├── @dataclass +│ │ class Thingy +│ ├── is_edge_case: bool +│ ├── @dataclass +│ │ class MyOutput +│ ├── orders: str +│ ├── class MyTools(Enum) +│ ├── TOOL_A = Tool( +│ │ name="complicated", +│ │ description="edge case!", +│ │ input_model=MyInput[Thingy], +│ │ output_model=MyOutput, +│ │ ) +│ ├── TOOL_B = Tool( +│ │ 
name="""super +│ │ complicated +│ │ """, +│ │ description="edge case!", +│ │ input_model=MyInput, +│ │ output_model=MyOutput, +│ │ ) +│ ├── @final +│ │ class dtype(Generic[_DTypeScalar_co]) +│ └── names: None | tuple[builtins.str, ...] +├── 📄 test.wgsl (528 tokens, 87 lines) +└── 📄 test.metal (272 tokens, 34 lines) diff --git a/tests/golden/legacy/trees_v1/more_languages_group_lisp.txt b/tests/golden/legacy/trees_v1/more_languages_group_lisp.txt new file mode 100644 index 0000000..fefeb36 --- /dev/null +++ b/tests/golden/legacy/trees_v1/more_languages_group_lisp.txt @@ -0,0 +1,5 @@ +📁 group_lisp (1 folder, 4 files) +├── 📄 clojure_test.clj (682 tokens, 85 lines) +├── 📄 LispTest.lisp (25 tokens, 6 lines) +├── 📄 racket_struct.rkt (14 tokens, 1 line) +└── 📄 test_scheme.scm (360 tokens, 44 lines) diff --git a/tests/golden/legacy/trees_v1/more_languages_group_todo.txt b/tests/golden/legacy/trees_v1/more_languages_group_todo.txt new file mode 100644 index 0000000..d13b512 --- /dev/null +++ b/tests/golden/legacy/trees_v1/more_languages_group_todo.txt @@ -0,0 +1,13 @@ +📁 group_todo (1 folder, 12 files) +├── 📄 AAPLShaders.metal (5,780 tokens, 566 lines) +├── 📄 crystal_test.cr (48 tokens, 15 lines) +├── 📄 dart_test.dart (108 tokens, 24 lines) +├── 📄 elixir_test.exs (39 tokens, 10 lines) +├── 📄 forward.frag (739 tokens, 87 lines) +├── 📄 forward.vert (359 tokens, 48 lines) +├── 📄 nodemon.json (118 tokens, 20 lines) +├── 📄 sas_test.sas (97 tokens, 22 lines) +├── 📄 test_setup_py.test (133 tokens, 24 lines) +├── 📄 testTypings.d.ts (158 tokens, 23 lines) +├── 📄 vba_test.bas (67 tokens, 16 lines) +└── 📄 wgsl_test.wgsl (94 tokens, 17 lines) diff --git a/tests/golden/legacy/trees_v1/multi_seed.txt b/tests/golden/legacy/trees_v1/multi_seed.txt new file mode 100644 index 0000000..c23221a --- /dev/null +++ b/tests/golden/legacy/trees_v1/multi_seed.txt @@ -0,0 +1,199 @@ +🌵 Root (2 folders, 17 files) +├── 📁 group1 (1 folder, 11 files) +│ ├── 📄 addamt.cobol (441 tokens, 40 lines) +│ ├── 📄 
CUSTOMER-INVOICE.CBL (412 tokens, 60 lines) +│ ├── 📄 JavaTest.java (578 tokens, 86 lines) +│ ├── 📄 JuliaTest.jl (381 tokens, 63 lines) +│ ├── 📄 KotlinTest.kt (974 tokens, 171 lines) +│ ├── 📄 lesson.cbl (635 tokens, 78 lines) +│ ├── 📄 LuaTest.lua (83 tokens, 16 lines) +│ ├── 📄 ObjectiveCTest.m (62 tokens, 16 lines) +│ ├── 📄 OcamlTest.ml (49 tokens, 12 lines) +│ ├── 📄 test.js (757 tokens, 154 lines) +│ │ ├── class MyClass +│ │ ├── myMethod() +│ │ ├── async asyncMethod(a, b) +│ │ ├── methodWithDefaultParameters(a = 5, b = 10) +│ │ ├── multilineMethod( +│ │ │ c, +│ │ │ d +│ │ │ ) +│ │ ├── multilineMethodWithDefaults( +│ │ │ t = "tree", +│ │ │ p = "plus" +│ │ │ ) +│ │ ├── function myFunction(param1, param2) +│ │ ├── function multilineFunction( +│ │ │ param1, +│ │ │ param2 +│ │ │ ) +│ │ ├── const arrowFunction = () => +│ │ ├── const parametricArrow = (a, b) => +│ │ ├── function () +│ │ ├── function outerFunction(outerParam) +│ │ ├── function innerFunction(innerParam) +│ │ ├── innerFunction("inner") +│ │ ├── const myObject = { +│ │ ├── myMethod: function (stuff) +│ │ ├── let myArrowObject = { +│ │ ├── myArrow: ({ +│ │ │ a, +│ │ │ b, +│ │ │ c, +│ │ │ }) => +│ │ ├── const myAsyncArrowFunction = async () => +│ │ ├── function functionWithRestParameters(...args) +│ │ ├── const namedFunctionExpression = function myNamedFunction() +│ │ ├── const multilineArrowFunction = ( +│ │ │ a, +│ │ │ b +│ │ │ ) => +│ │ ├── function functionReturningFunction() +│ │ ├── return function () +│ │ ├── function destructuringOnMultipleLines({ +│ │ │ a, +│ │ │ b, +│ │ │ }) +│ │ ├── const arrowFunctionWithDestructuring = ({ a, b }) => +│ │ ├── const multilineDestructuringArrow = ({ +│ │ │ a, +│ │ │ b, +│ │ │ }) => +│ │ ├── async function asyncFunctionWithErrorHandling() +│ │ ├── class Car +│ │ ├── constructor(brand) +│ │ ├── present() +│ │ ├── class Model extends Car +│ │ ├── constructor(brand, mod) +│ │ ├── super(brand) +│ │ └── show() +│ └── 📄 test.ts (832 tokens, 165 lines) +│ ├── type MyType +│ 
├── interface MyInterface +│ ├── class TsClass +│ ├── myMethod() +│ ├── myMethodWithArgs(param1: string, param2: number): void +│ ├── static myStaticMethod(param: T): T +│ ├── multilineMethod( +│ │ c: number, +│ │ d: number +│ │ ): number +│ ├── multilineMethodWithDefaults( +│ │ t: string = "tree", +│ │ p: string = "plus" +│ │ ): string +│ ├── export class AdvancedComponent implements MyInterface +│ ├── async myAsyncMethod( +│ │ a: string, +│ │ b: number, +│ │ c: string +│ │ ): Promise +│ ├── genericMethod( +│ │ arg1: T, +│ │ arg2: U +│ │ ): [T, U] +│ ├── export class TicketsComponent implements MyInterface +│ ├── async myAsyncMethod({ a, b, c }: { a: String; b: Number; c: String +│ │ }) +│ ├── function tsFunction() +│ ├── function tsFunctionSigned( +│ │ param1: number, +│ │ param2: number +│ │ ): void +│ ├── export default async function tsFunctionComplicated({ +│ │ a = 1 | 2, +│ │ b = "bob", +│ │ c = async () => "charlie", +│ │ }: { +│ │ a: number; +│ │ b: string; +│ │ c: () => Promise; +│ │ }): Promise +│ ├── const tsArrowFunctionSigned = ({ +│ │ a, +│ │ b, +│ │ }: { +│ │ a: number; +│ │ b: string; +│ │ }) => +│ ├── export const tsComplicatedArrow = async ({ +│ │ a = 1 | 2, +│ │ b = "bob", +│ │ c = async () => "charlie", +│ │ }: { +│ │ a: number; +│ │ b: string; +│ │ c: () => Promise; +│ │ }): Promise => +│ ├── const arrowFunction = () => +│ ├── const arrow = (a: String, b: Number) => +│ ├── const asyncArrowFunction = async () => +│ ├── const asyncArrow = async (a: String, b: Number) => +│ ├── let weirdArrow = () => +│ ├── const asyncPromiseArrow = async (): Promise => +│ ├── let myWeirdArrowSigned = (x: number): number => +│ ├── class Person +│ ├── constructor(private firstName: string, private lastName: string) +│ ├── getFullName(): string +│ ├── describe(): string +│ ├── class Employee extends Person +│ ├── constructor( +│ │ firstName: string, +│ │ lastName: string, +│ │ private jobTitle: string +│ │ ) +│ ├── super(firstName, lastName) +│ ├── describe(): 
string +│ ├── interface Shape +│ └── interface Square extends Shape +└── 📁 path_to_test (1 folder, 6 files) + ├── 📄 class_method_type.py (525 tokens, 101 lines) + │ ├── T = TypeVar("T") + │ ├── def parse_py(contents: str) -> List[str] + │ ├── class MyClass + │ ├── @staticmethod + │ │ def physical_element_aval(dtype) -> core.ShapedArray + │ ├── def my_method(self) + │ ├── @staticmethod + │ │ def my_typed_method(obj: dict) -> int + │ ├── def my_multiline_signature_method( + │ │ self, + │ │ alice: str = None, + │ │ bob: int = None, + │ │ ) -> tuple + │ ├── @lru_cache(maxsize=None) + │ │ def my_multiline_signature_function( + │ │ tree: tuple = (), + │ │ plus: str = "+", + │ │ ) -> tuple + │ ├── class LogLevelEnum(str, Enum) + │ ├── CRITICAL = "CRITICAL" + │ ├── GREETING = "GREETING" + │ ├── WARNING = "WARNING" + │ ├── ERROR = "ERROR" + │ ├── DEBUG = "DEBUG" + │ ├── INFO = "INFO" + │ ├── OFF = "OFF" + │ ├── class Thingy(BaseModel) + │ ├── metric: float + │ ├── @dataclass + │ │ class TestDataclass + │ ├── tree: str + │ ├── A = TypeVar("A", str, bytes) + │ ├── def omega_yikes(file: str, expected: List[str]) -> bool + │ ├── def ice[T](args: Iterable[T] = ()) + │ ├── class list[T] + │ ├── def __getitem__(self, index: int, /) -> T + │ ├── @classmethod + │ │ def from_code(cls, toolbox, code: bytes, score=None) -> "Thingy" + │ ├── @classmethod + │ │ def from_str(cls, toolbox, string: str, score=None) -> "Thingy" + │ └── class Router(hk.Module) + ├── 📄 empty.py (0 tokens, 0 lines) + ├── 📄 file.md (11 tokens, 2 lines) + │ └── # Hello, world! 
+ ├── 📄 file.py (18 tokens, 3 lines) + │ └── def hello_world() + ├── 📄 file.txt (10 tokens, 2 lines) + └── 📄 version.py (13 tokens, 2 lines) + └── __version__ = "1.2.3" diff --git a/tests/golden/legacy/trees_v1/path_to_test.txt b/tests/golden/legacy/trees_v1/path_to_test.txt new file mode 100644 index 0000000..c1f5bc7 --- /dev/null +++ b/tests/golden/legacy/trees_v1/path_to_test.txt @@ -0,0 +1,51 @@ +📁 path_to_test (1 folder, 6 files) +├── 📄 class_method_type.py (525 tokens, 101 lines) +│ ├── T = TypeVar("T") +│ ├── def parse_py(contents: str) -> List[str] +│ ├── class MyClass +│ ├── @staticmethod +│ │ def physical_element_aval(dtype) -> core.ShapedArray +│ ├── def my_method(self) +│ ├── @staticmethod +│ │ def my_typed_method(obj: dict) -> int +│ ├── def my_multiline_signature_method( +│ │ self, +│ │ alice: str = None, +│ │ bob: int = None, +│ │ ) -> tuple +│ ├── @lru_cache(maxsize=None) +│ │ def my_multiline_signature_function( +│ │ tree: tuple = (), +│ │ plus: str = "+", +│ │ ) -> tuple +│ ├── class LogLevelEnum(str, Enum) +│ ├── CRITICAL = "CRITICAL" +│ ├── GREETING = "GREETING" +│ ├── WARNING = "WARNING" +│ ├── ERROR = "ERROR" +│ ├── DEBUG = "DEBUG" +│ ├── INFO = "INFO" +│ ├── OFF = "OFF" +│ ├── class Thingy(BaseModel) +│ ├── metric: float +│ ├── @dataclass +│ │ class TestDataclass +│ ├── tree: str +│ ├── A = TypeVar("A", str, bytes) +│ ├── def omega_yikes(file: str, expected: List[str]) -> bool +│ ├── def ice[T](args: Iterable[T] = ()) +│ ├── class list[T] +│ ├── def __getitem__(self, index: int, /) -> T +│ ├── @classmethod +│ │ def from_code(cls, toolbox, code: bytes, score=None) -> "Thingy" +│ ├── @classmethod +│ │ def from_str(cls, toolbox, string: str, score=None) -> "Thingy" +│ └── class Router(hk.Module) +├── 📄 empty.py (0 tokens, 0 lines) +├── 📄 file.md (11 tokens, 2 lines) +│ └── # Hello, world! 
+├── 📄 file.py (18 tokens, 3 lines) +│ └── def hello_world() +├── 📄 file.txt (10 tokens, 2 lines) +└── 📄 version.py (13 tokens, 2 lines) + └── __version__ = "1.2.3" From 5b41a12366d2f07b372394e148c2349518adc5bd Mon Sep 17 00:00:00 2001 From: Bion Howard Date: Tue, 9 Jun 2026 15:48:03 -0400 Subject: [PATCH 3/8] feat: Rust port of tree_plus (workspace: core + cli) tree_plus_core: model, deterministic rich-compatible renderer, natural sort (natsort os_sorted parity), fnmatch-compatible ignores, amortized globs, wc-parity counting, and component extraction: - tree-sitter extractors for Rust, Python, JS/TS, C/C++ with formatters that reproduce the legacy output (golden-parity tested, including catastrophic.c) and salvage ERROR regions on invalid syntax - regex extractors (LazyLock, linear-time engine only) for markers, Markdown/RST/txt, .env, requirements, Makefile/Justfile, Angular - native parsers for JSON/JSONL/YAML/TOML/CSV and SQLite (feature) tree_plus_cli: clap binary as tree_plus + tprs alias (PATH collision with the Python entry point), legacy flags, footer parity, real terminal-width detection. Parallel traversal with rayon behind a deterministic sort; arbitrary bytes never panic (robustness suite); criterion benches included. 55-85x faster than the Python implementation end to end. 
Co-Authored-By: Claude Fable 5 --- .gitignore | 4 +- Cargo.lock | 1096 +++++++++++++++++ Cargo.toml | 29 + crates/tree_plus_cli/Cargo.toml | 21 + crates/tree_plus_cli/src/main.rs | 173 +++ crates/tree_plus_cli/tests/cli.rs | 92 ++ crates/tree_plus_core/Cargo.toml | 36 + .../tree_plus_core/benches/tree_plus_bench.rs | 78 ++ crates/tree_plus_core/examples/dump_ast.rs | 55 + crates/tree_plus_core/examples/extract.rs | 16 + crates/tree_plus_core/src/config.rs | 39 + crates/tree_plus_core/src/count.rs | 203 +++ crates/tree_plus_core/src/extract/data.rs | 582 +++++++++ crates/tree_plus_core/src/extract/markdown.rs | 180 +++ crates/tree_plus_core/src/extract/markers.rs | 60 + crates/tree_plus_core/src/extract/mod.rs | 277 +++++ crates/tree_plus_core/src/extract/simple.rs | 216 ++++ .../src/extract/treesitter/c_cpp.rs | 591 +++++++++ .../src/extract/treesitter/mod.rs | 66 + .../src/extract/treesitter/python.rs | 346 ++++++ .../src/extract/treesitter/rust.rs | 312 +++++ .../src/extract/treesitter/typescript.rs | 420 +++++++ crates/tree_plus_core/src/ignore.rs | 307 +++++ crates/tree_plus_core/src/lib.rs | 29 + crates/tree_plus_core/src/model.rs | 125 ++ crates/tree_plus_core/src/render.rs | 347 ++++++ crates/tree_plus_core/src/sort.rs | 214 ++++ crates/tree_plus_core/src/walk.rs | 260 ++++ crates/tree_plus_core/tests/golden_parity.rs | 243 ++++ crates/tree_plus_core/tests/robustness.rs | 86 ++ 30 files changed, 6502 insertions(+), 1 deletion(-) create mode 100644 Cargo.lock create mode 100644 Cargo.toml create mode 100644 crates/tree_plus_cli/Cargo.toml create mode 100644 crates/tree_plus_cli/src/main.rs create mode 100644 crates/tree_plus_cli/tests/cli.rs create mode 100644 crates/tree_plus_core/Cargo.toml create mode 100644 crates/tree_plus_core/benches/tree_plus_bench.rs create mode 100644 crates/tree_plus_core/examples/dump_ast.rs create mode 100644 crates/tree_plus_core/examples/extract.rs create mode 100644 crates/tree_plus_core/src/config.rs create mode 100644 
crates/tree_plus_core/src/count.rs create mode 100644 crates/tree_plus_core/src/extract/data.rs create mode 100644 crates/tree_plus_core/src/extract/markdown.rs create mode 100644 crates/tree_plus_core/src/extract/markers.rs create mode 100644 crates/tree_plus_core/src/extract/mod.rs create mode 100644 crates/tree_plus_core/src/extract/simple.rs create mode 100644 crates/tree_plus_core/src/extract/treesitter/c_cpp.rs create mode 100644 crates/tree_plus_core/src/extract/treesitter/mod.rs create mode 100644 crates/tree_plus_core/src/extract/treesitter/python.rs create mode 100644 crates/tree_plus_core/src/extract/treesitter/rust.rs create mode 100644 crates/tree_plus_core/src/extract/treesitter/typescript.rs create mode 100644 crates/tree_plus_core/src/ignore.rs create mode 100644 crates/tree_plus_core/src/lib.rs create mode 100644 crates/tree_plus_core/src/model.rs create mode 100644 crates/tree_plus_core/src/render.rs create mode 100644 crates/tree_plus_core/src/sort.rs create mode 100644 crates/tree_plus_core/src/walk.rs create mode 100644 crates/tree_plus_core/tests/golden_parity.rs create mode 100644 crates/tree_plus_core/tests/robustness.rs diff --git a/.gitignore b/.gitignore index 8ce7999..775e954 100644 --- a/.gitignore +++ b/.gitignore @@ -55,4 +55,6 @@ tests/version_increments/renamed_dry_run_version.py .mcp_server.pid # using uv sometimes we get a build folder -build/ \ No newline at end of file +build/ +# Rust workspace +/target/ diff --git a/Cargo.lock b/Cargo.lock new file mode 100644 index 0000000..1510d6f --- /dev/null +++ b/Cargo.lock @@ -0,0 +1,1096 @@ +# This file is automatically @generated by Cargo. +# It is not intended for manual editing. 
+version = 4 + +[[package]] +name = "ahash" +version = "0.8.12" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "5a15f179cd60c4584b8a8c596927aadc462e27f2ca70c04e0071964a73ba7a75" +dependencies = [ + "cfg-if", + "once_cell", + "version_check", + "zerocopy", +] + +[[package]] +name = "aho-corasick" +version = "1.1.4" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "ddd31a130427c27518df266943a5308ed92d4b226cc639f5a8f1002816174301" +dependencies = [ + "memchr", +] + +[[package]] +name = "anes" +version = "0.1.6" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "4b46cbb362ab8752921c97e041f5e366ee6297bd428a31275b9fcf1e380f7299" + +[[package]] +name = "anstream" +version = "1.0.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "824a212faf96e9acacdbd09febd34438f8f711fb84e09a8916013cd7815ca28d" +dependencies = [ + "anstyle", + "anstyle-parse", + "anstyle-query", + "anstyle-wincon", + "colorchoice", + "is_terminal_polyfill", + "utf8parse", +] + +[[package]] +name = "anstyle" +version = "1.0.14" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "940b3a0ca603d1eade50a4846a2afffd5ef57a9feac2c0e2ec2e14f9ead76000" + +[[package]] +name = "anstyle-parse" +version = "1.0.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "52ce7f38b242319f7cabaa6813055467063ecdc9d355bbb4ce0c68908cd8130e" +dependencies = [ + "utf8parse", +] + +[[package]] +name = "anstyle-query" +version = "1.1.5" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "40c48f72fd53cd289104fc64099abca73db4166ad86ea0b4341abe65af83dadc" +dependencies = [ + "windows-sys", +] + +[[package]] +name = "anstyle-wincon" +version = "3.0.11" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "291e6a250ff86cd4a820112fb8898808a366d8f9f58ce16d1f538353ad55747d" +dependencies = [ + "anstyle", + 
"once_cell_polyfill", + "windows-sys", +] + +[[package]] +name = "autocfg" +version = "1.5.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "f2032f911046de80f0a198e0901378627c33f59ea0ac00e363d481118bd70a53" + +[[package]] +name = "bitflags" +version = "2.13.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "b4388bee8683e3d04af747c73422af53102d2bd24d9eadb6cbc100baef4b43f8" + +[[package]] +name = "bumpalo" +version = "3.20.3" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "72f5acc6cb2ba439de613abc23857ec3d78374d8ed5ac84e9d11336e87da8649" + +[[package]] +name = "cast" +version = "0.3.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "37b2a672a2cb129a2e41c10b1224bb368f9f37a2b16b612598138befd7b37eb5" + +[[package]] +name = "cc" +version = "1.2.63" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "556e016178bb5662a08681bbe0f00f8e17631781a4dfc8c45e466e4b185ec27f" +dependencies = [ + "find-msvc-tools", + "shlex", +] + +[[package]] +name = "cfg-if" +version = "1.0.4" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9330f8b2ff13f34540b44e946ef35111825727b38d33286ef986142615121801" + +[[package]] +name = "ciborium" +version = "0.2.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "42e69ffd6f0917f5c029256a24d0161db17cea3997d185db0d35926308770f0e" +dependencies = [ + "ciborium-io", + "ciborium-ll", + "serde", +] + +[[package]] +name = "ciborium-io" +version = "0.2.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "05afea1e0a06c9be33d539b876f1ce3692f4afea2cb41f740e7743225ed1c757" + +[[package]] +name = "ciborium-ll" +version = "0.2.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "57663b653d948a338bfb3eeba9bb2fd5fcfaecb9e199e87e1eda4d9e8b240fd9" +dependencies = [ + "ciborium-io", + "half", +] + 
+[[package]] +name = "clap" +version = "4.6.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "1ddb117e43bbf7dacf0a4190fef4d345b9bad68dfc649cb349e7d17d28428e51" +dependencies = [ + "clap_builder", + "clap_derive", +] + +[[package]] +name = "clap_builder" +version = "4.6.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "714a53001bf66416adb0e2ef5ac857140e7dc3a0c48fb28b2f10762fc4b5069f" +dependencies = [ + "anstream", + "anstyle", + "clap_lex", + "strsim", +] + +[[package]] +name = "clap_derive" +version = "4.6.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "f2ce8604710f6733aa641a2b3731eaa1e8b3d9973d5e3565da11800813f997a9" +dependencies = [ + "heck", + "proc-macro2", + "quote", + "syn", +] + +[[package]] +name = "clap_lex" +version = "1.1.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "c8d4a3bb8b1e0c1050499d1815f5ab16d04f0959b233085fb31653fbfc9d98f9" + +[[package]] +name = "colorchoice" +version = "1.0.5" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "1d07550c9036bf2ae0c684c4297d503f838287c83c53686d05370d0e139ae570" + +[[package]] +name = "criterion" +version = "0.5.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "f2b12d017a929603d80db1831cd3a24082f8137ce19c69e6447f54f5fc8d692f" +dependencies = [ + "anes", + "cast", + "ciborium", + "clap", + "criterion-plot", + "is-terminal", + "itertools", + "num-traits", + "once_cell", + "oorandom", + "plotters", + "rayon", + "regex", + "serde", + "serde_derive", + "serde_json", + "tinytemplate", + "walkdir", +] + +[[package]] +name = "criterion-plot" +version = "0.5.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "6b50826342786a51a89e2da3a28f1c32b06e387201bc2d19791f622c673706b1" +dependencies = [ + "cast", + "itertools", +] + +[[package]] +name = "crossbeam-deque" +version = "0.8.6" +source = 
"registry+https://github.com/rust-lang/crates.io-index" +checksum = "9dd111b7b7f7d55b72c0a6ae361660ee5853c9af73f70c3c2ef6858b950e2e51" +dependencies = [ + "crossbeam-epoch", + "crossbeam-utils", +] + +[[package]] +name = "crossbeam-epoch" +version = "0.9.18" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "5b82ac4a3c2ca9c3460964f020e1402edd5753411d7737aa39c3714ad1b5420e" +dependencies = [ + "crossbeam-utils", +] + +[[package]] +name = "crossbeam-utils" +version = "0.8.21" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "d0a5c400df2834b80a4c3327b3aad3a4c4cd4de0629063962b03235697506a28" + +[[package]] +name = "crunchy" +version = "0.2.4" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "460fbee9c2c2f33933d720630a6a0bac33ba7053db5344fac858d4b8952d77d5" + +[[package]] +name = "either" +version = "1.16.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "91622ff5e7162018101f2fea40d6ebf4a78bbe5a49736a2020649edf9693679e" + +[[package]] +name = "equivalent" +version = "1.0.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "877a4ace8713b0bcf2a4e7eec82529c029f1d0619886d18145fea96c3ffe5c0f" + +[[package]] +name = "errno" +version = "0.3.14" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "39cab71617ae0d63f51a36d69f866391735b51691dbda63cf6f96d042b63efeb" +dependencies = [ + "libc", + "windows-sys", +] + +[[package]] +name = "fallible-iterator" +version = "0.3.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "2acce4a10f12dc2fb14a218589d4f1f62ef011b2d0cc4b3cb1bba8e94da14649" + +[[package]] +name = "fallible-streaming-iterator" +version = "0.1.9" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "7360491ce676a36bf9bb3c56c1aa791658183a54d2744120f27285738d90465a" + +[[package]] +name = "find-msvc-tools" +version = "0.1.9" +source = 
"registry+https://github.com/rust-lang/crates.io-index" +checksum = "5baebc0774151f905a1a2cc41989300b1e6fbb29aff0ceffa1064fdd3088d582" + +[[package]] +name = "futures-core" +version = "0.3.32" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "7e3450815272ef58cec6d564423f6e755e25379b217b0bc688e295ba24df6b1d" + +[[package]] +name = "futures-task" +version = "0.3.32" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "037711b3d59c33004d3856fbdc83b99d4ff37a24768fa1be9ce3538a1cde4393" + +[[package]] +name = "futures-util" +version = "0.3.32" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "389ca41296e6190b48053de0321d02a77f32f8a5d2461dd38762c0593805c6d6" +dependencies = [ + "futures-core", + "futures-task", + "pin-project-lite", + "slab", +] + +[[package]] +name = "half" +version = "2.7.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "6ea2d84b969582b4b1864a92dc5d27cd2b77b622a8d79306834f1be5ba20d84b" +dependencies = [ + "cfg-if", + "crunchy", + "zerocopy", +] + +[[package]] +name = "hashbrown" +version = "0.14.5" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e5274423e17b7c9fc20b6e7e208532f9b19825d82dfd615708b70edd83df41f1" +dependencies = [ + "ahash", +] + +[[package]] +name = "hashbrown" +version = "0.17.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "ed5909b6e89a2db4456e54cd5f673791d7eca6732202bbf2a9cc504fe2f9b84a" + +[[package]] +name = "hashlink" +version = "0.9.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "6ba4ff7128dee98c7dc9794b6a411377e1404dba1c97deb8d1a55297bd25d8af" +dependencies = [ + "hashbrown 0.14.5", +] + +[[package]] +name = "heck" +version = "0.5.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "2304e00983f87ffb38b55b444b5e3b60a884b5d30c0fca7d82fe33449bbe55ea" + +[[package]] +name = "hermit-abi" +version 
= "0.5.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "fc0fef456e4baa96da950455cd02c081ca953b141298e41db3fc7e36b1da849c" + +[[package]] +name = "indexmap" +version = "2.14.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "d466e9454f08e4a911e14806c24e16fba1b4c121d1ea474396f396069cf949d9" +dependencies = [ + "equivalent", + "hashbrown 0.17.1", +] + +[[package]] +name = "is-terminal" +version = "0.4.17" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "3640c1c38b8e4e43584d8df18be5fc6b0aa314ce6ebf51b53313d4306cca8e46" +dependencies = [ + "hermit-abi", + "libc", + "windows-sys", +] + +[[package]] +name = "is_terminal_polyfill" +version = "1.70.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "a6cb138bb79a146c1bd460005623e142ef0181e3d0219cb493e02f7d08a35695" + +[[package]] +name = "itertools" +version = "0.10.5" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "b0fd2260e829bddf4cb6ea802289de2f86d6a7a690192fbe91b3f46e0f2c8473" +dependencies = [ + "either", +] + +[[package]] +name = "itoa" +version = "1.0.18" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8f42a60cbdf9a97f5d2305f08a87dc4e09308d1276d28c869c684d7777685682" + +[[package]] +name = "js-sys" +version = "0.3.100" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "f2025f20d7a4fa7785846e7b63d10a76d3f1cee98ee5cb79ea59703f95e42162" +dependencies = [ + "cfg-if", + "futures-util", + "wasm-bindgen", +] + +[[package]] +name = "libc" +version = "0.2.186" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "68ab91017fe16c622486840e4c83c9a37afeff978bd239b5293d61ece587de66" + +[[package]] +name = "libsqlite3-sys" +version = "0.30.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "2e99fb7a497b1e3339bc746195567ed8d3e24945ecd636e3619d20b9de9e9149" +dependencies 
= [ + "cc", + "pkg-config", + "vcpkg", +] + +[[package]] +name = "linux-raw-sys" +version = "0.12.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "32a66949e030da00e8c7d4434b251670a91556f4144941d37452769c25d58a53" + +[[package]] +name = "memchr" +version = "2.8.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "6b947ae49db0d222b1dbc6b113ce7248a3fc3a6ca21b696717bfc000ba4484d8" + +[[package]] +name = "num-traits" +version = "0.2.19" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "071dfc062690e90b734c0b2273ce72ad0ffa95f0c74596bc250dcfd960262841" +dependencies = [ + "autocfg", +] + +[[package]] +name = "once_cell" +version = "1.21.4" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9f7c3e4beb33f85d45ae3e3a1792185706c8e16d043238c593331cc7cd313b50" + +[[package]] +name = "once_cell_polyfill" +version = "1.70.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "384b8ab6d37215f3c5301a95a4accb5d64aa607f1fcb26a11b5303878451b4fe" + +[[package]] +name = "oorandom" +version = "11.1.5" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "d6790f58c7ff633d8771f42965289203411a5e5c68388703c06e14f24770b41e" + +[[package]] +name = "pin-project-lite" +version = "0.2.17" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "a89322df9ebe1c1578d689c92318e070967d1042b512afbe49518723f4e6d5cd" + +[[package]] +name = "pkg-config" +version = "0.3.33" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "19f132c84eca552bf34cab8ec81f1c1dcc229b811638f9d283dceabe58c5569e" + +[[package]] +name = "plotters" +version = "0.3.7" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "5aeb6f403d7a4911efb1e33402027fc44f29b5bf6def3effcc22d7bb75f2b747" +dependencies = [ + "num-traits", + "plotters-backend", + "plotters-svg", + "wasm-bindgen", + "web-sys", +] 
+ +[[package]] +name = "plotters-backend" +version = "0.3.7" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "df42e13c12958a16b3f7f4386b9ab1f3e7933914ecea48da7139435263a4172a" + +[[package]] +name = "plotters-svg" +version = "0.3.7" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "51bae2ac328883f7acdfea3d66a7c35751187f870bc81f94563733a154d7a670" +dependencies = [ + "plotters-backend", +] + +[[package]] +name = "proc-macro2" +version = "1.0.106" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8fd00f0bb2e90d81d1044c2b32617f68fcb9fa3bb7640c23e9c748e53fb30934" +dependencies = [ + "unicode-ident", +] + +[[package]] +name = "quote" +version = "1.0.45" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "41f2619966050689382d2b44f664f4bc593e129785a36d6ee376ddf37259b924" +dependencies = [ + "proc-macro2", +] + +[[package]] +name = "rayon" +version = "1.12.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "fb39b166781f92d482534ef4b4b1b2568f42613b53e5b6c160e24cfbfa30926d" +dependencies = [ + "either", + "rayon-core", +] + +[[package]] +name = "rayon-core" +version = "1.13.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "22e18b0f0062d30d4230b2e85ff77fdfe4326feb054b9783a3460d8435c8ab91" +dependencies = [ + "crossbeam-deque", + "crossbeam-utils", +] + +[[package]] +name = "regex" +version = "1.12.4" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "f1292b7759ae1cb9ec195452d1390a074f0cd8541ab7a5a8c31cd6db45d4a6ba" +dependencies = [ + "aho-corasick", + "memchr", + "regex-automata", + "regex-syntax", +] + +[[package]] +name = "regex-automata" +version = "0.4.14" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "6e1dd4122fc1595e8162618945476892eefca7b88c52820e74af6262213cae8f" +dependencies = [ + "aho-corasick", + "memchr", + "regex-syntax", +] + 
+[[package]] +name = "regex-syntax" +version = "0.8.11" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "d6f6ff9a378485b298a5286656da665ba74413d36db0979633275d2e708145d4" + +[[package]] +name = "rusqlite" +version = "0.32.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "7753b721174eb8ff87a9a0e799e2d7bc3749323e773db92e0984debb00019d6e" +dependencies = [ + "bitflags", + "fallible-iterator", + "fallible-streaming-iterator", + "hashlink", + "libsqlite3-sys", + "smallvec", +] + +[[package]] +name = "rustix" +version = "1.1.4" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "b6fe4565b9518b83ef4f91bb47ce29620ca828bd32cb7e408f0062e9930ba190" +dependencies = [ + "bitflags", + "errno", + "libc", + "linux-raw-sys", + "windows-sys", +] + +[[package]] +name = "rustversion" +version = "1.0.22" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "b39cdef0fa800fc44525c84ccb54a029961a8215f9619753635a9c0d2538d46d" + +[[package]] +name = "ryu" +version = "1.0.23" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9774ba4a74de5f7b1c1451ed6cd5285a32eddb5cccb8cc655a4e50009e06477f" + +[[package]] +name = "same-file" +version = "1.0.6" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "93fc1dc3aaa9bfed95e02e6eadabb4baf7e3078b0bd1b4d7b6b0b68378900502" +dependencies = [ + "winapi-util", +] + +[[package]] +name = "serde" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9a8e94ea7f378bd32cbbd37198a4a91436180c5bb472411e48b5ec2e2124ae9e" +dependencies = [ + "serde_core", + "serde_derive", +] + +[[package]] +name = "serde_core" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "41d385c7d4ca58e59fc732af25c3983b67ac852c1a25000afe1175de458b67ad" +dependencies = [ + "serde_derive", +] + +[[package]] +name = "serde_derive" +version 
= "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "d540f220d3187173da220f885ab66608367b6574e925011a9353e4badda91d79" +dependencies = [ + "proc-macro2", + "quote", + "syn", +] + +[[package]] +name = "serde_json" +version = "1.0.150" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e8014e44b4736ed0538adeecded0fce2a272f22dc9578a7eb6b2d9993c74cfb9" +dependencies = [ + "indexmap", + "itoa", + "memchr", + "serde", + "serde_core", + "zmij", +] + +[[package]] +name = "serde_spanned" +version = "0.6.9" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "bf41e0cfaf7226dca15e8197172c295a782857fcb97fad1808a166870dee75a3" +dependencies = [ + "serde", +] + +[[package]] +name = "serde_yaml" +version = "0.9.34+deprecated" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "6a8b1a1a2ebf674015cc02edccce75287f1a0130d394307b36743c2f5d504b47" +dependencies = [ + "indexmap", + "itoa", + "ryu", + "serde", + "unsafe-libyaml", +] + +[[package]] +name = "shlex" +version = "2.0.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "f8fadd59c855ef2080decdef8ff161eb6661b86933c9d82e5ba29dc602a55aba" + +[[package]] +name = "slab" +version = "0.4.12" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "0c790de23124f9ab44544d7ac05d60440adc586479ce501c1d6d7da3cd8c9cf5" + +[[package]] +name = "smallvec" +version = "1.15.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "67b1b7a3b5fe4f1376887184045fcf45c69e92af734b7aaddc05fb777b6fbd03" + +[[package]] +name = "streaming-iterator" +version = "0.1.9" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "2b2231b7c3057d5e4ad0156fb3dc807d900806020c5ffa3ee6ff2c8c76fb8520" + +[[package]] +name = "strsim" +version = "0.11.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = 
"7da8b5736845d9f2fcb837ea5d9e2628564b3b043a70948a3f0b778838c5fb4f" + +[[package]] +name = "syn" +version = "2.0.117" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e665b8803e7b1d2a727f4023456bbbbe74da67099c585258af0ad9c5013b9b99" +dependencies = [ + "proc-macro2", + "quote", + "unicode-ident", +] + +[[package]] +name = "terminal_size" +version = "0.4.4" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "230a1b821ccbd75b185820a1f1ff7b14d21da1e442e22c0863ea5f08771a8874" +dependencies = [ + "rustix", + "windows-sys", +] + +[[package]] +name = "thiserror" +version = "2.0.18" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "4288b5bcbc7920c07a1149a35cf9590a2aa808e0bc1eafaade0b80947865fbc4" +dependencies = [ + "thiserror-impl", +] + +[[package]] +name = "thiserror-impl" +version = "2.0.18" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "ebc4ee7f67670e9b64d05fa4253e753e016c6c95ff35b89b7941d6b856dec1d5" +dependencies = [ + "proc-macro2", + "quote", + "syn", +] + +[[package]] +name = "tinytemplate" +version = "1.2.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "be4d6b5f19ff7664e8c98d03e2139cb510db9b0a60b55f8e8709b689d939b6bc" +dependencies = [ + "serde", + "serde_json", +] + +[[package]] +name = "toml" +version = "0.8.23" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "dc1beb996b9d83529a9e75c17a1686767d148d70663143c7854d8b4a09ced362" +dependencies = [ + "indexmap", + "serde", + "serde_spanned", + "toml_datetime", + "toml_edit", +] + +[[package]] +name = "toml_datetime" +version = "0.6.11" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "22cddaf88f4fbc13c51aebbf5f8eceb5c7c5a9da2ac40a13519eb5b0a0e8f11c" +dependencies = [ + "serde", +] + +[[package]] +name = "toml_edit" +version = "0.22.27" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum 
= "41fe8c660ae4257887cf66394862d21dbca4a6ddd26f04a3560410406a2f819a" +dependencies = [ + "indexmap", + "serde", + "serde_spanned", + "toml_datetime", + "toml_write", + "winnow", +] + +[[package]] +name = "toml_write" +version = "0.1.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "5d99f8c9a7727884afe522e9bd5edbfc91a3312b36a77b5fb8926e4c31a41801" + +[[package]] +name = "tree-sitter" +version = "0.25.10" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "78f873475d258561b06f1c595d93308a7ed124d9977cb26b148c2084a4a3cc87" +dependencies = [ + "cc", + "regex", + "regex-syntax", + "serde_json", + "streaming-iterator", + "tree-sitter-language", +] + +[[package]] +name = "tree-sitter-c" +version = "0.24.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "a9b2eb57a55fed6b00812912e730b7a275cf4fe98bfd6a5d76263d4438371728" +dependencies = [ + "cc", + "tree-sitter-language", +] + +[[package]] +name = "tree-sitter-cpp" +version = "0.23.4" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "df2196ea9d47b4ab4a31b9297eaa5a5d19a0b121dceb9f118f6790ad0ab94743" +dependencies = [ + "cc", + "tree-sitter-language", +] + +[[package]] +name = "tree-sitter-javascript" +version = "0.25.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "68204f2abc0627a90bdf06e605f5c470aa26fdcb2081ea553a04bdad756693f5" +dependencies = [ + "cc", + "tree-sitter-language", +] + +[[package]] +name = "tree-sitter-language" +version = "0.1.7" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "009994f150cc0cd50ff54917d5bc8bffe8cad10ca10d81c34da2ec421ae61782" + +[[package]] +name = "tree-sitter-python" +version = "0.25.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "6bf85fd39652e740bf60f46f4cda9492c3a9ad75880575bf14960f775cb74a1c" +dependencies = [ + "cc", + "tree-sitter-language", +] + +[[package]] +name = 
"tree-sitter-rust" +version = "0.24.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "439e577dbe07423ec2582ac62c7531120dbfccfa6e5f92406f93dd271a120e45" +dependencies = [ + "cc", + "tree-sitter-language", +] + +[[package]] +name = "tree-sitter-typescript" +version = "0.23.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "6c5f76ed8d947a75cc446d5fccd8b602ebf0cde64ccf2ffa434d873d7a575eff" +dependencies = [ + "cc", + "tree-sitter-language", +] + +[[package]] +name = "tree_plus_cli" +version = "2.0.0-alpha.1" +dependencies = [ + "clap", + "terminal_size", + "tree_plus_core", +] + +[[package]] +name = "tree_plus_core" +version = "2.0.0-alpha.1" +dependencies = [ + "criterion", + "rayon", + "regex", + "rusqlite", + "serde", + "serde_json", + "serde_yaml", + "thiserror", + "toml", + "tree-sitter", + "tree-sitter-c", + "tree-sitter-cpp", + "tree-sitter-javascript", + "tree-sitter-python", + "tree-sitter-rust", + "tree-sitter-typescript", + "unicode-width", +] + +[[package]] +name = "unicode-ident" +version = "1.0.24" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e6e4313cd5fcd3dad5cafa179702e2b244f760991f45397d14d4ebf38247da75" + +[[package]] +name = "unicode-width" +version = "0.2.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "b4ac048d71ede7ee76d585517add45da530660ef4390e49b098733c6e897f254" + +[[package]] +name = "unsafe-libyaml" +version = "0.2.11" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "673aac59facbab8a9007c7f6108d11f63b603f7cabff99fabf650fea5c32b861" + +[[package]] +name = "utf8parse" +version = "0.2.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "06abde3611657adf66d383f00b093d7faecc7fa57071cce2578660c9f1010821" + +[[package]] +name = "vcpkg" +version = "0.2.15" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = 
"accd4ea62f7bb7a82fe23066fb0957d48ef677f6eeb8215f372f52e48bb32426" + +[[package]] +name = "version_check" +version = "0.9.5" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "0b928f33d975fc6ad9f86c8f283853ad26bdd5b10b7f1542aa2fa15e2289105a" + +[[package]] +name = "walkdir" +version = "2.5.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "29790946404f91d9c5d06f9874efddea1dc06c5efe94541a7d6863108e3a5e4b" +dependencies = [ + "same-file", + "winapi-util", +] + +[[package]] +name = "wasm-bindgen" +version = "0.2.123" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "a254a4b10c19a76f09a27640e7ffbf9bc30bf67e16a3bf28aaefa4920fe81563" +dependencies = [ + "cfg-if", + "once_cell", + "rustversion", + "wasm-bindgen-macro", + "wasm-bindgen-shared", +] + +[[package]] +name = "wasm-bindgen-macro" +version = "0.2.123" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "24a40fc75b0ec6f3746ceb10d36f53a93dcd68a93b11b6445983945d79eba0dc" +dependencies = [ + "quote", + "wasm-bindgen-macro-support", +] + +[[package]] +name = "wasm-bindgen-macro-support" +version = "0.2.123" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "908f34bd9b9ce3d4caf07b72dfab63d61504d156856c6bd3cd87fa350cf3985b" +dependencies = [ + "bumpalo", + "proc-macro2", + "quote", + "syn", + "wasm-bindgen-shared", +] + +[[package]] +name = "wasm-bindgen-shared" +version = "0.2.123" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "7acbf7616c27b194bbb550bf77ed0c2c3e5b7fd1260a93082b95fb7f47959b92" +dependencies = [ + "unicode-ident", +] + +[[package]] +name = "web-sys" +version = "0.3.100" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "6e0871acf327f283dc6da28a1696cdc64fb355ba9f935d052021fa77f35cce69" +dependencies = [ + "js-sys", + "wasm-bindgen", +] + +[[package]] +name = "winapi-util" +version = "0.1.11" +source = 
"registry+https://github.com/rust-lang/crates.io-index" +checksum = "c2a7b1c03c876122aa43f3020e6c3c3ee5c05081c9a00739faf7503aeba10d22" +dependencies = [ + "windows-sys", +] + +[[package]] +name = "windows-link" +version = "0.2.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "f0805222e57f7521d6a62e36fa9163bc891acd422f971defe97d64e70d0a4fe5" + +[[package]] +name = "windows-sys" +version = "0.61.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "ae137229bcbd6cdf0f7b80a31df61766145077ddf49416a728b02cb3921ff3fc" +dependencies = [ + "windows-link", +] + +[[package]] +name = "winnow" +version = "0.7.15" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "df79d97927682d2fd8adb29682d1140b343be4ac0f08fd68b7765d9c059d3945" +dependencies = [ + "memchr", +] + +[[package]] +name = "zerocopy" +version = "0.8.50" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "3b065d4f0e55f82fae73202e189638116a87c55ab6b8e6c2721e13dd9d854ad1" +dependencies = [ + "zerocopy-derive", +] + +[[package]] +name = "zerocopy-derive" +version = "0.8.50" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "0b631b19d36a892ab55420c92dbc83ccd79274f25be714855d3074aa71cab639" +dependencies = [ + "proc-macro2", + "quote", + "syn", +] + +[[package]] +name = "zmij" +version = "1.0.21" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "b8848ee67ecc8aedbaf3e4122217aff892639231befc6a1b58d29fff4c2cabaa" diff --git a/Cargo.toml b/Cargo.toml new file mode 100644 index 0000000..afff7ec --- /dev/null +++ b/Cargo.toml @@ -0,0 +1,29 @@ +[workspace] +resolver = "2" +members = ["crates/tree_plus_core", "crates/tree_plus_cli"] + +[workspace.package] +version = "2.0.0-alpha.1" +edition = "2021" +license = "MIT OR Apache-2.0" +repository = "https://github.com/bionicles/tree_plus" + +[workspace.dependencies] +regex = "1" +serde = { version = "1", features = 
["derive"] } +serde_json = { version = "1", features = ["preserve_order"] } +toml = { version = "0.8", features = ["preserve_order"] } +serde_yaml = "0.9" +unicode-width = "0.2" +rayon = "1.10" +thiserror = "2" +clap = { version = "4.5", features = ["derive"] } +tree-sitter = "0.25" +tree-sitter-rust = "0.24" +tree-sitter-python = "0.25" +tree-sitter-javascript = "0.25" +tree-sitter-typescript = "0.23" +tree-sitter-c = "0.24" +tree-sitter-cpp = "0.23" +rusqlite = { version = "0.32", features = ["bundled"] } +criterion = "0.5" diff --git a/crates/tree_plus_cli/Cargo.toml b/crates/tree_plus_cli/Cargo.toml new file mode 100644 index 0000000..44b5c50 --- /dev/null +++ b/crates/tree_plus_cli/Cargo.toml @@ -0,0 +1,21 @@ +[package] +name = "tree_plus_cli" +description = "A `tree` util enhanced with tokens, lines, and components." +version.workspace = true +edition.workspace = true +license.workspace = true +repository.workspace = true + +[[bin]] +name = "tree_plus" +path = "src/main.rs" + +# alias to avoid PATH collision with the Python tree_plus entry point +[[bin]] +name = "tprs" +path = "src/main.rs" + +[dependencies] +tree_plus_core = { path = "../tree_plus_core" } +clap.workspace = true +terminal_size = "0.4" diff --git a/crates/tree_plus_cli/src/main.rs b/crates/tree_plus_cli/src/main.rs new file mode 100644 index 0000000..f0f5af5 --- /dev/null +++ b/crates/tree_plus_cli/src/main.rs @@ -0,0 +1,173 @@ +//! tree_plus CLI: a `tree` util enhanced with tokens, lines, and components. +//! +//! Flag-compatible with the legacy Python click CLI for the version-1 scope +//! (local filesystem mode). Web/HN modes are deferred; see +//! docs/rust-port-differences.md. 
+ +use std::time::Instant; + +use clap::Parser; + +use tree_plus_core::count::TokenizerName; +use tree_plus_core::{from_seeds, render_to_string, TreePlusConfig}; + +const VERSION: &str = env!("CARGO_PKG_VERSION"); + +#[derive(Parser, Debug)] +#[command( + name = "tree_plus", + disable_version_flag = true, + about = "A `tree` util enhanced with tokens, lines, and components.", + after_help = format!("v({VERSION}) --- https://github.com/bionicles/tree_plus/blob/main/README.md"), +)] +struct Cli { + /// Patterns to ignore, in quotes: -i "*.java" + #[arg(short = 'i', long = "ignore")] + ignore: Vec<String>, + + /// Override DEFAULT_IGNORE (includes ignored content): -o -i "*.java" + #[arg(short = 'o', long = "override")] + override_ignore: bool, + + /// Patterns to find, in quotes: -g "*.rs" + #[arg(short = 'g', long = "glob")] + glob: Vec<String>, + + /// Print the version and exit. + #[arg(short = 'v', long = "version")] + version: bool, + + /// Enables debug output. + #[arg(short = 'd', long = "debug")] + debug: bool, + + /// DISABLE Syntax Highlighting. (The Rust port renders plain text; this + /// flag is accepted for compatibility and controls legacy markup escaping.) + #[arg(short = 's', long = "syntax")] + syntax: bool, + + /// Omit module components. (False) + #[arg(short = 'c', long = "concise")] + concise: bool, + + /// A shorthand for tiktoken with the 'gpt-4o' tokenizer (unsupported in + /// the Rust port: errors explicitly). + #[arg(short = 't', long = "tiktoken")] + tiktoken: bool, + + /// Name of the tokenizer to use; only 'wc' is supported in the Rust port. + #[arg(short = 'T', long = "tokenizer-name")] + tokenizer_name: Option<String>, + + /// Regex timeout in seconds (accepted for compatibility; the Rust port + /// does not need regex timeouts). + #[arg(long = "timeout")] + timeout: Option, + + /// Paths or globs to map. 
+ paths: Vec<String>, +} + +fn main() { + let start = Instant::now(); + let cli = Cli::parse(); + + if cli.version { + println!("{VERSION}"); + return; + } + + let tokenizer = match (cli.tiktoken, cli.tokenizer_name.as_deref()) { + (false, None) | (true, Some("wc")) | (false, Some("wc")) => TokenizerName::Wc, + (true, None) | (_, Some("gpt4o")) | (_, Some("gpt-4o")) => { + eprintln!( + "error: the gpt-4o tokenizer is not supported by the Rust port; \ + use the default word-count tokenizer (wc)" + ); + std::process::exit(2); + } + (_, Some(other)) => { + eprintln!("error: unsupported tokenizer {other:?} (only 'wc' is supported)"); + std::process::exit(2); + } + }; + + let config = TreePlusConfig { + ignore: cli.ignore.clone(), + override_ignore: cli.override_ignore, + globs: cli.glob.clone(), + concise: cli.concise, + tokenizer, + syntax: false, + max_tokens: tree_plus_core::config::MAX_TOKENS, + }; + if cli.debug { + eprintln!( + "tree_plus main received paths={:?} ignore={:?} glob={:?}", + cli.paths, cli.ignore, cli.glob + ); + } + let _ = cli.timeout; // compatibility no-op + let _ = cli.syntax; // plain-text output; highlighting deferred + + let root = from_seeds(&cli.paths, &config); + + let width = terminal_width(); + print!("{}", render_to_string(&root, width)); + + let og_ignore = format_tuple(&cli.ignore); + let globs = format_tuple(&cli.glob); + let paths = format_tuple(&cli.paths); + let mut footer = format!("\ntree_plus v({VERSION}) ignore={og_ignore} globs={globs}"); + if cli.concise { + footer.push_str(&format!(" concise=True paths={paths}")); + } else { + footer.push_str(&format!(" syntax={} paths={paths}", py_bool(!cli.syntax))); + } + footer.push_str(&format!( + "\n{} in {:.2} second(s).", + root.stats(), + start.elapsed().as_secs_f64() + )); + println!("{footer}"); +} + +/// Python tuple repr for the footer line, e.g. `('a',)` / `()` / `None`. 
+fn format_tuple(items: &[String]) -> String { + if items.is_empty() { + return "()".to_string(); + } + let inner: Vec<String> = items.iter().map(|s| format!("'{s}'")).collect(); + if items.len() == 1 { + format!("({},)", inner[0]) + } else { + format!("({})", inner.join(", ")) + } +} + +fn py_bool(b: bool) -> &'static str { + if b { + "True" + } else { + "False" + } +} + +fn terminal_width() -> usize { + use std::io::IsTerminal; + // honor the legacy README-update width override + if std::env::var("TREE_PLUS_UPDATE_README").as_deref() == Ok("YES") { + return 128; + } + // piped output renders at the legacy capture width + if !std::io::stdout().is_terminal() { + return tree_plus_core::DEFAULT_WIDTH; + } + // real terminal width, like the legacy rich Console + if let Some((terminal_size::Width(w), _)) = terminal_size::terminal_size() { + if w > 0 { + return w as usize; + } + } + tree_plus_core::DEFAULT_WIDTH +} diff --git a/crates/tree_plus_cli/tests/cli.rs b/crates/tree_plus_cli/tests/cli.rs new file mode 100644 index 0000000..50ecc6e --- /dev/null +++ b/crates/tree_plus_cli/tests/cli.rs @@ -0,0 +1,92 @@ +//! CLI integration tests mirroring the README examples (local fs mode). 
+ +use std::path::Path; +use std::process::Command; + +fn repo_root() -> std::path::PathBuf { + Path::new(env!("CARGO_MANIFEST_DIR")) + .join("../..") + .canonicalize() + .unwrap() +} + +fn tree_plus(args: &[&str]) -> std::process::Output { + Command::new(env!("CARGO_BIN_EXE_tree_plus")) + .args(args) + .current_dir(repo_root()) + .output() + .expect("run tree_plus") +} + +#[test] +fn version_flag() { + let out = tree_plus(&["-v"]); + assert!(out.status.success()); + let stdout = String::from_utf8_lossy(&out.stdout); + assert!(stdout.trim().starts_with("2."), "version: {stdout}"); +} + +#[test] +fn renders_a_directory_with_stats_footer() { + let out = tree_plus(&["tests/path_to_test"]); + assert!(out.status.success()); + let stdout = String::from_utf8_lossy(&out.stdout); + assert!(stdout.contains("📁 path_to_test (1 folder, 6 files)")); + assert!(stdout.contains("└── def hello_world()")); + assert!(stdout.contains("folder(s), 6 file(s),")); + assert!(stdout.contains("second(s).")); +} + +#[test] +fn ignore_flag_excludes_extension() { + let out = tree_plus(&["-i", "*.py", "tests/path_to_test"]); + assert!(out.status.success()); + let stdout = String::from_utf8_lossy(&out.stdout); + assert!(!stdout.contains(".py (")); + assert!(stdout.contains("file.md")); +} + +#[test] +fn glob_flag_filters() { + let out = tree_plus(&["-g", "*.rs", "tests/more_languages"]); + assert!(out.status.success()); + let stdout = String::from_utf8_lossy(&out.stdout); + assert!(stdout.contains(".rs (")); + assert!(!stdout.contains(".java")); +} + +#[test] +fn concise_mode_omits_components() { + let out = tree_plus(&["-c", "tests/path_to_test"]); + assert!(out.status.success()); + let stdout = String::from_utf8_lossy(&out.stdout); + assert!(stdout.contains("📁 path_to_test")); + assert!(!stdout.contains("def hello_world()")); + assert!(stdout.contains("concise=True")); +} + +#[test] +fn glob_seed_pattern() { + let out = tree_plus(&["tree_plus_src/*.py"]); + assert!(out.status.success()); + let 
stdout = String::from_utf8_lossy(&out.stdout); + assert!(stdout.contains("🌀 tree_plus_src/*.py (")); + assert!(stdout.contains("engine.py")); +} + +#[test] +fn unsupported_tokenizer_errors_explicitly() { + let out = tree_plus(&["-t", "tests/path_to_test"]); + assert!(!out.status.success()); + let stderr = String::from_utf8_lossy(&out.stderr); + assert!(stderr.contains("not supported")); +} + +#[test] +fn renders_this_repository() { + let out = tree_plus(&["-c", "."]); + assert!(out.status.success()); + let stdout = String::from_utf8_lossy(&out.stdout); + assert!(stdout.contains("📁 tree_plus (")); + assert!(stdout.contains("README.md")); +} diff --git a/crates/tree_plus_core/Cargo.toml b/crates/tree_plus_core/Cargo.toml new file mode 100644 index 0000000..89fead3 --- /dev/null +++ b/crates/tree_plus_core/Cargo.toml @@ -0,0 +1,36 @@ +[package] +name = "tree_plus_core" +description = "Core library for tree_plus: map repositories into trees with components, tokens, and lines" +version.workspace = true +edition.workspace = true +license.workspace = true +repository.workspace = true + +[features] +default = ["sqlite"] +sqlite = ["dep:rusqlite"] + +[dependencies] +regex.workspace = true +serde.workspace = true +serde_json.workspace = true +toml.workspace = true +serde_yaml.workspace = true +unicode-width.workspace = true +rayon.workspace = true +thiserror.workspace = true +tree-sitter.workspace = true +tree-sitter-rust.workspace = true +tree-sitter-python.workspace = true +tree-sitter-javascript.workspace = true +tree-sitter-typescript.workspace = true +tree-sitter-c.workspace = true +tree-sitter-cpp.workspace = true +rusqlite = { workspace = true, optional = true } + +[dev-dependencies] +criterion.workspace = true + +[[bench]] +name = "tree_plus_bench" +harness = false diff --git a/crates/tree_plus_core/benches/tree_plus_bench.rs b/crates/tree_plus_core/benches/tree_plus_bench.rs new file mode 100644 index 0000000..1600a49 --- /dev/null +++ 
b/crates/tree_plus_core/benches/tree_plus_bench.rs @@ -0,0 +1,78 @@ +//! Criterion benchmarks: traversal+counting, extraction, and full renders. +//! +//! Run: cargo bench -p tree_plus_core +//! Benchmark any repository: cargo bench -p tree_plus_core -- --quick +//! (set TREE_PLUS_BENCH_PATH=/path/to/repo to point at a different tree). + +use std::path::{Path, PathBuf}; + +use criterion::{criterion_group, criterion_main, Criterion}; +use std::hint::black_box; + +use tree_plus_core::{extract_components, from_seeds, render_to_string, TreePlusConfig}; + +fn repo_root() -> PathBuf { + Path::new(env!("CARGO_MANIFEST_DIR")) + .join("../..") + .canonicalize() + .unwrap() +} + +fn bench_target() -> PathBuf { + std::env::var("TREE_PLUS_BENCH_PATH") + .map(PathBuf::from) + .unwrap_or_else(|_| repo_root()) +} + +fn full_tree(c: &mut Criterion) { + let target = bench_target(); + let seeds = vec![target.to_string_lossy().into_owned()]; + let config = TreePlusConfig::default(); + c.bench_function("full_tree_with_components", |b| { + b.iter(|| { + let tree = from_seeds(black_box(&seeds), &config); + black_box(render_to_string(&tree, 80)) + }) + }); +} + +fn concise_tree(c: &mut Criterion) { + let target = bench_target(); + let seeds = vec![target.to_string_lossy().into_owned()]; + let config = TreePlusConfig { + concise: true, + ..Default::default() + }; + c.bench_function("concise_tree_traversal_counting", |b| { + b.iter(|| { + let tree = from_seeds(black_box(&seeds), &config); + black_box(render_to_string(&tree, 80)) + }) + }); +} + +fn single_file_extraction(c: &mut Criterion) { + let root = repo_root(); + let cases = [ + ("rust", "tests/more_languages/group4/rust_test.rs"), + ("python", "tests/path_to_test/class_method_type.py"), + ("typescript", "tests/more_languages/group1/test.ts"), + ( + "c_pathological", + "tests/more_languages/group6/catastrophic.c", + ), + ("markdown", "tests/more_languages/group_todo/todo.md"), + ]; + for (name, rel) in cases { + let path = 
root.join(rel); + if !path.exists() { + continue; + } + c.bench_function(&format!("extract_{name}"), |b| { + b.iter(|| black_box(extract_components(black_box(&path), false))) + }); + } +} + +criterion_group!(benches, full_tree, concise_tree, single_file_extraction); +criterion_main!(benches); diff --git a/crates/tree_plus_core/examples/dump_ast.rs b/crates/tree_plus_core/examples/dump_ast.rs new file mode 100644 index 0000000..436cea7 --- /dev/null +++ b/crates/tree_plus_core/examples/dump_ast.rs @@ -0,0 +1,55 @@ +//! Debug helper: dump a tree-sitter AST outline. +//! Usage: cargo run -p tree_plus_core --example dump_ast -- [start_line] [end_line] + +fn main() { + let args: Vec<String> = std::env::args().skip(1).collect(); + let path = &args[0]; + let lo: usize = args.get(1).and_then(|s| s.parse().ok()).unwrap_or(0); + let hi: usize = args + .get(2) + .and_then(|s| s.parse().ok()) + .unwrap_or(usize::MAX); + let content = std::fs::read_to_string(path).unwrap(); + let ext = format!( + ".{}", + std::path::Path::new(path) + .extension() + .unwrap() + .to_string_lossy() + ); + let language: tree_sitter::Language = match ext.as_str() { + ".cpp" | ".cc" | ".cu" | ".hpp" => tree_sitter_cpp::LANGUAGE.into(), + ".c" | ".h" => tree_sitter_c::LANGUAGE.into(), + ".py" => tree_sitter_python::LANGUAGE.into(), + ".rs" => tree_sitter_rust::LANGUAGE.into(), + ".js" | ".ts" => tree_sitter_typescript::LANGUAGE_TYPESCRIPT.into(), + other => panic!("no grammar for {other}"), + }; + let mut parser = tree_sitter::Parser::new(); + parser.set_language(&language).unwrap(); + let tree = parser.parse(&content, None).unwrap(); + let mut stack = vec![(tree.root_node(), 0usize)]; + while let Some((node, depth)) = stack.pop() { + let row = node.start_position().row + 1; + if row >= lo && row <= hi { + let text: String = content[node.byte_range()] + .chars() + .take(60) + .collect::<String>() + .replace('\n', "\\n"); + println!( + "{}{} [{}..{}] {:?}", + " ".repeat(depth), + node.kind(), + row, + 
node.end_position().row + 1, + text + ); + } + let mut cursor = node.walk(); + let children: Vec<_> = node.named_children(&mut cursor).collect(); + for child in children.into_iter().rev() { + stack.push((child, depth + 1)); + } + } +} diff --git a/crates/tree_plus_core/examples/extract.rs b/crates/tree_plus_core/examples/extract.rs new file mode 100644 index 0000000..2152e15 --- /dev/null +++ b/crates/tree_plus_core/examples/extract.rs @@ -0,0 +1,16 @@ +//! Debug helper: print extracted components as JSON for given paths. +//! Usage: cargo run -p tree_plus_core --example extract -- ... + +fn main() { + for arg in std::env::args().skip(1) { + let components = tree_plus_core::extract_components(std::path::Path::new(&arg), false); + println!( + "{}", + serde_json::to_string_pretty(&serde_json::json!({ + "path": arg, + "components": components, + })) + .unwrap() + ); + } +} diff --git a/crates/tree_plus_core/src/config.rs b/crates/tree_plus_core/src/config.rs new file mode 100644 index 0000000..346d50c --- /dev/null +++ b/crates/tree_plus_core/src/config.rs @@ -0,0 +1,39 @@ +//! Configuration for tree building (CLI flags map onto this). + +use crate::count::TokenizerName; + +/// Maximum token count before a file's components are skipped +/// (legacy `MAX_TOKENS`). +pub const MAX_TOKENS: u64 = 1_000_000; + +#[derive(Debug, Clone)] +pub struct TreePlusConfig { + /// User ignore patterns (unioned with defaults unless `override_ignore`). + pub ignore: Vec<String>, + /// Use only user ignore patterns. + pub override_ignore: bool, + /// Glob patterns to keep (e.g. `*.rs`). + pub globs: Vec<String>, + /// Skip component extraction entirely. + pub concise: bool, + /// Tokenizer (only `Wc` is supported in the Rust port). + pub tokenizer: TokenizerName, + /// Legacy syntax-highlighting flag; affects Rust enum markup escaping. + pub syntax: bool, + /// Maximum tokens before skipping component extraction. 
+ pub max_tokens: u64, +} + +impl Default for TreePlusConfig { + fn default() -> Self { + TreePlusConfig { + ignore: Vec::new(), + override_ignore: false, + globs: Vec::new(), + concise: false, + tokenizer: TokenizerName::Wc, + syntax: false, + max_tokens: MAX_TOKENS, + } + } +} diff --git a/crates/tree_plus_core/src/count.rs b/crates/tree_plus_core/src/count.rs new file mode 100644 index 0000000..dcc21be --- /dev/null +++ b/crates/tree_plus_core/src/count.rs @@ -0,0 +1,203 @@ +//! Token and line counting, matching the legacy `wc -ml`-based counter. +//! +//! Legacy default tokenizer ("wc"): `n_lines` = newline count, `n_tokens` = +//! character count / 4 (1 token ~= 4 chars). Characters are Unicode scalar +//! values for valid UTF-8; invalid bytes each count as one character, which +//! matches `wc -m` behavior in a UTF-8 locale closely enough for parity. + +use std::collections::HashSet; +use std::path::Path; +use std::sync::LazyLock; + +#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)] +pub struct TokenLineCount { + pub n_tokens: u64, + pub n_lines: u64, +} + +/// Tokenizer selection mirroring the legacy `TokenizerName` enum. +#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)] +pub enum TokenizerName { + #[default] + Wc, + /// Recognized but unsupported in the Rust port; selecting it is an + /// explicit error per the port requirements. + Gpt4o, +} + +/// Extensions whose contents are never counted (legacy `extensions_not_to_count`). 
+static EXTENSIONS_NOT_TO_COUNT: LazyLock<HashSet<&'static str>> = LazyLock::new(|| { + [ + ".stl", + ".obj", + ".gcode", + ".3mf", + ".amf", + ".ply", + ".f3d", + ".iges", + ".igs", + "step", + ".stp", + ".vrml", + ".wrl", + ".7z", + ".aac", + ".ai", + ".avi", + ".bak", + ".bin", + ".bz2", + ".chk", + ".class", + ".csv", + ".d", + ".dat", + ".db", + ".dll", + ".doc", + ".docx", + ".dylib", + ".ear", + ".eps", + ".exe", + ".flac", + ".flv", + ".framework", + ".gdoc", + ".gif", + ".gsheet", + ".gz", + ".img", + ".ipa", + ".iso", + ".jar", + ".jpg", + ".jpeg", + ".lock", + ".log", + ".mov", + ".mp3", + ".mp4", + ".nib", + ".node", + ".o", + ".odg", + ".pack", + ".pdf", + ".png", + ".ppt", + ".pptx", + ".psd", + ".pyc", + ".pyo", + ".pyd", + ".rar", + ".rlib", + ".rmeta", + ".so", + ".sqlite", + ".storyboardc", + ".swp", + ".tar", + ".tml", + ".wav", + ".war", + ".wmv", + ".xcarchive", + ".xlsx", + ".zip", + ".zst", + ] + .into_iter() + .collect() +}); + +/// Lowercased extension including the dot, like `os.path.splitext` (lowered). +pub fn dot_extension(path: &Path) -> String { + let name = path.file_name().and_then(|n| n.to_str()).unwrap_or(""); + // os.path.splitext: leading dots do not start an extension + let trimmed = name.trim_start_matches('.'); + let n_leading = name.len() - trimmed.len(); + match trimmed.rfind('.') { + Some(i) => name[n_leading + i..].to_lowercase(), + None => String::new(), + } +} + +/// Count tokens and lines for raw file bytes. 
+pub fn count_tokens_lines_bytes(bytes: &[u8]) -> TokenLineCount { + let n_lines = bytes.iter().filter(|&&b| b == b'\n').count() as u64; + // count UTF-8 scalar values; invalid bytes count 1 each (lossy) + let mut n_chars: u64 = 0; + let mut rest = bytes; + while !rest.is_empty() { + match std::str::from_utf8(rest) { + Ok(s) => { + n_chars += s.chars().count() as u64; + break; + } + Err(e) => { + let valid = e.valid_up_to(); + n_chars += std::str::from_utf8(&rest[..valid]) + .map(|s| s.chars().count() as u64) + .unwrap_or(0); + let skip = e.error_len().unwrap_or(rest.len() - valid).max(1); + n_chars += 1; // the invalid sequence counts as one char + rest = &rest[valid + skip..]; + } + } + } + TokenLineCount { + n_tokens: n_chars / 4, + n_lines, + } +} + +/// Whether counting is skipped for this path (legacy `extensions_not_to_count`). +pub fn skip_counting(path: &Path) -> bool { + EXTENSIONS_NOT_TO_COUNT.contains(dot_extension(path).as_str()) +} + +/// Count tokens and lines in a file; `None` when counting is skipped. 
+pub fn count_tokens_lines(path: &Path) -> Option<TokenLineCount> { + if path.is_dir() || skip_counting(path) { + return None; + } + let bytes = std::fs::read(path).ok()?; + Some(count_tokens_lines_bytes(&bytes)) +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn wc_style_counts() { + let c = count_tokens_lines_bytes(b"def hello_world():\n    print(\"hello world\")\n"); + // 44 chars // 4 = 11 tokens, 2 newlines + assert_eq!(c.n_lines, 2); + assert_eq!(c.n_tokens, 11); + } + + #[test] + fn empty_is_zero() { + let c = count_tokens_lines_bytes(b""); + assert_eq!(c, TokenLineCount::default()); + } + + #[test] + fn extension_skips() { + assert!(skip_counting(Path::new("x/a.csv"))); + assert!(skip_counting(Path::new("a.SQLITE"))); + assert!(!skip_counting(Path::new("a.py"))); + } + + #[test] + fn dot_extension_rules() { + assert_eq!(dot_extension(Path::new("a.PY")), ".py"); + assert_eq!(dot_extension(Path::new(".env")), ""); + assert_eq!(dot_extension(Path::new(".env.test")), ".test"); + assert_eq!(dot_extension(Path::new("Makefile")), ""); + } +} diff --git a/crates/tree_plus_core/src/extract/data.rs b/crates/tree_plus_core/src/extract/data.rs new file mode 100644 index 0000000..b9fdeb2 --- /dev/null +++ b/crates/tree_plus_core/src/extract/data.rs @@ -0,0 +1,582 @@ +//! Structured data extractors: JSON family, JSONL, YAML, TOML, CSV, SQLite. +//! Output formatting matches the legacy Python f-strings (including Python +//! `str()` / `repr()` rendering of scalars where the legacy code relied on it). + +use serde_json::Value as Json; + +use super::{simple::splitlines, ExtractError, ExtractResult}; + +fn parse_err<E: std::fmt::Display>(e: E) -> ExtractError { + ExtractError::Parse(e.to_string()) +} + +/// Python `str()` of a JSON scalar/value (used inside legacy f-strings). 
+fn py_str(v: &Json) -> String { + match v { + Json::Null => "None".to_string(), + Json::Bool(b) => if *b { "True" } else { "False" }.to_string(), + Json::Number(n) => n.to_string(), + Json::String(s) => s.clone(), + other => py_repr(other), + } +} + +/// Python `repr()` of a JSON value (lists/dicts inside legacy f-strings). +fn py_repr(v: &Json) -> String { + match v { + Json::Null => "None".to_string(), + Json::Bool(b) => if *b { "True" } else { "False" }.to_string(), + Json::Number(n) => n.to_string(), + Json::String(s) => format!("'{}'", s.replace('\\', "\\\\").replace('\'', "\\'")), + Json::Array(items) => { + let inner: Vec<String> = items.iter().map(py_repr).collect(); + format!("[{}]", inner.join(", ")) + } + Json::Object(map) => { + let inner: Vec<String> = map + .iter() + .map(|(k, v)| format!("'{}': {}", k, py_repr(v))) + .collect(); + format!("{{{}}}", inner.join(", ")) + } + } +} + +/// Legacy `parse_jsonl`: first line's keys with Python type names. +pub fn extract_jsonl(content: &str) -> ExtractResult { + let first_line = content.split('\n').next().unwrap_or(""); + let data: Json = serde_json::from_str(first_line).map_err(parse_err)?; + let Json::Object(map) = data else { + return Err(ExtractError::Parse("jsonl root is not an object".into())); + }; + Ok(map + .iter() + .map(|(key, value)| match value { + Json::Array(_) => format!("{key}: list"), + Json::Object(_) => format!("{key}: dict"), + Json::Null => format!("{key}: None"), + Json::Bool(_) => format!("{key}: bool"), + Json::String(_) => format!("{key}: str"), + Json::Number(n) => { + if n.is_f64() { + format!("{key}: float") + } else { + format!("{key}: int") + } + } + }) + .collect()) +} + +/// Legacy `parse_json_schema`: $schema/type/title/description keys. 
+pub fn extract_json_schema(content: &str) -> ExtractResult { + let parsed: Json = serde_json::from_str(content).map_err(parse_err)?; + let mut keepers = Vec::new(); + for key in ["$schema", "type", "title", "description"] { + if let Some(v) = parsed.get(key) { + keepers.push(format!("{key}: {}", py_str(v))); + } + } + Ok(keepers) +} + +/// Legacy `parse_package_json`: name, version, scripts. +pub fn extract_package_json(content: &str) -> ExtractResult { + let parsed: Json = serde_json::from_str(content).map_err(parse_err)?; + let mut keepers = Vec::new(); + keepers.push(match parsed.get("name") { + Some(name) => format!("name: '{}'", py_str(name)), + None => "name: ?".to_string(), + }); + keepers.push(match parsed.get("version") { + Some(version) => format!("version: {}", py_str(version)), + None => "version: ?".to_string(), + }); + if let Some(Json::Object(scripts)) = parsed.get("scripts") { + if !scripts.is_empty() { + keepers.push("scripts:".to_string()); + for (key, value) in scripts { + keepers.push(format!(" {key}: '{}'", py_str(value))); + } + } + } + Ok(keepers) +} + +/// Legacy `parse_json_rpc`. +pub fn extract_json_rpc(content: &str) -> ExtractResult { + let data: Json = serde_json::from_str(content).map_err(parse_err)?; + let get = |k: &str| data.get(k).map(py_str).unwrap_or_else(|| "N/A".to_string()); + let mut components = vec![ + format!("jsonrpc: {}", get("jsonrpc")), + format!("method: {}", get("method")), + "params:".to_string(), + ]; + if let Some(Json::Object(params)) = data.get("params") { + for (param, value) in params { + components.push(format!(" {param}: {}", py_str(value))); + } + } + components.push(format!("id: {}", get("id"))); + Ok(components) +} + +/// Legacy `parse_openrpc_json`. 
+pub fn extract_openrpc_json(content: &str) -> ExtractResult { + let data: Json = serde_json::from_str(content).map_err(parse_err)?; + let str_or = + |v: Option<&Json>, default: &str| v.map(py_str).unwrap_or_else(|| default.to_string()); + let mut components = vec![ + format!("openrpc: {}", str_or(data.get("openrpc"), "N/A")), + "info:".to_string(), + ]; + let info = data.get("info"); + components.push(format!( + " title: {}", + str_or(info.and_then(|i| i.get("title")), "N/A") + )); + components.push(format!( + " version: {}", + str_or(info.and_then(|i| i.get("version")), "N/A") + )); + components.push("methods:".to_string()); + if let Some(Json::Array(methods)) = data.get("methods") { + for method in methods { + let name = str_or(method.get("name"), "None"); + let desc = str_or(method.get("description"), "No description"); + components.push(format!(" {name}: {desc}")); + components.push(" params:".to_string()); + if let Some(Json::Array(params)) = method.get("params") { + for param in params { + let ptype = str_or(param.get("schema").and_then(|s| s.get("type")), "N/A"); + let pname = str_or(param.get("name"), "None"); + components.push(format!(" - {pname}: {ptype}")); + } + } + let result = str_or(method.get("result").and_then(|r| r.get("name")), "N/A"); + let result_desc = str_or( + method.get("result").and_then(|r| r.get("description")), + "No description", + ); + components.push(format!(" result: {result} = {result_desc}")); + } + } + Ok(components) +} + +/// Legacy `parse_csv`: header columns, or a count when there are > 11. 
+pub fn extract_csv(content: &str) -> ExtractResult { + let rows = splitlines(content); + let first = rows + .first() + .ok_or_else(|| ExtractError::Parse("empty csv".into()))?; + let columns: Vec<String> = first.split(',').map(|c| c.to_string()).collect(); + if columns.len() <= 11 { + Ok(columns) + } else { + Ok(vec![format!("{} columns", columns.len())]) + } +} + +// --- YAML --------------------------------------------------------------- + +use serde::Deserialize; +use serde_yaml::Value as Yaml; + +fn yaml_str(v: &Yaml) -> String { + match v { + Yaml::Null => "None".to_string(), + Yaml::Bool(b) => if *b { "True" } else { "False" }.to_string(), + Yaml::Number(n) => n.to_string(), + Yaml::String(s) => s.clone(), + other => format!("{other:?}"), // legacy would have raised; rare + } +} + +fn yaml_get<'a>(v: &'a Yaml, key: &str) -> Option<&'a Yaml> { + v.as_mapping() + .and_then(|m| m.get(Yaml::String(key.to_string()))) +} + +fn yaml_has(v: &Yaml, key: &str) -> bool { + yaml_get(v, key).is_some() +} + +/// Legacy `parse_yml` with category detection. 
+pub fn extract_yml(content: &str) -> ExtractResult { + let mut docs: Vec<Yaml> = Vec::new(); + for doc in serde_yaml::Deserializer::from_str(content) { + let value = Yaml::deserialize(doc).map_err(parse_err)?; + docs.push(value); + } + // python yaml.safe_load_all yields None for empty docs; an empty file + // yields no docs at all + if docs.is_empty() { + return Ok(Vec::new()); + } + let first = &docs[0]; + if yaml_has(first, "apiVersion") && yaml_has(first, "kind") && yaml_has(first, "metadata") { + return Ok(extract_k8s(&docs).unwrap_or_else(unsupported_yaml)); + } + if first + .as_sequence() + .map(|seq| { + seq.iter() + .any(|item| item.as_mapping().is_some_and(|_| yaml_has(item, "name"))) + }) + .unwrap_or(false) + { + return Ok(extract_ansible(&docs).unwrap_or_else(unsupported_yaml)); + } + if yaml_has(first, "name") && yaml_has(first, "jobs") { + return Ok(extract_github(&docs).unwrap_or_else(unsupported_yaml)); + } + if yaml_has(first, "openapi") || yaml_has(first, "swagger") { + return Ok(extract_openapi(&docs).unwrap_or_else(unsupported_yaml)); + } + Ok(vec!["Unsupported YAML Category".to_string()]) +} + +fn unsupported_yaml(_: ()) -> Vec<String> { + vec!["Unsupported YAML Category".to_string()] +} + +fn extract_k8s(docs: &[Yaml]) -> Result<Vec<String>, ()> { + let mut result = Vec::new(); + for yml in docs { + let api = yaml_get(yml, "apiVersion").ok_or(())?; + let kind = yaml_get(yml, "kind").ok_or(())?; + let name = yaml_get(yml, "metadata") + .and_then(|m| yaml_get(m, "name")) + .ok_or(())?; + result.push(format!( + "{}.{} -> {}", + yaml_str(api), + yaml_str(kind), + yaml_str(name) + )); + } + Ok(result) +} + +fn extract_ansible(docs: &[Yaml]) -> Result<Vec<String>, ()> { + let mut result = Vec::new(); + for yml in docs { + let seq = yml.as_sequence().ok_or(())?; + for item in seq { + if item.as_mapping().is_some() { + let name = yaml_get(item, "name").map(yaml_str).unwrap_or_default(); + result.push(name); + } + } + } + Ok(result) +} + +fn extract_github(docs: &[Yaml]) -> Result<Vec<String>, ()> { 
+ let yml = &docs[0]; + let mut result = vec![yaml_str(yaml_get(yml, "name").ok_or(())?)]; + let jobs = yaml_get(yml, "jobs") + .and_then(|j| j.as_mapping()) + .ok_or(())?; + for (job_key, job) in jobs { + result.push(format!(" job: {}", yaml_str(job_key))); + if let Some(steps) = yaml_get(job, "steps").and_then(|s| s.as_sequence()) { + for step in steps { + if let Some(name) = yaml_get(step, "name") { + result.push(format!(" - {}", yaml_str(name))); + } + } + } + } + Ok(result) +} + +fn extract_openapi(docs: &[Yaml]) -> Result<Vec<String>, ()> { + let yml = &docs[0]; + let mut components = Vec::new(); + let openapi = yaml_get(yml, "openapi") + .or_else(|| yaml_get(yml, "swagger")) + .map(yaml_str) + .unwrap_or_else(|| "N/A".to_string()); + components.push(format!("openapi: {openapi}")); + let info = yaml_get(yml, "info"); + let title = info + .and_then(|i| yaml_get(i, "title")) + .map(yaml_str) + .unwrap_or_else(|| "N/A".to_string()); + components.push(format!(" title: {title}")); + let description = info + .and_then(|i| yaml_get(i, "description")) + .map(yaml_str) + .unwrap_or_default(); + let first_sentence = if description.is_empty() { + "N/A".to_string() + } else { + // legacy: regex.split(r"\.\s|\.\n", description, maxsplit=1)[0] + "." + let split_at = description + .char_indices() + .find(|&(i, c)| { + c == '.' + && description[i + 1..] 
+ .chars() + .next() + .is_some_and(|n| n.is_whitespace()) + }) + .map(|(i, _)| i); + match split_at { + Some(i) => format!("{}.", &description[..i]), + None => format!("{description}."), + } + }; + components.push(format!(" description: {first_sentence}")); + let version = info + .and_then(|i| yaml_get(i, "version")) + .map(yaml_str) + .unwrap_or_else(|| "N/A".to_string()); + components.push(format!(" version: {version}")); + if let Some(servers) = yaml_get(yml, "servers").and_then(|s| s.as_sequence()) { + if !servers.is_empty() { + components.push("servers:".to_string()); + for server in servers { + let url = yaml_get(server, "url") + .map(yaml_str) + .unwrap_or_else(|| "N/A".to_string()); + components.push(format!(" - url: {url}")); + } + } + } + if let Some(paths) = yaml_get(yml, "paths").and_then(|p| p.as_mapping()) { + if !paths.is_empty() { + components.push("paths:".to_string()); + for (path, methods) in paths { + components.push(format!(" '{}':", yaml_str(path))); + if let Some(methods) = methods.as_mapping() { + for (method, details) in methods { + let op = yaml_get(details, "operationId") + .map(yaml_str) + .unwrap_or_else(|| "N/A".to_string()); + let summary = yaml_get(details, "summary") + .map(yaml_str) + .unwrap_or_else(|| "N/A".to_string()); + components.push(format!( + " {} ({op}): {summary}", + yaml_str(method).to_uppercase() + )); + } + } + } + } + } + Ok(components) +} + +// --- TOML --------------------------------------------------------------- + +use toml::Value as Toml; + +fn toml_str(v: &Toml) -> String { + match v { + Toml::String(s) => s.clone(), + Toml::Integer(i) => i.to_string(), + Toml::Float(f) => f.to_string(), + Toml::Boolean(b) => if *b { "True" } else { "False" }.to_string(), + other => other.to_string(), + } +} + +/// Legacy `format_dependency`. 
+fn format_dependency(name: &str, details: &Toml) -> String { + match details { + Toml::String(s) => format!(" {name} {s}"), + Toml::Table(t) => { + let version = t.get("version").map(toml_str).unwrap_or_default(); + let features: Vec<String> = t + .get("features") + .and_then(|f| f.as_array()) + .map(|arr| arr.iter().map(toml_str).collect()) + .unwrap_or_default(); + if features.is_empty() { + format!(" {name} {version}") + } else { + format!(" {name} {version} (features: {})", features.join(", ")) + } + } + _ => format!(" {name}"), + } +} + +/// Legacy `parse_cargo_toml`. +pub fn extract_cargo_toml(content: &str) -> ExtractResult { + let data: Toml = content.parse().map_err(parse_err)?; + let mut components = Vec::new(); + if let Some(package) = data.get("package") { + for key in ["name", "version", "description", "license"] { + let value = package + .get(key) + .map(toml_str) + .unwrap_or_else(|| "N/A".to_string()); + components.push(format!("{key}: {value}")); + } + } + if let Some(Toml::Table(deps)) = data.get("dependencies") { + components.push("dependencies:".to_string()); + for (dep, details) in deps { + components.push(format_dependency(dep, details)); + } + } + Ok(components) +} + +/// Legacy `parse_pyproject_toml`. 
+pub fn extract_pyproject_toml(content: &str) -> ExtractResult { + let data: Toml = content.parse().map_err(parse_err)?; + let mut components = Vec::new(); + let project = data.get("project"); + if let Some(project) = project { + for key in ["name", "version", "description"] { + let value = project + .get(key) + .map(toml_str) + .unwrap_or_else(|| "N/A".to_string()); + components.push(format!("{key}: {value}")); + } + if let Some(classifiers) = project.get("classifiers").and_then(|c| c.as_array()) { + for classifier in classifiers { + let c = toml_str(classifier); + if c.contains("License ::") { + components.push(c); + } + } + } + if let Some(deps) = project.get("dependencies").and_then(|d| d.as_array()) { + components.push("dependencies:".to_string()); + for dep in deps { + components.push(format!(" {}", toml_str(dep))); + } + } + } + Ok(components) +} + +// --- SQLite --------------------------------------------------------------- + +/// Legacy `parse_db`: tables and columns from a SQLite database. +#[cfg(feature = "sqlite")] +pub fn extract_sqlite(path: &std::path::Path) -> ExtractResult { + let conn = + rusqlite::Connection::open_with_flags(path, rusqlite::OpenFlags::SQLITE_OPEN_READ_ONLY) + .map_err(parse_err)?; + let mut stmt = conn + .prepare("SELECT name FROM sqlite_master WHERE type='table';") + .map_err(parse_err)?; + let tables: Vec<String> = stmt + .query_map([], |row| row.get::<_, String>(0)) + .map_err(parse_err)?
+ .filter_map(Result::ok) + .collect(); + let mut components = Vec::new(); + for table in tables { + components.push(format!("{table} table:")); + let mut info = conn + .prepare(&format!("PRAGMA table_info({table});")) + .map_err(parse_err)?; + let rows = info + .query_map([], |row| { + Ok(( + row.get::<_, String>(1)?, // name + row.get::<_, String>(2)?, // type + row.get::<_, i64>(3)?, // notnull + row.get::<_, i64>(5)?, // pk + )) + }) + .map_err(parse_err)?; + for row in rows.filter_map(Result::ok) { + let (name, ctype, not_null, pk) = row; + let mut desc = format!(" {name} {}", ctype.to_lowercase()); + if pk == 1 { + desc.push_str(" primary key"); + } + if not_null == 1 && pk != 1 { + desc.push_str(" not null"); + } + components.push(desc); + } + } + Ok(components) +} + +#[cfg(not(feature = "sqlite"))] +pub fn extract_sqlite(_path: &std::path::Path) -> ExtractResult { + Ok(Vec::new()) +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn jsonl_types() { + let line = r#"{"a": 1, "b": "x", "c": [1], "d": {"k": 1}, "e": null, "f": 1.5, "g": true}"#; + assert_eq!( + extract_jsonl(line).unwrap(), + vec!["a: int", "b: str", "c: list", "d: dict", "e: None", "f: float", "g: bool"] + ); + } + + #[test] + fn package_json() { + let content = r#"{"name": "pkg", "version": "1.0.0", "scripts": {"test": "jest"}}"#; + assert_eq!( + extract_package_json(content).unwrap(), + vec![ + "name: 'pkg'", + "version: 1.0.0", + "scripts:", + " test: 'jest'" + ] + ); + } + + #[test] + fn csv_columns() { + assert_eq!(extract_csv("a,b,c\n1,2,3\n").unwrap(), vec!["a", "b", "c"]); + let wide = (0..12).map(|i| i.to_string()).collect::>().join(","); + assert_eq!(extract_csv(&wide).unwrap(), vec!["12 columns"]); + } + + #[test] + fn github_yaml() { + let content = + "name: CI\njobs:\n build:\n steps:\n - name: checkout\n - run: make\n"; + assert_eq!( + extract_yml(content).unwrap(), + vec!["CI", " job: build", " - checkout"] + ); + } + + #[test] + fn unsupported_yaml_category() { + 
assert_eq!( + extract_yml("just: a mapping\n").unwrap(), + vec!["Unsupported YAML Category"] + ); + } + + #[test] + fn cargo_toml() { + let content = "[package]\nname = \"x\"\nversion = \"0.1.0\"\n[dependencies]\nserde = { version = \"1\", features = [\"derive\"] }\nregex = \"1\"\n"; + assert_eq!( + extract_cargo_toml(content).unwrap(), + vec![ + "name: x", + "version: 0.1.0", + "description: N/A", + "license: N/A", + "dependencies:", + " serde 1 (features: derive)", + " regex 1", + ] + ); + } +} diff --git a/crates/tree_plus_core/src/extract/markdown.rs b/crates/tree_plus_core/src/extract/markdown.rs new file mode 100644 index 0000000..2278f25 --- /dev/null +++ b/crates/tree_plus_core/src/extract/markdown.rs @@ -0,0 +1,180 @@ +//! Markdown and plain-text extraction (legacy `parse_md`, `parse_txt`). + +use std::sync::LazyLock; + +use regex::Regex; + +const MARKDOWN_LANGUAGES: &[&str] = &["md", "markdown", "mdx", "mdc"]; + +static TASK_RE: LazyLock = + LazyLock::new(|| Regex::new(r"(-\s*\[ *[xX]?\])\s*(.*)").unwrap()); +static URL_RE: LazyLock = + LazyLock::new(|| Regex::new(r#"\s*\(.+\)|.+"#).unwrap()); +static LINK_RE: LazyLock = LazyLock::new(|| Regex::new(r"\(.*?\)").unwrap()); + +/// Extract headers and tasks from Markdown (legacy `parse_md`). 
+pub fn extract_md(content: &str) -> Vec<String> { + let mut headers_and_tasks: Vec<String> = Vec::new(); + // (task_text, include_flag) of checked ancestors + let mut checked_ancestors: Vec<(String, bool)> = Vec::new(); + let mut code_block_stack: Vec<String> = Vec::new(); + + for line in content.split('\n') { + let line = line.strip_suffix('\r').unwrap_or(line); + let stripped_line = line.trim(); + if let Some(after_fence) = stripped_line.strip_prefix("```") { + if !code_block_stack.is_empty() { + if stripped_line == "```" { + code_block_stack.pop(); + } else { + code_block_stack.push(after_fence.trim().to_string()); + } + } else { + code_block_stack.push(after_fence.trim().to_string()); + } + continue; + } + + let in_opaque_block = code_block_stack + .last() + .map(|lang| !MARKDOWN_LANGUAGES.contains(&lang.as_str())) + .unwrap_or(false); + if in_opaque_block { + continue; + } + + if line.starts_with('#') { + let line = line.trim_start(); + let clean = URL_RE.replace_all(line, ""); + let clean = LINK_RE.replace_all(&clean, ""); + let clean = clean.trim(); + if clean.trim_start_matches('#').trim().is_empty() { + continue; + } + headers_and_tasks.push(clean.to_string()); + } else if TASK_RE.is_match(line.trim_start()) + && TASK_RE + .captures(line.trim_start()) + .map(|c| c.get(0).map(|m| m.start() == 0).unwrap_or(false)) + .unwrap_or(false) + { + let lstripped = line.trim_start(); + let indent_level = line.len() - lstripped.len(); + if let Some(caps) = TASK_RE.captures(lstripped) { + let task_text = caps.get(2).map(|m| m.as_str()).unwrap_or(""); + let is_checked = line.contains("[x]") || line.contains("[X]"); + let task = format!( + "{}{}{}", + " ".repeat(indent_level), + if is_checked { "- [x] " } else { "- [ ] " }, + task_text + ); + checked_ancestors.retain(|a| a.0.len() < task.len()); + if is_checked { + checked_ancestors.push((task, false)); + } else { + for a in checked_ancestors.iter_mut() { + a.1 = true; + } + headers_and_tasks.extend(checked_ancestors.iter().map(|a|
a.0.clone())); + headers_and_tasks.push(task); + } + } + } + } + headers_and_tasks +} + +static CHECKBOX_RE: LazyLock<Regex> = + LazyLock::new(|| Regex::new(r"-\s*\[\s*([^Xx])?\s*\]\s*(.+)").unwrap()); + +/// Extract unchecked checkboxes from plain text (legacy `parse_txt`). +pub fn extract_txt(content: &str) -> Vec<String> { + content + .split('\n') + .filter_map(|line| { + CHECKBOX_RE + .captures(line) + .and_then(|caps| caps.get(2)) + .map(|m| format!("- [ ] {}", m.as_str().trim())) + }) + .collect() +} + +/// Extract RST headings (legacy `parse_rst`): a content line followed by an +/// underline of `=` (heading, `# `) or `-` (subheading, `- `). +pub fn extract_rst(content: &str) -> Vec<String> { + let lines: Vec<&str> = content.split('\n').collect(); + let mut components = Vec::new(); + let mut i = 0; + while i + 1 < lines.len() { + let underline = lines[i + 1].trim_end_matches('\r'); + let is_sub = !underline.is_empty() && underline.chars().all(|c| c == '-'); + let is_head = !underline.is_empty() && underline.chars().all(|c| c == '='); + if is_sub || is_head { + let text = lines[i].trim_end_matches('\r'); + let prefix = if is_head { "# " } else { "- " }; + components.push(format!("{prefix}{text}")); + i += 2; + } else { + i += 1; + } + } + components +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn headers_and_code_blocks() { + let content = "# Title\n```rust\n# not a header\n```\n## Sub\n"; + assert_eq!(extract_md(content), vec!["# Title", "## Sub"]); + } + + #[test] + fn markdown_code_blocks_stay_transparent() { + let content = "# A\n```md\n# Inner\n```\n"; + assert_eq!(extract_md(content), vec!["# A", "# Inner"]); + } + + #[test] + fn checked_ancestors_logic() { + let content = "- [x] done parent\n - [ ] open child\n"; + assert_eq!( + extract_md(content), + vec!["- [x] done parent", " - [ ] open child"] + ); + } + + #[test] + fn checked_without_open_children_hidden() { + let content = "- [x] done alone\n- [ ] open\n"; + assert_eq!(extract_md(content), vec!["- [ ]
open"]); + } + + #[test] + fn header_links_removed() { + assert_eq!( + extract_md("# Hello [link](https://x.y) world\n"), + vec!["# Hello [link] world"] + ); + } + + #[test] + fn txt_checkboxes() { + assert_eq!( + extract_txt("- [ ] do it\n- [x] done\ntext\n"), + vec!["- [ ] do it"] + ); + } + + #[test] + fn rst_headings() { + assert_eq!( + extract_rst("Title\n=====\n\nSub\n---\n"), + vec!["# Title", "- Sub"] + ); + } +} diff --git a/crates/tree_plus_core/src/extract/markers.rs b/crates/tree_plus_core/src/extract/markers.rs new file mode 100644 index 0000000..ed065dc --- /dev/null +++ b/crates/tree_plus_core/src/extract/markers.rs @@ -0,0 +1,60 @@ +//! TODO / BUG / NOTE marker extraction (legacy `parse_markers`). + +use std::sync::LazyLock; + +use regex::Regex; + +static MARKER_RE: LazyLock = LazyLock::new(|| { + Regex::new(r"(?PBUG|TODO|NOTE)(?P ?\([@\w ]+\) ?)?: (?P.*)").unwrap() +}); + +/// Extract `MARK: message` lines from content. +pub fn extract_markers(content: &str) -> Vec { + MARKER_RE + .captures_iter(content) + .filter_map(|caps| { + let mark = caps.name("mark")?.as_str(); + let msg = caps.name("msg")?.as_str(); + if msg.is_empty() { + // legacy: falsy msg ("" only, since trailing newline is excluded) + return None; + } + Some(format!("{}: {}", mark, msg.trim())) + }) + .collect() +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn bug_todo_note() { + let content = "BUG: This is a bug.\nTODO: Fix this soon.\nNOTE: Interesting observation."; + assert_eq!( + extract_markers(content), + vec![ + "BUG: This is a bug.", + "TODO: Fix this soon.", + "NOTE: Interesting observation.", + ] + ); + } + + #[test] + fn mention_is_dropped() { + assert_eq!( + extract_markers("TODO (@bion): hi there"), + vec!["TODO: hi there"] + ); + } + + #[test] + fn empty_msg_skipped() { + // ": " is part of the pattern, so "TODO: \n" has an empty msg + assert!(extract_markers("TODO: \n").is_empty()); + // msg=" " (extra space) is truthy; strip() empties it -> "TODO: " + 
assert_eq!(extract_markers("TODO:  \n"), vec!["TODO: "]); + assert!(extract_markers("TODO:\n").is_empty()); + } +} diff --git a/crates/tree_plus_core/src/extract/mod.rs b/crates/tree_plus_core/src/extract/mod.rs new file mode 100644 index 0000000..c3572ff --- /dev/null +++ b/crates/tree_plus_core/src/extract/mod.rs @@ -0,0 +1,277 @@ +//! Component extraction: dispatch a file to the right extractor. +//! +//! This is the Rust analog of the legacy `parse_file` (renamed: it does not +//! parse files generically, it extracts displayable component labels). +//! Dispatch order and special cases deliberately mirror +//! `tree_plus_src/parse_file.py` because legacy behavior depends on them. + +pub mod data; +pub mod markdown; +pub mod markers; +pub mod simple; +pub mod treesitter; + +use std::path::Path; + +/// Errors during extraction. Legacy behavior maps any extractor error to an +/// empty component list at the engine layer (including the marker pass). +#[derive(Debug, thiserror::Error)] +pub enum ExtractError { + #[error("io: {0}")] + Io(#[from] std::io::Error), + #[error("parse: {0}")] + Parse(String), +} + +pub type ExtractResult = Result<Vec<String>, ExtractError>; + +const BINARY_CHECK_SIZE: usize = 1024; + +/// Legacy `is_binary_string`: any byte outside the text set means binary. +/// Text bytes: {7, 8, 9, 10, 12, 13, 27} | [0x20, 0xFF] - {0x7F}. +pub fn is_binary_bytes(data: &[u8]) -> bool { + data.iter().any(|&b| { + !(matches!(b, 7 | 8 | 9 | 10 | 12 | 13 | 27) || (0x20..=0xFF).contains(&b) && b != 0x7F) + }) +} + +/// Check the first KiB of a file (legacy `is_binary`).
+pub fn is_binary(path: &Path) -> bool { + use std::io::Read; + let Ok(mut f) = std::fs::File::open(path) else { + return false; + }; + let mut buf = [0u8; BINARY_CHECK_SIZE]; + let mut filled = 0; + while filled < buf.len() { + match f.read(&mut buf[filled..]) { + Ok(0) => break, + Ok(n) => filled += n, + Err(_) => return false, + } + } + is_binary_bytes(&buf[..filled]) +} + +/// `os.path.splitext`-compatible split: returns (stem path, lowered ".ext"). +fn splitext_lower(path: &str) -> (String, String) { + let (dir, name) = match path.rfind('/') { + Some(i) => (&path[..=i], &path[i + 1..]), + None => ("", path), + }; + let trimmed = name.trim_start_matches('.'); + let n_leading = name.len() - trimmed.len(); + match trimmed.rfind('.') { + Some(i) => { + let split_at = n_leading + i; + ( + format!("{dir}{}", &name[..split_at]), + name[split_at..].to_lowercase(), + ) + } + None => (path.to_string(), String::new()), + } +} + +/// Read the first `n` lines like the legacy `read_file(n_lines=n)`: +/// lines keep their trailing newline and are joined with an extra `\n`. +/// Returns an empty string when the file has fewer than `n` lines +/// (legacy StopIteration behavior). +fn read_first_lines(content: &str, n: usize) -> String { + let mut lines = Vec::with_capacity(n); + let mut rest = content; + for _ in 0..n { + match rest.find('\n') { + Some(i) => { + lines.push(&rest[..=i]); + rest = &rest[i + 1..]; + } + None => { + if rest.is_empty() { + return String::new(); // StopIteration -> "" + } + lines.push(rest); + rest = ""; + } + } + } + lines.join("\n") +} + +/// JS-family extensions (legacy `JS_EXTENSIONS`). +const JS_EXTENSIONS: &[&str] = &[".js", ".jsx", ".ts", ".tsx"]; +/// C-family extensions (legacy `C_EXTENSIONS`). 
+const C_EXTENSIONS: &[&str] = &[".c", ".cpp", ".cc", ".h", ".cu", ".cuh", ".hpp"]; +const PYTHON_EXTENSIONS: &[&str] = &[".py", ".pyi"]; +const MARKDOWN_EXTENSIONS: &[&str] = &[".md", ".markdown", ".mdx", ".mdc"]; + +/// Extensions recognized by the legacy implementation but deferred in the +/// Rust port version 1. Files still get TODO/BUG/NOTE markers; component +/// extraction is tracked in docs/language-roadmap.md. +const DEFERRED_EXTENSIONS: &[&str] = &[ + ".php", ".kt", ".swift", ".go", ".sh", ".ps1", ".zig", ".rb", ".sql", ".graphql", ".cs", ".jl", + ".scala", ".java", ".pl", ".hs", ".fs", ".lisp", ".clj", ".scm", ".el", ".rkt", ".erl", ".hrl", + ".capnp", ".proto", ".tex", ".lean", ".f", ".for", ".f77", ".f90", ".f95", ".f03", ".f08", + ".tf", ".thy", ".lua", ".tcl", ".m", ".r", ".nb", ".wl", ".matlab", ".ml", ".cbl", ".cobol", + ".apl", ".metal", ".wgsl", ".html", +]; + +/// Whether this extension is deferred (legacy support, no Rust port yet). +pub fn is_deferred_extension(ext: &str) -> bool { + DEFERRED_EXTENSIONS.contains(&ext) +} + +/// Extract displayable component labels from a file. +/// +/// `syntax` mirrors the legacy flag: when false, Rust enum components are +/// rich-markup-escaped exactly like the legacy renderer expected. 
+pub fn extract_components(path: &Path, syntax: bool) -> Vec { + try_extract_components(path, syntax).unwrap_or_default() +} + +fn try_extract_components(path: &Path, syntax: bool) -> ExtractResult { + let path_str = path.to_string_lossy().replace('\\', "/"); + let (base_path, ext) = splitext_lower(&path_str); + let file_name = base_path + .rsplit('/') + .next() + .unwrap_or(&base_path) + .to_string(); + + // sqlite databases are handled before reading the file + if ext == ".db" || ext == ".sqlite" { + return data::extract_sqlite(path); + } + + if is_binary(path) { + return Ok(Vec::new()); + } + + let raw = std::fs::read(path)?; + let full_content = String::from_utf8_lossy(&raw).into_owned(); + // legacy read_file uses errors="strict": undecodable files read as "" + let full_content = if std::str::from_utf8(&raw).is_ok() { + full_content + } else { + String::new() + }; + + // big data files only read a few lines + let content: String = match ext.as_str() { + ".csv" => read_first_lines(&full_content, 3), + ".jsonl" => read_first_lines(&full_content, 2), + _ => full_content, + }; + + let components: Vec = match ext.as_str() { + e if JS_EXTENSIONS.contains(&e) => { + if path_str.ends_with(".d.ts") { + Vec::new() // legacy parse_d_dot_ts returns [] (it never appends) + } else { + let mut components = + treesitter::typescript::extract(&content, e == ".tsx" || e == ".jsx")?; + if path_str.contains("spec.ts") { + components = simple::angular_spec(&content); + } + if path_str.contains("app-routing.module") { + let mut routes = simple::angular_routes(&content); + routes.extend(components); + components = routes; + } else if path_str.contains("app.module") { + let mut module = simple::angular_app_module(&content); + module.extend(components); + components = module; + } else if file_name == "environment" || file_name.contains("environment.") { + components = simple::environment_ts(&content); + } + components + } + } + e if PYTHON_EXTENSIONS.contains(&e) => 
treesitter::python::extract(&content)?, + e if MARKDOWN_EXTENSIONS.contains(&e) => markdown::extract_md(&content), + ".rst" => markdown::extract_rst(&content), + ".json" => { + if path_str.to_lowercase().contains("package.json") { + data::extract_package_json(&content)? + } else if content.contains("$schema") { + data::extract_json_schema(&content)? + } else if content.contains("jsonrpc\": \"2") { + data::extract_json_rpc(&content)? + } else if content.contains("openrpc\": \"") { + data::extract_openrpc_json(&content)? + } else { + Vec::new() + } + } + ".yml" | ".yaml" => data::extract_yml(&content)?, + e if C_EXTENSIONS.contains(&e) => treesitter::c_cpp::extract(&content, e)?, + ".rs" => treesitter::rust::extract(&content, syntax)?, + ".jsonl" => data::extract_jsonl(&content)?, + ".env" => simple::dot_env(&content), + ".txt" => { + if file_name.contains("requirements") { + simple::requirements_txt(&content) + } else { + markdown::extract_txt(&content) + } + } + ".csv" => data::extract_csv(&content)?, + "" if file_name == "Makefile" || file_name == "Justfile" => simple::makefile(&content), + _ if file_name.starts_with(".env") => simple::dot_env(&content), + _ if path_str.ends_with("Cargo.toml") => data::extract_cargo_toml(&content)?, + _ if path_str.ends_with("pyproject.toml") => data::extract_pyproject_toml(&content)?, + _ => Vec::new(), + }; + + // markers skip exactly .txt and .md (legacy quirk: not .markdown/.mdx) + if ext == ".txt" || ext == ".md" { + Ok(components) + } else { + let mut total = markers::extract_markers(&content); + total.extend(components); + Ok(total) + } +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn binary_detection() { + assert!(is_binary_bytes(&[0x00, 0x01])); + assert!(!is_binary_bytes(b"hello world\n")); + assert!(!is_binary_bytes("héllo".as_bytes())); // UTF-8 high bytes are text + assert!(is_binary_bytes(&[0x7F])); + } + + #[test] + fn splitext_examples() { + assert_eq!( + splitext_lower("a/b/file.PY"), + 
("a/b/file".to_string(), ".py".to_string()) + ); + assert_eq!( + splitext_lower(".env.test"), + (".env".to_string(), ".test".to_string()) + ); + assert_eq!( + splitext_lower("dir/.env"), + ("dir/.env".to_string(), String::new()) + ); + assert_eq!( + splitext_lower("Makefile"), + ("Makefile".to_string(), String::new()) + ); + } + + #[test] + fn first_lines_eof_means_empty() { + assert_eq!( + read_first_lines("a,b\nc,d\ne,f\ng\n", 3), + "a,b\n\nc,d\n\ne,f\n" + ); + assert_eq!(read_first_lines("a,b\n", 3), ""); + assert_eq!(read_first_lines("", 2), ""); + } +} diff --git a/crates/tree_plus_core/src/extract/simple.rs b/crates/tree_plus_core/src/extract/simple.rs new file mode 100644 index 0000000..38a00e1 --- /dev/null +++ b/crates/tree_plus_core/src/extract/simple.rs @@ -0,0 +1,216 @@ +//! Line-oriented extractors: .env, requirements.txt, Makefile/Justfile, +//! and the Angular special cases from the legacy TypeScript dispatch. + +use std::sync::LazyLock; + +use regex::Regex; + +static ENV_KEY_RE: LazyLock = LazyLock::new(|| Regex::new(r"([A-Z|_]*)=").unwrap()); + +/// Python-`str.splitlines()`-alike for `\n` / `\r\n` / `\r` (no trailing empty). +pub fn splitlines(content: &str) -> Vec<&str> { + let mut out = Vec::new(); + let mut start = 0; + let bytes = content.as_bytes(); + let mut i = 0; + while i < bytes.len() { + match bytes[i] { + b'\n' => { + out.push(&content[start..i]); + i += 1; + start = i; + } + b'\r' => { + out.push(&content[start..i]); + i += 1; + if i < bytes.len() && bytes[i] == b'\n' { + i += 1; + } + start = i; + } + _ => i += 1, + } + } + if start < bytes.len() { + out.push(&content[start..]); + } + out +} + +/// Legacy `parse_dot_env`: first `KEY=` match per non-comment line. 
+pub fn dot_env(content: &str) -> Vec { + splitlines(content) + .into_iter() + .filter(|line| !line.starts_with('#')) + .filter_map(|line| { + ENV_KEY_RE + .captures(line) + .and_then(|c| c.get(1)) + .map(|m| m.as_str().to_string()) + }) + .collect() +} + +/// Legacy `parse_requirements_txt`: every line not starting with `#`. +pub fn requirements_txt(content: &str) -> Vec { + splitlines(content) + .into_iter() + .filter(|line| !line.starts_with('#')) + .map(|line| line.to_string()) + .collect() +} + +/// Legacy `parse_makefile`: targets, includes, defines, .PHONY lines. +pub fn makefile(content: &str) -> Vec { + content + .split('\n') + .filter(|line| { + let stripped = line.trim(); + !stripped.is_empty() + && !line.starts_with('\t') + && !stripped.starts_with('#') + && (line.starts_with(".PHONY") + || line.contains(':') + || line.starts_with("include") + || line.starts_with("define")) + }) + .map(|line| line.trim().trim_end_matches(':').to_string()) + .collect() +} + +static DESCRIBE_RE: LazyLock = + LazyLock::new(|| Regex::new(r"(\t*| *)describe\('(.*)'").unwrap()); +static IT_RE: LazyLock = + LazyLock::new(|| Regex::new(r#"(\t*| *)it\(('|")(.*)('|")"#).unwrap()); + +/// Legacy `parse_angular_spec`: describe/it lines. +pub fn angular_spec(content: &str) -> Vec { + let mut components = Vec::new(); + for line in content.split('\n') { + if let Some(caps) = DESCRIBE_RE.captures(line) { + components.push(format!( + "{}describe '{}'", + caps.get(1).map_or("", |m| m.as_str()), + caps.get(2).map_or("", |m| m.as_str()) + )); + } else if let Some(caps) = IT_RE.captures(line) { + let statement = caps + .get(3) + .map_or("", |m| m.as_str()) + .replace('\\', "") + .replace('"', "'"); + components.push(format!( + "{}it {}", + caps.get(1).map_or("", |m| m.as_str()), + statement + )); + } + } + components +} + +static ROUTES_RE: LazyLock = + LazyLock::new(|| Regex::new(r"(?s)(const routes: Routes = \[\n.*?\];)").unwrap()); + +/// Legacy `parse_angular_routes`. 
+pub fn angular_routes(content: &str) -> Vec { + ROUTES_RE + .captures(content) + .and_then(|c| c.get(1)) + .map(|m| vec![m.as_str().to_string()]) + .unwrap_or_default() +} + +static NG_MODULE_RE: LazyLock = LazyLock::new(|| { + Regex::new(r"(@NgModule\(\{\n\s*declarations: \[((\n\s*)[a-zA-Z]*,?)*)\n").unwrap() +}); + +/// Legacy `parse_angular_app_module`. +pub fn angular_app_module(content: &str) -> Vec { + NG_MODULE_RE + .captures(content) + .and_then(|c| c.get(1)) + .map(|m| vec![m.as_str().to_string()]) + .unwrap_or_default() +} + +static ENV_TS_KEY_RE: LazyLock = LazyLock::new(|| Regex::new(r"(\w+):").unwrap()); + +/// Legacy `parse_environment_ts`: keys of `export const environment = {...}`. +pub fn environment_ts(content: &str) -> Vec { + let mut parsing = false; + let mut keepers: Vec = Vec::new(); + for line in content.split('\n') { + let stripped = line.trim(); + if stripped.starts_with("export const environment = {") { + parsing = true; + continue; + } + if stripped.starts_with("};") { + break; + } + if stripped.starts_with("//") || stripped.starts_with("/*") || stripped.starts_with('*') { + continue; + } + if parsing { + if let Some(caps) = ENV_TS_KEY_RE.captures(line) { + keepers.push(format!(" {}", caps.get(1).map_or("", |m| m.as_str()))); + } + } + } + if keepers.is_empty() { + keepers + } else { + let mut out = vec!["environment:".to_string()]; + out.extend(keepers); + out + } +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn env_keys() { + assert_eq!( + dot_env("API_KEY=abc\n# comment SECRET=no\nDB_URL=postgres\n"), + vec!["API_KEY", "DB_URL"] + ); + } + + #[test] + fn env_lowercase_key_yields_empty_match() { + // legacy quirk: ([A-Z|_]*) matches empty before '=' + assert_eq!(dot_env("lower=1\n"), vec![""]); + } + + #[test] + fn makefile_targets() { + let content = + "SHELL := /bin/bash\n.PHONY: test\ntest:\n\tpytest\n# comment:\ninclude foo.mk\n"; + assert_eq!( + makefile(content), + vec![ + "SHELL := /bin/bash", + ".PHONY: test", 
+ "test", + "include foo.mk" + ] + ); + } + + #[test] + fn requirements_keeps_blank_lines() { + assert_eq!(requirements_txt("a==1\n# c\n\nb\n"), vec!["a==1", "", "b"]); + } + + #[test] + fn spec_describe_it() { + let content = "describe('AppComponent', () => {\n it('should work', () => {});\n});\n"; + assert_eq!( + angular_spec(content), + vec!["describe 'AppComponent'", " it should work"] + ); + } +} diff --git a/crates/tree_plus_core/src/extract/treesitter/c_cpp.rs b/crates/tree_plus_core/src/extract/treesitter/c_cpp.rs new file mode 100644 index 0000000..58a4839 --- /dev/null +++ b/crates/tree_plus_core/src/extract/treesitter/c_cpp.rs @@ -0,0 +1,591 @@ +//! C and C++ component extraction (tree-sitter). +//! +//! Emits source-text slices in legacy `parse_c` style: +//! - function definitions (signature through initializer list, no body); +//! - struct/class/enum definitions with their fields and enumerators; +//! - access specifiers, typedef struct + `} name;` closers, #define macros, +//! template declarations. +//! +//! Intentional differences from the legacy regex (documented in +//! docs/rust-port-differences.md): control-flow noise (e.g. `while(...)` +//! lines), string contents, and the TensorFlow flag special case are not +//! emitted. + +use std::sync::LazyLock; + +use regex::Regex; +use tree_sitter::Node; + +use super::rust::strip_c_comments; +use super::{parse, ExtractResult}; + +/// Legacy "function" form: `^ *(modifier )?(return_type )(name)\(args\)` with +/// at most two words before the name and no `)` inside the args. +static FUNCTION_FORM_RE: LazyLock = LazyLock::new(|| { + Regex::new(r"^ *(?:[\w:]+ )?[\w:*&]+(?:\s?<[^>]*>\s?)? [\w*&\[\]]+\([^)]*\)$").unwrap() +}); + +/// Legacy "method" form prefix (before the parameter list): indented, +/// optional virtual/static, optional single-word return type, `~?[*\w]+`. 
+static METHOD_PREFIX_RE: LazyLock = + LazyLock::new(|| Regex::new(r"^\s+(?:virtual |static )?(?:\w+ )?~?[*\w]+$").unwrap()); + +/// Extract C/C++ components. `ext` picks the grammar (.c/.h use C unless C++ +/// constructs are likely; .cpp/.cc/.cu/.cuh/.hpp use C++). +pub fn extract(content: &str, ext: &str) -> ExtractResult { + let use_cpp = + matches!(ext, ".cpp" | ".cc" | ".cu" | ".cuh" | ".hpp") || looks_like_cpp(content); + let language: tree_sitter::Language = if use_cpp { + tree_sitter_cpp::LANGUAGE.into() + } else { + tree_sitter_c::LANGUAGE.into() + }; + let tree = parse(content, &language)?; + let mut extractor = CExtractor { + content, + components: Vec::new(), + suppress_plain_fields: 0, + }; + extractor.visit(tree.root_node()); + Ok(extractor.components) +} + +/// Cheap heuristic so C++ headers using .h still parse with the C++ grammar. +fn looks_like_cpp(content: &str) -> bool { + content.contains("class ") + || content.contains("namespace ") + || content.contains("template") + || content.contains("::") +} + +struct CExtractor<'a> { + content: &'a str, + components: Vec, + /// Legacy quirk: a `template<...> class/struct` line reset the regex + /// context, so plain data fields inside template records were skipped. 
+ suppress_plain_fields: usize, +} + +fn line_start(content: &str, byte: usize) -> usize { + content[..byte].rfind('\n').map(|i| i + 1).unwrap_or(0) +} + +impl<'a> CExtractor<'a> { + fn slice_clean(&self, from: usize, to: usize) -> String { + // legacy parse_c ran on comment-stripped content + strip_c_comments(&self.content[from..to]) + .trim_end() + .trim_start_matches('\n') + .to_string() + } + + fn visit(&mut self, node: Node<'a>) { + match node.kind() { + "function_definition" => { + if let Some(body) = node.child_by_field_name("body") { + self.emit_function_definition(node, body); + self.visit(body); + } else if let Some(declarator) = find_function_declarator(node) { + // `= default;` / `= delete;` members have no body node + let start = line_start(self.content, node.start_byte()); + if let Some(component) = + self.method_form_component(start, node, declarator.end_byte()) + { + self.components.push(component); + } + } + } + "ERROR" => { + self.salvage_error_region(node); + } + "struct_specifier" | "class_specifier" | "union_specifier" => { + self.emit_record(node, None); + } + "enum_specifier" => { + self.emit_enum(node); + } + "type_definition" => { + self.emit_type_definition(node); + } + "declaration" => { + self.emit_declaration(node); + } + "field_declaration" => { + self.emit_field(node); + } + "access_specifier" => { + // includes the trailing ':' in the slice + let start = line_start(self.content, node.start_byte()); + let mut end = node.end_byte(); + if self.content[end..].starts_with(':') { + end += 1; + } + self.components.push(self.slice_clean(start, end)); + } + "preproc_def" | "preproc_function_def" => { + // legacy: `#define` + same-line invocation (greedy to the + // last `)` on the first line, excluding `\` continuations) + static DEFINE_RE: LazyLock = + LazyLock::new(|| Regex::new(r"^#define(\s\w+( ?\w* ?\(.*\))?)?").unwrap()); + let first_line = self.content[node.byte_range()] + .split('\n') + .next() + .unwrap_or(""); + if let Some(m) = 
DEFINE_RE.find(first_line) { + self.components.push(m.as_str().trim_end().to_string()); + } + } + "template_declaration" => { + self.emit_template(node); + } + "expression_statement" => { + // legacy pybind noise: ` *\w+.def("...` captured to first `;` + self.maybe_emit_pybind_def(node); + // legacy method-pattern noise: indented `asm(...)` statements + if let Some(child) = node.named_child(0) { + if child.kind().contains("asm") { + let start = line_start(self.content, node.start_byte()); + let indent = &self.content[start..node.start_byte()]; + if !indent.is_empty() && indent.chars().all(char::is_whitespace) { + let text = self.slice_clean(start, node.end_byte()); + self.components + .push(text.trim_end_matches(';').trim_end().to_string()); + } + } + } + let mut cursor = node.walk(); + for child in node.named_children(&mut cursor) { + self.visit(child); + } + } + "while_statement" | "for_statement" | "if_statement" | "switch_statement" => { + self.maybe_emit_control_noise(node); + let mut cursor = node.walk(); + for child in node.named_children(&mut cursor) { + self.visit(child); + } + } + _ => { + let mut cursor = node.walk(); + for child in node.named_children(&mut cursor) { + self.visit(child); + } + } + } + } + + /// struct/class/union with a body: label + fields (+ optional closer). + /// `prefix_start` overrides the slice start (e.g. `typedef `/`static `). 
+ fn emit_record(&mut self, node: Node<'a>, prefix_start: Option) { + let Some(body) = node.child_by_field_name("body") else { + return; // bare type references are not components + }; + let start = prefix_start.unwrap_or_else(|| line_start(self.content, node.start_byte())); + let label = self.slice_clean(start, body.start_byte()); + if !label.is_empty() { + self.components.push(label); + } + self.visit(body); + } + + fn emit_enum(&mut self, node: Node<'a>) { + let Some(body) = node.child_by_field_name("body") else { + return; + }; + let start = line_start(self.content, node.start_byte()); + let label = self.slice_clean(start, body.start_byte()); + if !label.is_empty() { + self.components.push(label); + } + let mut cursor = body.walk(); + for child in body.named_children(&mut cursor) { + if child.kind() == "enumerator" { + let estart = line_start(self.content, child.start_byte()); + let mut eend = child.end_byte(); + if self.content[eend..].starts_with(',') { + eend += 1; + } + let text = strip_c_comments(&self.content[estart..eend]) + .trim_end() + .trim_start_matches('\n') + .replace('\t', " "); + self.components.push(text); + } + } + } + + /// `typedef struct {...} Name;` -> "typedef struct", fields, "} Name;". + fn emit_type_definition(&mut self, node: Node<'a>) { + let Some(type_node) = node.child_by_field_name("type") else { + return; + }; + match type_node.kind() { + "struct_specifier" | "union_specifier" | "class_specifier" => { + if type_node.child_by_field_name("body").is_some() { + self.emit_record(type_node, Some(node.start_byte())); + self.emit_closer(type_node, node); + } + } + "enum_specifier" => { + if type_node.child_by_field_name("body").is_some() { + self.emit_enum(type_node); + self.emit_closer(type_node, node); + } + } + _ => {} + } + } + + /// `static struct config {...} config;` and similar declarations. 
+ fn emit_declaration(&mut self, node: Node<'a>) { + let Some(type_node) = node.child_by_field_name("type") else { + return; + }; + // legacy `other_static`: `^static (struct )?TYPE name([])?(?= =)` + if type_node.child_by_field_name("body").is_none() { + static OTHER_STATIC_RE: LazyLock<Regex> = + LazyLock::new(|| Regex::new(r"^static (?:struct )?\w+ \w+(?:\[\])?").unwrap()); + let text = &self.content[node.byte_range()]; + if self.content[..node.start_byte()].ends_with('\n') { + if let Some(m) = OTHER_STATIC_RE.find(text) { + if text[m.end()..].starts_with(" =") { + self.components.push(m.as_str().to_string()); + return; + } + } + } + } + match type_node.kind() { + "struct_specifier" | "union_specifier" | "class_specifier" => { + if type_node.child_by_field_name("body").is_some() { + let start = line_start(self.content, node.start_byte()); + self.emit_record(type_node, Some(start)); + self.emit_closer(type_node, node); + } + } + "enum_specifier" => { + if type_node.child_by_field_name("body").is_some() { + self.emit_enum(type_node); + self.emit_closer(type_node, node); + } + } + _ => {} + } + } + + /// The legacy `} name;` closing component for named record declarations. + fn emit_closer(&mut self, type_node: Node<'a>, decl: Node<'a>) { + let Some(body) = type_node.child_by_field_name("body") else { + return; + }; + // only when a declarator follows the body on the closing line + if decl.end_byte() <= body.end_byte() { + return; + } + let closer_text = self.content[body.end_byte() - 1..decl.end_byte()].trim_end(); + if closer_text.len() <= 2 { + return; // "};" alone -> legacy emitted nothing + } + let start = line_start(self.content, body.end_byte() - 1); + self.components + .push(self.slice_clean(start, decl.end_byte())); + } + + /// Function definitions must satisfy one of the two legacy forms.
+ fn emit_function_definition(&mut self, node: Node<'a>, body: Node<'a>) { + let start = line_start(self.content, node.start_byte()); + let candidate = self.slice_clean(start, body.start_byte()); + if candidate.is_empty() { + return; + } + // legacy pybind11 module pattern + if candidate.starts_with("PYBIND11_MODULE") { + self.components.push(candidate); + return; + } + // function form: the `{` follows the `)` after exactly one whitespace + if FUNCTION_FORM_RE.is_match(&candidate) + && single_whitespace_before(self.content, body.start_byte()) + { + self.components.push(candidate); + return; + } + // method form: indented members (with const/override/initializers) + if let Some(component) = self.method_form_component(start, node, body.start_byte()) { + self.components.push(component); + } + } + + /// Validate the legacy method form; returns the trimmed component. + fn method_form_component(&self, start: usize, node: Node<'a>, end: usize) -> Option<String> { + // find the parameter list of the declarator + let declarator = find_function_declarator(node)?; + let params = declarator.child_by_field_name("parameters")?; + let prefix = strip_c_comments(&self.content[start..params.start_byte()]); + if !METHOD_PREFIX_RE.is_match(&prefix) { + return None; + } + // legacy args `\(.*\)` were single-line + if self.content[params.byte_range()].contains('\n') { + return None; + } + let text = self.slice_clean(start, end); + Some(text.trim_end_matches(';').trim_end().to_string()) + } + + fn emit_field(&mut self, node: Node<'a>) { + // legacy field patterns require an indented member starting its line + let fstart = line_start(self.content, node.start_byte()); + let indent = &self.content[fstart..node.start_byte()]; + if indent.is_empty() || !indent.chars().all(char::is_whitespace) { + return; + } + // nested records/enums inside the field (anonymous struct members) + if let Some(type_node) = node.child_by_field_name("type") { + if matches!( + type_node.kind(), + "struct_specifier" |
"union_specifier" | "enum_specifier" | "class_specifier" + ) && type_node.child_by_field_name("body").is_some() + { + self.visit(type_node); + self.emit_closer(type_node, node); // e.g. `} inner;` + return; + } + } + if node_has_function_declarator(node) { + // member function declarations follow the legacy method form; + // trailing `= 0` / `= default` / `= delete` were never captured + if let Some(declarator) = find_function_declarator(node) { + if let Some(component) = + self.method_form_component(fstart, node, declarator.end_byte()) + { + self.components.push(component); + } + } + return; + } + if self.suppress_plain_fields > 0 { + return; + } + let start = fstart; + let text = strip_c_comments(&self.content[start..node.end_byte()]) + .trim_end() + .trim_start_matches('\n') + .replace('\t', " "); + let trimmed = text.trim_end_matches(' ').to_string(); + // legacy fields had to terminate with `;` (struct) or `,` (enum) + if !(trimmed.ends_with(';') || trimmed.ends_with(',')) { + return; + } + self.components.push(trimmed); + } + + /// Partial recovery inside ERROR regions: invalid syntax must still + /// yield the obvious components instead of nothing. 
+ fn salvage_error_region(&mut self, node: Node<'a>) { + static RECORD_HEADER_RE: LazyLock<Regex> = + LazyLock::new(|| Regex::new(r"^[ \t]*((?:class|struct) \w+[^\n{;]*)").unwrap()); + let text = &self.content[node.byte_range()]; + if let Some(caps) = RECORD_HEADER_RE.captures(text) { + if let Some(m) = caps.get(1) { + self.components.push(m.as_str().trim_end().to_string()); + } + } + // ordered pass: salvage loose function declarators, recurse the rest + let mut cursor = node.walk(); + for child in node.named_children(&mut cursor) { + if child.kind() == "function_declarator" { + let start = line_start(self.content, child.start_byte()); + let candidate = self.slice_clean(start, child.end_byte()); + if FUNCTION_FORM_RE.is_match(&candidate) { + let after = &self.content[child.end_byte()..]; + let mut chars = after.chars(); + if chars.next().is_some_and(char::is_whitespace) && chars.next() == Some('{') { + self.components.push(candidate); + } + } + } else { + self.visit(child); + } + } + } + + /// Legacy pybind pattern: `^ *\w+\.def\("` ... up to the first `;`. + fn maybe_emit_pybind_def(&mut self, stmt: Node<'a>) { + let start = line_start(self.content, stmt.start_byte()); + let region = &self.content[start..]; + static DEF_RE: LazyLock<Regex> = LazyLock::new(|| Regex::new(r#"^ *\w+\.def\(""#).unwrap()); + if !DEF_RE.is_match(region) { + return; + } + let Some(semi) = region.find(';') else { return }; + let text = strip_c_comments(&region[..semi]) + .trim_end() + .trim_start_matches('\n') + .to_string(); + self.components.push(text); + } + + /// Legacy method-pattern noise: an indented `keyword(...)` with no space + /// before the paren (e.g. `while((ln = listNext(&li)))`) was emitted.
+ fn maybe_emit_control_noise(&mut self, node: Node<'a>) { + let Some(cond) = node + .child_by_field_name("condition") + .or_else(|| node.child_by_field_name("initializer")) + else { + return; + }; + let kw = node.start_byte(); + let start = line_start(self.content, kw); + let indent = &self.content[start..kw]; + if indent.is_empty() || !indent.chars().all(|c| c == ' ' || c == '\t') { + return; + } + // the legacy pattern required no whitespace between keyword and `(` + let kw_end = kw + node.kind().split('_').next().unwrap_or("").len(); + if !self.content[kw_end..].starts_with('(') { + return; + } + if self.content[cond.byte_range()].contains('\n') { + return; + } + // fin: a following `\s{` (compound body) -- the common legacy case + let after = &self.content[cond.end_byte()..]; + if !(after.starts_with(" {") || after.starts_with("\n")) { + return; + } + self.components + .push(self.slice_clean(start, cond.end_byte()).replace('\t', " ")); + } + + fn emit_template(&mut self, node: Node<'a>) { + // slice from `template` through the inner declaration's signature + let start = line_start(self.content, node.start_byte()); + let mut cursor = node.walk(); + for child in node.named_children(&mut cursor) { + match child.kind() { + "function_definition" => { + if let Some(body) = child.child_by_field_name("body") { + let text = self.slice_clean(start, body.start_byte()); + self.components.push(text); + self.visit(body); + } + return; + } + "declaration" => { + // prototype: `template T cos(T);` + // (`= delete` etc. 
were never captured by the legacy regex) + let end = find_function_declarator(child) + .map(|d| d.end_byte()) + .unwrap_or_else(|| child.end_byte()); + let text = self.slice_clean(start, end); + self.components + .push(text.trim_end_matches(';').trim_end().to_string()); + return; + } + "struct_specifier" | "class_specifier" => { + if let Some(body) = child.child_by_field_name("body") { + let text = self.slice_clean(start, body.start_byte()); + self.components.push(text); + self.suppress_plain_fields += 1; + self.visit(body); + self.suppress_plain_fields -= 1; + } else { + let text = self.slice_clean(start, child.end_byte()); + self.components.push(text); + } + return; + } + "alias_declaration" | "type_definition" => { + let text = self.slice_clean(start, child.end_byte()); + self.components + .push(text.trim_end_matches(';').trim_end().to_string()); + return; + } + _ => {} + } + } + } +} + +fn node_has_function_declarator(node: Node<'_>) -> bool { + find_function_declarator(node).is_some() +} + +/// Walk the declarator chain to the `function_declarator`, if any. +fn find_function_declarator(node: Node<'_>) -> Option<Node<'_>> { + let mut current = node.child_by_field_name("declarator"); + while let Some(d) = current { + match d.kind() { + "function_declarator" => return Some(d), + _ => current = d.child_by_field_name("declarator"), + } + } + None +} + +/// Legacy `(?=\s\{)`: exactly one whitespace char before the brace.
+fn single_whitespace_before(content: &str, brace: usize) -> bool { + let before = &content[..brace]; + before + .chars() + .next_back() + .is_some_and(|c| c.is_whitespace()) + && !before.ends_with(" ") + && !before.ends_with("\n\n") + && !before.ends_with(" \n") + && !before.ends_with("\n ") +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn c_struct_and_functions() { + let content = "struct Point {\n int x;\n int y;\n};\n\nstruct Point getOrigin() {\n struct Point o;\n return o;\n}\n"; + assert_eq!( + extract(content, ".c").unwrap(), + vec![ + "struct Point", + " int x;", + " int y;", + "struct Point getOrigin()" + ] + ); + } + + #[test] + fn typedef_struct_with_closer() { + let content = "typedef struct {\n char name[50];\n} Person;\n"; + assert_eq!( + extract(content, ".c").unwrap(), + vec!["typedef struct", " char name[50];", "} Person;"] + ); + } + + #[test] + fn cpp_class_with_access() { + let content = "class Animal {\npublic:\n Animal(const std::string &name) : name(name) {}\n virtual void speak() const {}\nprotected:\n std::string name;\n};\n"; + assert_eq!( + extract(content, ".cpp").unwrap(), + vec![ + "class Animal", + "public:", + " Animal(const std::string &name) : name(name)", + " virtual void speak() const", + "protected:", + " std::string name;", + ] + ); + } + + #[test] + fn enums_with_commas() { + let content = "enum days {\n SUN,\n MON,\n SAT\n};\n"; + assert_eq!( + extract(content, ".c").unwrap(), + vec!["enum days", " SUN,", " MON,", " SAT"] + ); + } +} diff --git a/crates/tree_plus_core/src/extract/treesitter/mod.rs b/crates/tree_plus_core/src/extract/treesitter/mod.rs new file mode 100644 index 0000000..652ad05 --- /dev/null +++ b/crates/tree_plus_core/src/extract/treesitter/mod.rs @@ -0,0 +1,66 @@ +//! Tree-sitter based extractors for the big version-1 languages: +//! Rust, Python, JavaScript/TypeScript, C and C++. +//! +//! Formatters emit source-text signature slices so output matches the legacy +//! 
regex-based components; golden parity tests in tests/ enforce this. + +pub mod c_cpp; +pub mod python; +pub mod rust; +pub mod typescript; + +use std::cell::RefCell; + +use tree_sitter::{Language, Node, Parser, Tree}; + +use super::{ExtractError, ExtractResult}; + +thread_local! { + /// One parser per thread (tree-sitter parser state is mutable). + static PARSER: RefCell<Parser> = RefCell::new(Parser::new()); +} + +/// Parse `content` with `language`, reusing a thread-local parser. +pub fn parse(content: &str, language: &Language) -> Result<Tree, ExtractError> { + PARSER.with(|cell| { + let mut parser = cell.borrow_mut(); + parser + .set_language(language) + .map_err(|e| ExtractError::Parse(e.to_string()))?; + parser + .parse(content, None) + .ok_or_else(|| ExtractError::Parse("tree-sitter returned no tree".to_string())) + }) +} + +/// UTF-8 text of a node (lossy-safe since content is valid UTF-8). +pub fn node_text<'a>(node: Node<'_>, content: &'a str) -> &'a str { + &content[node.byte_range()] +} + +/// Walk all nodes depth-first, calling `visit` on each. +pub fn walk_tree<'t, F: FnMut(Node<'t>)>(tree: &'t Tree, mut visit: F) { + let mut cursor = tree.walk(); + let mut reached_root = false; + while !reached_root { + visit(cursor.node()); + if cursor.goto_first_child() { + continue; + } + loop { + if cursor.goto_next_sibling() { + break; + } + if !cursor.goto_parent() { + reached_root = true; + break; + } + } + } +} + +/// Used by extractors that are still stubs. +#[allow(dead_code)] +pub fn unimplemented_language() -> ExtractResult { + Ok(Vec::new()) +} diff --git a/crates/tree_plus_core/src/extract/treesitter/python.rs b/crates/tree_plus_core/src/extract/treesitter/python.rs new file mode 100644 index 0000000..57c39fe --- /dev/null +++ b/crates/tree_plus_core/src/extract/treesitter/python.rs @@ -0,0 +1,346 @@ +//! Python component extraction (tree-sitter), matching legacy `parse_py`. +//! +//! The legacy extractor was a single multi-pattern regex over +//!
comment-and-docstring-stripped source. The formatter here reproduces its +//! observable quirks deliberately, e.g.: +//! - `async def` is never emitted (the legacy pattern required `^ *def`); +//! - decorators are carried until the next matching def/class and dropped +//! only when consumed; +//! - ALL-CAPS assignments ("enum variants") match anywhere, with int, double +//! quoted string, or multiline constructor values; +//! - annotated fields are emitted only while the most recent component was a +//! class ("class context"), which persists past the class body. + +use std::sync::LazyLock; + +use regex::Regex; +use tree_sitter::Node; + +use super::{node_text, parse, ExtractResult}; + +static COMMENT_RE: LazyLock<Regex> = LazyLock::new(|| Regex::new(r"\s*#.*\n").unwrap()); +static DECORATOR_RE: LazyLock<Regex> = LazyLock::new(|| Regex::new(r"^@\w+(\(.*\))?$").unwrap()); +static RETURN_TYPE_RE: LazyLock<Regex> = + LazyLock::new(|| Regex::new(r#"^[\w"'\[\],. ]+$"#).unwrap()); +static SUPERCLASSES_RE: LazyLock<Regex> = + LazyLock::new(|| Regex::new(r"^\([\w\[\]\s,=.]*\)$").unwrap()); +static VERSION_RE: LazyLock<Regex> = + LazyLock::new(|| Regex::new(r#"^__version__ = ".*""#).unwrap()); +static TYPEVAR_RE: LazyLock<Regex> = + LazyLock::new(|| Regex::new(r"^\w+ = TypeVar\([^)]+\)").unwrap()); +static INT_VALUE_RE: LazyLock<Regex> = LazyLock::new(|| Regex::new(r"^\d+").unwrap()); +static STR_VALUE_RE: LazyLock<Regex> = LazyLock::new(|| Regex::new(r#"^"+[^"]+"+"#).unwrap()); +static STRUCT_CLOSING_RE: LazyLock<Regex> = LazyLock::new(|| Regex::new(r"(?m)^\s*\)$").unwrap()); +static FIELD_RE: LazyLock<Regex> = + LazyLock::new(|| Regex::new(r"^\s+\w+:\s+[\w\[\]|., ]+").unwrap()); +static ALLCAPS_RE: LazyLock<Regex> = LazyLock::new(|| Regex::new(r"^[A-Z_\d]+$").unwrap()); + +/// Remove `# ...` comments the way the legacy extractor did.
+fn strip_comments(slice: &str) -> String { + COMMENT_RE.replace_all(slice, "\n").into_owned() +} + +struct PyExtractor<'a> { + content: &'a str, + components: Vec<String>, + pending_decorators: Vec<String>, + in_class: bool, +} + +/// Extract Python components: defs, classes, decorators, TypeVars, +/// `__version__`, enum variants, dataclass fields. +pub fn extract(content: &str) -> ExtractResult { + let tree = parse(content, &tree_sitter_python::LANGUAGE.into())?; + let mut extractor = PyExtractor { + content, + components: Vec::new(), + pending_decorators: Vec::new(), + in_class: false, + }; + extractor.visit(tree.root_node()); + Ok(extractor.components) +} + +impl<'a> PyExtractor<'a> { + /// Byte offset of the start of the run of spaces before `start` + /// (legacy patterns matched ` *` indentation, spaces only). + fn space_indent_start(&self, start: usize) -> usize { + let bytes = self.content.as_bytes(); + let mut i = start; + while i > 0 && bytes[i - 1] == b' ' { + i -= 1; + } + i + } + + /// Byte offset of the start of `[ \t]*` indentation before `start`.
+ fn ws_indent_start(&self, start: usize) -> usize { + let bytes = self.content.as_bytes(); + let mut i = start; + while i > 0 && (bytes[i - 1] == b' ' || bytes[i - 1] == b'\t') { + i -= 1; + } + i + } + + fn is_line_start(&self, offset: usize) -> bool { + offset == 0 || self.content.as_bytes()[offset - 1] == b'\n' + } + + fn emit_with_decorators(&mut self, component: String) { + if self.pending_decorators.is_empty() { + self.components.push(component); + } else { + let mut joined = self.pending_decorators.join("\n"); + joined.push('\n'); + joined.push_str(&component); + self.pending_decorators.clear(); + self.components.push(joined); + } + } + + fn visit(&mut self, node: Node<'a>) { + match node.kind() { + "decorated_definition" => { + let mut cursor = node.walk(); + for child in node.named_children(&mut cursor) { + if child.kind() == "decorator" { + self.collect_decorator(child); + } else { + self.visit(child); + } + } + } + "function_definition" => self.handle_function(node), + "class_definition" => self.handle_class(node), + "expression_statement" => { + let mut cursor = node.walk(); + for child in node.named_children(&mut cursor) { + if child.kind() == "assignment" { + self.handle_assignment(child); + } + } + } + // expressions cannot contain statements; recursion elsewhere is safe + _ => { + let mut cursor = node.walk(); + for child in node.named_children(&mut cursor) { + self.visit(child); + } + } + } + } + + fn collect_decorator(&mut self, node: Node<'a>) { + // legacy: ` *@\w+(\(.*\))?\n` -- single line, simple name, no dots + let text = node_text(node, self.content); + if node.start_position().row != node.end_position().row { + return; + } + // the decorator must end its line (only whitespace/comment was removed) + let after = &self.content[node.end_byte()..]; + let line_rest = after.split('\n').next().unwrap_or(""); + if !line_rest.trim_start_matches([' ', '\t']).is_empty() + && !line_rest.trim_start().starts_with('#') + { + return; + } + if 
!DECORATOR_RE.is_match(text) { + return; + } + let indent_start = self.space_indent_start(node.start_byte()); + self.pending_decorators + .push(self.content[indent_start..node.end_byte()].to_string()); + } + + fn handle_function(&mut self, node: Node<'a>) { + let body = node.child_by_field_name("body"); + let is_async = node.child(0).map(|c| c.kind() == "async").unwrap_or(false); + if !is_async { + if let Some(component) = self.build_function_signature(node) { + self.emit_with_decorators(component); + self.in_class = false; + } + } + if let Some(body) = body { + self.visit(body); + } + } + + fn build_function_signature(&self, node: Node<'a>) -> Option<String> { + let params = node.child_by_field_name("parameters")?; + // legacy params pattern allowed nesting depth <= 2 inside the parens + let params_text = node_text(params, self.content); + let mut depth: i32 = 0; + let mut max_inner: i32 = 0; + for ch in params_text.chars() { + match ch { + '(' => { + depth += 1; + max_inner = max_inner.max(depth - 1); + } + ')' => depth -= 1, + _ => {} + } + } + if max_inner > 2 { + return None; + } + let end = match node.child_by_field_name("return_type") { + Some(ret) => { + let ret_text = node_text(ret, self.content); + if !RETURN_TYPE_RE.is_match(&strip_comments(ret_text)) { + return None; // legacy pattern would fail entirely + } + ret.end_byte() + } + None => params.end_byte(), + }; + // find the `def` keyword start (node start may equal it) + let def_kw = node + .children(&mut node.walk()) + .find(|c| c.kind() == "def") + .map(|c| c.start_byte()) + .unwrap_or(node.start_byte()); + let start = self.space_indent_start(def_kw); + let slice = &self.content[start..end]; + let cleaned = strip_comments(slice); + Some(cleaned) + } + + fn handle_class(&mut self, node: Node<'a>) { + let body = node.child_by_field_name("body"); + if let Some(component) = self.build_class_signature(node) { + self.emit_with_decorators(component); + self.in_class = true; + } + if let Some(body) = body { +
self.visit(body); + } + } + + fn build_class_signature(&self, node: Node<'a>) -> Option { + let name = node.child_by_field_name("name")?; + let mut end = name.end_byte(); + if let Some(tp) = node.child_by_field_name("type_parameters") { + end = tp.end_byte(); + } + if let Some(supers) = node.child_by_field_name("superclasses") { + let text = node_text(supers, self.content); + if !SUPERCLASSES_RE.is_match(text) { + return None; // legacy charset would fail to match + } + end = supers.end_byte(); + } + let start = self.space_indent_start(node.start_byte()); + Some(self.content[start..end].to_string()) + } + + fn handle_assignment(&mut self, node: Node<'a>) { + let Some(left) = node.child_by_field_name("left") else { + return; + }; + if left.kind() != "identifier" { + return; + } + let left_text = node_text(left, self.content); + let stmt_start = left.start_byte(); + let indent_start = self.ws_indent_start(stmt_start); + let at_col0 = indent_start == stmt_start && self.is_line_start(stmt_start); + let line_region = &self.content[stmt_start..]; + + if at_col0 { + // legacy version/typevar patterns require a preceding newline; + // a statement at byte 0 of the file can never match + if stmt_start == 0 { + return; + } + if left_text == "__version__" { + if let Some(m) = VERSION_RE.find(line_region.split('\n').next().unwrap_or("")) { + self.components.push(m.as_str().to_string()); + self.in_class = false; + } + } else if let Some(m) = TYPEVAR_RE.find(line_region) { + self.components.push(m.as_str().to_string()); + self.in_class = false; + } + return; + } + + // indented assignments: enum variants first (legacy alternation order) + let type_annotation = node.child_by_field_name("type"); + let right = node.child_by_field_name("right"); + if ALLCAPS_RE.is_match(left_text) && type_annotation.is_none() { + if let Some(right) = right { + let value_start = right.start_byte(); + let value_text = &self.content[value_start..]; + // only `NAME = value` with whitespace-separated `=` 
matches + let between = &self.content[left.end_byte()..value_start]; + let eq_ok = { + let trimmed = between.trim(); + trimmed == "=" + && between.starts_with(char::is_whitespace) + && between.ends_with(char::is_whitespace) + }; + if eq_ok { + if let Some(m) = INT_VALUE_RE.find(value_text) { + let end = value_start + m.end(); + self.components + .push(self.content[indent_start..end].to_string()); + return; + } + if let Some(m) = STR_VALUE_RE.find(value_text) { + let end = value_start + m.end(); + self.components + .push(self.content[indent_start..end].to_string()); + return; + } + // multiline constructor value: [A-Z]\w*( ... ^\s+\)$ + if right.kind() == "call" { + let func_ok = right + .child_by_field_name("function") + .map(|f| { + f.kind() == "identifier" + && node_text(f, self.content) + .starts_with(|c: char| c.is_ascii_uppercase()) + }) + .unwrap_or(false); + if func_ok { + let call_region = &self.content[value_start..right.end_byte()]; + if let Some(m) = STRUCT_CLOSING_RE.find(call_region) { + let end = value_start + m.end(); + self.components + .push(self.content[indent_start..end].to_string()); + return; + } + } + } + } + } + } + + // dataclass fields only in class context, with a type annotation + if self.in_class && type_annotation.is_some() { + let line_start = self.ws_indent_start(stmt_start); + if !self.is_line_start(line_start) { + return; + } + let region = &self.content[line_start..]; + let with_indent = format!( + "{}{}", + &self.content[line_start..stmt_start], + region[stmt_start - line_start..] 
+ .split('\n') + .next() + .unwrap_or("") + ); + // legacy: `^\s+\w+:\s+[\w\[\]\|\., ]+` + let padded = format!(" {with_indent}"); // \s+ needs >= 1 char + if let Some(m) = FIELD_RE.find(&padded) { + let captured = &padded[1..m.end()]; + let component = captured.trim_start_matches('\n').trim_end_matches(' '); + self.components.push(component.to_string()); + } + } + } +} diff --git a/crates/tree_plus_core/src/extract/treesitter/rust.rs b/crates/tree_plus_core/src/extract/treesitter/rust.rs new file mode 100644 index 0000000..3befc1d --- /dev/null +++ b/crates/tree_plus_core/src/extract/treesitter/rust.rs @@ -0,0 +1,312 @@ +//! Rust component extraction (tree-sitter), matching legacy `parse_rs`. +//! +//! Tree-sitter locates items (functions, structs, impls, enums, traits, +//! mods, macros); each item is then formatted by a direct port of the legacy +//! per-construct regex applied to the comment-stripped item slice, so the +//! emitted labels keep every legacy quirk (e.g. indented `impl` blocks are +//! skipped, enums require trailing commas on all variants, `pub(crate)` +//! visibility never matches). + +use std::sync::LazyLock; + +use regex::Regex; +use tree_sitter::Node; + +use super::{parse, ExtractResult}; + +/// Port of legacy `remove_c_comments` (note: `/* */` only on one line). +static C_COMMENT_RE: LazyLock = + LazyLock::new(|| Regex::new(r"(?m)(\s)*//.*(\s*,\s*|\s*\))?$|\s*/\*.*?\*/").unwrap()); + +pub fn strip_c_comments(content: &str) -> String { + C_COMMENT_RE.replace_all(content, "").into_owned() +} + +/// Prefix of the legacy function pattern, up to the opening paren. The +/// argument scan (with its `\)(?!:)` lookahead) is emulated manually in +/// `format_fn` because the regex crate has no look-around. 
+static FN_PREFIX_RE: LazyLock<Regex> = LazyLock::new(|| { + Regex::new(r"^\s*(pub\s+?)?((?:async|const)\s+)?fn\s+\w+(<[^>]*?>)?\(").unwrap() +}); + +static FN_END_RE: LazyLock<Regex> = LazyLock::new(|| Regex::new(r";\s|\{").unwrap()); + +/// Legacy fn-arguments character set: `[&\w,.':\[\]()<>${}/\s]`. +fn fn_args_char_ok(c: char) -> bool { + c.is_alphanumeric() + || c == '_' + || c.is_whitespace() + || matches!( + c, + '&' | ',' + | '.' + | '\'' + | ':' + | '[' + | ']' + | '(' + | ')' + | '<' + | '>' + | '$' + | '{' + | '}' + | '/' + ) +} + +static STRUCT_IMPL_RE: LazyLock<Regex> = LazyLock::new(|| { + Regex::new(r"\n(?P<struct_impl>(?: *((?:pub\s+)?struct)|impl)[^\{;]*?) ?[\{;]").unwrap() +}); + +static ENUM_RE: LazyLock<Regex> = LazyLock::new(|| { + Regex::new( + r"(?m)^\s*(?P(?Ppub\s+?)?enum (?P\w+)(?P<[^>]*?>)? \{(?P(?P\s+#\[.*\]+?)?\s+(?P\w+)(?P(?P\([\s\S]+?\))|(?P \{[^}]+\}))?,)*\s\})\s", + ) + .unwrap() +}); + +static TRAIT_MOD_RE: LazyLock<Regex> = LazyLock::new(|| { + Regex::new(r"\n(?P<trait_mod> *(?:pub\s+)?(trait|mod)\s+\w*(<[^\{]*>)?)").unwrap() +}); + +static MACRO_RE: LazyLock<Regex> = LazyLock::new(|| { + Regex::new(r"\n(?P<macro>(#\[macro_export\]\n)?macro_rules!\s+[a-z_][a-z_0-9]*)").unwrap() +}); + +/// Port of rich's `escape` (legacy applied it to enum components containing +/// attribute "decorators" when syntax highlighting was off). +static ESCAPE_RE: LazyLock<Regex> = + LazyLock::new(|| Regex::new(r"(\\*)(\[[a-z#/@][^\[]*?\])").unwrap()); + +pub fn rich_escape(markup: &str) -> String { + let escaped = ESCAPE_RE.replace_all(markup, |caps: &regex::Captures<'_>| { + format!("{0}{0}\\{1}", &caps[1], &caps[2]) + }); + let escaped = escaped.into_owned(); + if escaped.ends_with('\\') && !escaped.ends_with("\\\\") { + format!("{escaped}\\") + } else { + escaped + } +} + +/// Extract Rust components: fns, structs, impls, enums (with variants), +/// traits, mods, macro_rules.
+pub fn extract(content: &str, syntax: bool) -> ExtractResult { + let tree = parse(content, &tree_sitter_rust::LANGUAGE.into())?; + let mut components: Vec<String> = Vec::new(); + visit(tree.root_node(), content, syntax, &mut components); + Ok(components) +} + +fn visit(node: Node<'_>, content: &str, syntax: bool, out: &mut Vec<String>) { + let mut cursor = node.walk(); + for child in node.named_children(&mut cursor) { + match child.kind() { + "function_item" | "function_signature_item" => { + if let Some(c) = format_fn(child, content) { + out.push(c); + } + // nested items inside fn bodies still match + visit(child, content, syntax, out); + } + "struct_item" => { + if let Some(c) = format_struct_impl(child, content, false) { + out.push(c); + } + } + "impl_item" => { + if let Some(c) = format_struct_impl(child, content, true) { + out.push(c); + } + visit(child, content, syntax, out); + } + "enum_item" => { + if let Some(c) = format_enum(child, content, syntax) { + out.push(c); + } + } + "trait_item" => { + if let Some(c) = format_trait_mod(child, content) { + out.push(c); + } + visit(child, content, syntax, out); + } + "mod_item" => { + if let Some(c) = format_trait_mod(child, content) { + out.push(c); + } + visit(child, content, syntax, out); + } + "macro_definition" => { + if let Some(c) = format_macro(child, content) { + out.push(c); + } + } + _ => visit(child, content, syntax, out), + } + } +} + +/// Start of the line containing `byte` (offset just after the previous `\n`).
+fn line_start(content: &str, byte: usize) -> usize { + content[..byte].rfind('\n').map(|i| i + 1).unwrap_or(0) +} + +fn format_fn(node: Node<'_>, content: &str) -> Option<String> { + node.child_by_field_name("parameters")?; + let start = line_start(content, node.start_byte()); + // include one byte past the node so `;\s` can see the newline + let end = (node.end_byte() + 1).min(content.len()); + let mut slice = strip_c_comments(&content[start..end]); + if !slice.ends_with(['\n', ' ', '{']) { + slice.push('\n'); // give `;\s` a whitespace to match at EOF + } + let open = FN_PREFIX_RE.find(&slice)?.end(); // offset just past `(` + // emulate `(?Pcharset+?)?\)(?!:)`: accept the first `)` whose + // preceding argument text fits the charset and which `:` does not follow + let chars: Vec<(usize, char)> = slice[open..].char_indices().collect(); + let mut args_ok = true; + let mut close: Option<usize> = None; // byte offset in `slice` of `)` + for (i, (off, c)) in chars.iter().enumerate() { + if *c == ')' { + let next = chars.get(i + 1).map(|(_, n)| *n); + if next != Some(':') { + close = Some(open + off); + break; + } + // `)` followed by `:` -> keep scanning, `)` stays in arguments + } + if !fn_args_char_ok(*c) { + args_ok = false; + break; + } + } + if !args_ok { + return None; + } + let close = close?; + let end_match = FN_END_RE.find(&slice[close + 1..])?; + let component = &slice[..close + 1 + end_match.start()]; + let component = component + .trim_end() + .trim_end_matches(',') + .trim_end_matches('\n') + .trim_end_matches(';'); + Some(component.trim_start_matches('\n').to_string()) +} + +fn format_struct_impl(node: Node<'_>, content: &str, is_impl: bool) -> Option<String> { + let start = line_start(content, node.start_byte()); + if start == 0 { + return None; // legacy pattern required a preceding newline + } + if is_impl && node.start_byte() != start { + return None; // legacy: impl only matches at column 0 + } + let end = node.end_byte().min(content.len()); + // prepend the newline
the legacy pattern consumed + let slice = format!("\n{}", strip_c_comments(&content[start..end])); + let caps = STRUCT_IMPL_RE.captures(&slice)?; + if caps.get(0)?.start() != 0 { + return None; + } + let component = caps.name("struct_impl")?.as_str().trim_end(); + Some(component.trim_start_matches('\n').to_string()) +} + +fn format_enum(node: Node<'_>, content: &str, syntax: bool) -> Option<String> { + // attributes (e.g. #[derive]) come before the enum keyword inside the + // node? No: attribute_item is a sibling. The enum slice starts at the + // `pub`/`enum` keyword. + let start = line_start(content, node.start_byte()); + let end = (node.end_byte() + 1).min(content.len()); + let mut slice = strip_c_comments(&content[start..end]); + if !slice.ends_with(char::is_whitespace) { + slice.push('\n'); // trailing `\s` in the legacy pattern + } + let caps = ENUM_RE.captures(&slice)?; + if caps.get(0)?.start() != 0 { + return None; + } + let component = caps.name("enum")?.as_str().trim_start_matches('\n'); + let has_decorator = caps.name("maybe_decorator").is_some(); + if !syntax && has_decorator { + Some(rich_escape(component)) + } else { + Some(component.to_string()) + } +} + +fn format_trait_mod(node: Node<'_>, content: &str) -> Option<String> { + let start = line_start(content, node.start_byte()); + if start == 0 { + return None; + } + let end = node.end_byte().min(content.len()); + let slice = format!("\n{}", strip_c_comments(&content[start..end])); + let caps = TRAIT_MOD_RE.captures(&slice)?; + if caps.get(0)?.start() != 0 { + return None; + } + Some( + caps.name("trait_mod")?
+ .as_str() + .trim_start_matches('\n') + .to_string(), + ) +} + +fn format_macro(node: Node<'_>, content: &str) -> Option<String> { + // include a possible #[macro_export] attribute line directly above + let mut start = line_start(content, node.start_byte()); + let prefix = "#[macro_export]\n"; + if start >= prefix.len() && content[..start].ends_with(prefix) { + start -= prefix.len(); + } + if start == 0 { + return None; + } + let end = node.end_byte().min(content.len()); + let slice = format!("\n{}", strip_c_comments(&content[start..end])); + let caps = MACRO_RE.captures(&slice)?; + if caps.get(0)?.start() != 0 { + return None; + } + Some( + caps.name("macro")? + .as_str() + .trim_start_matches('\n') + .to_string(), + ) +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn basic_fn() { + let content = "fn add(x1: i64, x2: i64) -> i64 {\n x1 + x2\n}\n"; + assert_eq!( + extract(content, false).unwrap(), + vec!["fn add(x1: i64, x2: i64) -> i64"] + ); + } + + #[test] + fn impl_and_method() { + let content = "struct Point;\nimpl Point {\n fn get_origin() -> Point {\n Point\n }\n}\n"; + assert_eq!( + extract(content, false).unwrap(), + vec!["impl Point", " fn get_origin() -> Point"] + ); + } + + #[test] + fn escape_matches_rich() { + assert_eq!(rich_escape("#[default]"), "#\\[default]"); + assert_eq!(rich_escape("[Topic]"), "[Topic]"); // uppercase: untouched + } +} diff --git a/crates/tree_plus_core/src/extract/treesitter/typescript.rs b/crates/tree_plus_core/src/extract/treesitter/typescript.rs new file mode 100644 index 0000000..55dd887 --- /dev/null +++ b/crates/tree_plus_core/src/extract/treesitter/typescript.rs @@ -0,0 +1,420 @@ +//! JavaScript/TypeScript component extraction (tree-sitter). +//! +//! Emits source-text signature slices in legacy `parse_ts` style: +//! - classes/interfaces with their heritage clauses; +//! - methods and constructors (signature only, original indentation); +//! - function declarations and expressions (signature only); +//! 
- arrow functions bound to variables or object members, ending at `=>`; +//! - `const name = {` "object scope" lines when the object holds functions; +//! - `type Name` aliases. +//! +//! Known intentional differences from the legacy regex (documented in +//! docs/rust-port-differences.md): regex noise like bare call statements +//! (`super(...)`, `innerFunction("inner")`) and string literals containing +//! the word "function" are no longer emitted. + +use std::sync::LazyLock; + +use regex::Regex; +use tree_sitter::Node; + +use super::rust::strip_c_comments; +use super::{parse, ExtractResult}; + +/// Legacy `parse_ts` function preamble: what may precede the `function` +/// keyword on its line for the match to fire. +static FN_PREAMBLE_RE: LazyLock<Regex> = LazyLock::new(|| { + Regex::new( + r"^ *(export (default )?)?(\(|(const|var|let) \w+ = |return |\w+: )?(async )? ?(\w+\()?$", + ) + .unwrap() +}); + +/// Legacy jsdoc line patterns (applied to a `/** ... */` comment preceding a +/// component; tagged lines keep matching, untagged only before any tag). +static JSDOC_TAGGED_RE: LazyLock<Regex> = LazyLock::new(|| { + Regex::new(r"(?m)^ +\* +@(category|typedefn|sig|param|returns?)( \{\{?[\s\S]*?\}?\})?.*") + .unwrap() +}); +static JSDOC_UNTAGGED_RE: LazyLock<Regex> = + LazyLock::new(|| Regex::new(r"(?m)^ \* \w+.*").unwrap()); + +/// Extract JS/TS components. +pub fn extract(content: &str, tsx: bool) -> ExtractResult { + let language = if tsx { + tree_sitter_typescript::LANGUAGE_TSX + } else { + tree_sitter_typescript::LANGUAGE_TYPESCRIPT + }; + let tree = parse(content, &language.into())?; + let mut extractor = TsExtractor { + content, + components: Vec::new(), + }; + extractor.visit(tree.root_node()); + Ok(extractor.components) +} + +struct TsExtractor<'a> { + content: &'a str, + components: Vec<String>, +} + +fn line_start(content: &str, byte: usize) -> usize { + content[..byte].rfind('\n').map(|i| i + 1).unwrap_or(0) +} + +impl<'a> TsExtractor<'a> { + /// Legacy jsdoc carry: a `/** ... 
*/` comment directly above a component + /// contributes its tagged (and leading untagged) lines. + fn jsdoc_prefix(&self, slice_start: usize) -> Option<String> { + let before = self.content[..slice_start].trim_end(); + if !before.ends_with("*/") { + return None; + } + let open = before.rfind("/**")?; + let comment = &before[open..]; + let mut lines: Vec<&str> = Vec::new(); + let mut seen_tagged = false; + let mut pos = 0; + while pos < comment.len() { + let tagged = JSDOC_TAGGED_RE.find_at(comment, pos); + let untagged = JSDOC_UNTAGGED_RE.find_at(comment, pos); + let pick = match (tagged, untagged) { + (Some(t), Some(u)) => { + if t.start() <= u.start() { + (t, true) + } else { + (u, false) + } + } + (Some(t), None) => (t, true), + (None, Some(u)) => (u, false), + (None, None) => break, + }; + let (m, is_tagged) = pick; + if is_tagged { + seen_tagged = true; + lines.push(m.as_str()); + } else if !seen_tagged { + lines.push(m.as_str()); + } + pos = m.end().max(pos + 1); + } + if lines.is_empty() { + return None; + } + Some(format!("/**\n{}\n */\n", lines.join("\n"))) + } + + fn push_with_jsdoc(&mut self, slice_start: usize, component: String) { + match self.jsdoc_prefix(slice_start) { + Some(prefix) => self.components.push(format!("{prefix}{component}")), + None => self.components.push(component), + } + } + + fn slice(&self, from: usize, to: usize) -> String { + // legacy parse_ts ran on comment-stripped content + strip_c_comments(&self.content[from..to]) + .trim_end() + .to_string() + } + + fn visit(&mut self, node: Node<'a>) { + match node.kind() { + "class_declaration" | "abstract_class_declaration" => { + self.emit_class_like(node); + if let Some(body) = node.child_by_field_name("body") { + self.visit(body); + } + } + "interface_declaration" => { + self.emit_class_like(node); + } + "type_alias_declaration" => { + if let Some(name) = node.child_by_field_name("name") { + let mut label = format!("type {}", &self.content[name.byte_range()]); + if let Some(tp) = 
node.child_by_field_name("type_parameters") { + label.push_str(&self.content[tp.byte_range()]); + } + self.components.push(label); + } + } + "function_declaration" | "generator_function_declaration" => { + self.emit_function(node); + if let Some(body) = node.child_by_field_name("body") { + self.visit(body); + } + } + "function_expression" | "generator_function" => { + if self.function_expression_context(node) { + self.emit_function(node); + } + if let Some(body) = node.child_by_field_name("body") { + self.visit(body); + } + } + "method_definition" | "method_signature" | "abstract_method_signature" => { + self.emit_signature_to_params_or_return(node); + if let Some(body) = node.child_by_field_name("body") { + self.visit(body); + } + } + "arrow_function" => { + if self.arrow_context(node) { + self.emit_arrow(node); + } + if let Some(body) = node.child_by_field_name("body") { + self.visit(body); + } + } + "expression_statement" => { + // legacy "method" noise: indented bare calls like `super(x)` + if let Some(call) = node + .named_child(0) + .filter(|c| matches!(c.kind(), "call_expression")) + { + self.maybe_emit_bare_call(node, call); + } + let mut cursor = node.walk(); + for child in node.named_children(&mut cursor) { + self.visit(child); + } + } + "variable_declarator" => { + // object scope: `const name = {` when the object holds functions + if let Some(value) = node.child_by_field_name("value") { + if value.kind() == "object" && object_holds_functions(value) { + let start = line_start(self.content, node.start_byte()); + // slice through the object's opening brace + let brace = value.start_byte(); + self.components + .push(self.content[start..=brace].trim_end().to_string()); + } + self.visit(value); + } + } + _ => { + let mut cursor = node.walk(); + for child in node.named_children(&mut cursor) { + self.visit(child); + } + } + } + } + + /// class/interface: line start through heritage (extends/implements). 
+ fn emit_class_like(&mut self, node: Node<'a>) { + let Some(name) = node.child_by_field_name("name") else { + return; + }; + let mut end = name.end_byte(); + if let Some(tp) = node.child_by_field_name("type_parameters") { + end = end.max(tp.end_byte()); + } + let mut cursor = node.walk(); + for child in node.children(&mut cursor) { + if matches!( + child.kind(), + "class_heritage" | "extends_clause" | "implements_clause" | "extends_type_clause" + ) { + end = end.max(child.end_byte()); + } + } + let start = line_start(self.content, outer_start(node)); + self.components.push(self.slice(start, end)); + } + + /// Functions: line start through return type or params; legacy stripped + /// leading `(` from IIFE matches. + fn emit_function(&mut self, node: Node<'a>) { + let Some(params) = node.child_by_field_name("parameters") else { + return; + }; + let end = node + .child_by_field_name("return_type") + .map(|r| { + // legacy return-type capture was lazy and stopped before " {" + let text = &self.content[r.byte_range()]; + match text.find(" {") { + Some(i) => r.start_byte() + i, + None => r.end_byte(), + } + }) + .unwrap_or_else(|| params.end_byte()); + let start = line_start(self.content, outer_start(node)); + let text = self.slice(start, end); + self.push_with_jsdoc(start, text.trim_start_matches('(').to_string()); + } + + /// Methods/constructors: indentation + signature through return type. + fn emit_signature_to_params_or_return(&mut self, node: Node<'a>) { + let Some(params) = node.child_by_field_name("parameters") else { + return; + }; + let end = node + .child_by_field_name("return_type") + .map(|r| r.end_byte()) + .unwrap_or_else(|| params.end_byte()); + let start = line_start(self.content, node.start_byte()); + self.components.push(self.slice(start, end)); + } + + /// Legacy method-pattern noise: an indented statement `name(args);` (or + /// `super(args);`) matched the method signature regex and was emitted. 
+ fn maybe_emit_bare_call(&mut self, stmt: Node<'a>, call: Node<'a>) { + let Some(callee) = call.child_by_field_name("function") else { + return; + }; + if !matches!(callee.kind(), "identifier" | "super") { + return; + } + let Some(args) = call.child_by_field_name("arguments") else { + return; + }; + // `\w+\(...` -- no gap between callee and args, single-line args + if args.start_byte() != callee.end_byte() { + return; + } + if self.content[args.byte_range()].contains('\n') { + return; + } + // must start an indented line + let start = line_start(self.content, stmt.start_byte()); + let indent = &self.content[start..callee.start_byte()]; + if indent.is_empty() || !indent.chars().all(|c| c == ' ' || c == '\t') { + return; + } + // fin: `;` or ` {` after the call (legacy pattern requirement) + let after = &self.content[call.end_byte()..]; + if !(after.starts_with(';') || after.starts_with(" {")) { + return; + } + self.components.push(self.slice(start, call.end_byte())); + } + + /// Arrows: line start through the `=>` token. + fn emit_arrow(&mut self, node: Node<'a>) { + let mut arrow_end = None; + let mut cursor = node.walk(); + for child in node.children(&mut cursor) { + if child.kind() == "=>" { + arrow_end = Some(child.end_byte()); + break; + } + } + let Some(end) = arrow_end else { return }; + let start = line_start(self.content, outer_start(node)); + let text = self.slice(start, end); + self.push_with_jsdoc(start, text); + } + + /// Legacy gate for function expressions: the text before the `function` + /// keyword on its line had to match the legacy preamble pattern + /// (optional export/assignment/return/key, optional `name(` wrapper). 
+ fn function_expression_context(&self, node: Node<'a>) -> bool { + let fn_kw = node.start_byte(); + let start = line_start(self.content, fn_kw); + let prefix = &self.content[start..fn_kw]; + FN_PREAMBLE_RE.is_match(prefix) + } + + fn arrow_context(&self, node: Node<'a>) -> bool { + let Some(parent) = node.parent() else { + return false; + }; + matches!( + parent.kind(), + "variable_declarator" | "pair" | "assignment_expression" + ) + } +} + +/// Start byte including an `export ...` wrapper, but excluding decorators +/// (the legacy extractor never captured them). +fn outer_start(node: Node<'_>) -> usize { + // skip the node's own leading decorators + let mut start = node.start_byte(); + let mut cursor = node.walk(); + for child in node.children(&mut cursor) { + if child.kind() != "decorator" { + start = child.start_byte(); + break; + } + } + if let Some(parent) = node.parent() { + if parent.kind() == "export_statement" { + // anchor on the `export` keyword itself, past any decorators + let mut pcursor = parent.walk(); + for child in parent.children(&mut pcursor) { + if child.kind() == "export" { + return child.start_byte(); + } + } + } + } + start +} + +/// Whether an object literal directly contains function/arrow members. 
+fn object_holds_functions(object: Node<'_>) -> bool { + let mut cursor = object.walk(); + for pair in object.named_children(&mut cursor) { + if pair.kind() == "pair" { + if let Some(value) = pair.child_by_field_name("value") { + if matches!( + value.kind(), + "function_expression" | "arrow_function" | "generator_function" + ) { + return true; + } + } + } + if pair.kind() == "method_definition" { + return true; + } + } + false +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn class_with_methods() { + let content = "export class AppComponent {\n constructor(private http: HttpClient) {}\n checkSession() {}\n async goToEvent(event_id: string) {}\n}\n"; + assert_eq!( + extract(content, false).unwrap(), + vec![ + "export class AppComponent", + " constructor(private http: HttpClient)", + " checkSession()", + " async goToEvent(event_id: string)", + ] + ); + } + + #[test] + fn arrows_and_functions() { + let content = "const arrow = (a: String, b: Number) => {\n return a;\n};\nfunction tsFunction() {\n}\n"; + assert_eq!( + extract(content, false).unwrap(), + vec![ + "const arrow = (a: String, b: Number) =>", + "function tsFunction()" + ] + ); + } + + #[test] + fn object_scope() { + let content = + "const myObject = {\n myMethod: function (stuff) {\n return stuff;\n },\n};\n"; + assert_eq!( + extract(content, false).unwrap(), + vec!["const myObject = {", " myMethod: function (stuff)"] + ); + } +} diff --git a/crates/tree_plus_core/src/ignore.rs b/crates/tree_plus_core/src/ignore.rs new file mode 100644 index 0000000..d102eba --- /dev/null +++ b/crates/tree_plus_core/src/ignore.rs @@ -0,0 +1,307 @@ +//! Ignore and glob handling, matching legacy `tree_plus_src/ignore.py`. +//! +//! Legacy semantics: +//! - `should_ignore(path, ignore, globs)` returns true when any *component* +//! of the normalized path matches any ignore pattern via `fnmatch.fnmatch`, +//! or when amortized globs are present and the path is not in the match set. +//! 
- `DEFAULT_IGNORE` is a fixed pattern set; user patterns are unioned with +//! it unless `override` is set. + +use std::collections::HashSet; +use std::path::{Path, PathBuf}; + +use regex::Regex; + +/// Default ignore patterns (legacy `DEFAULT_IGNORE_FROZENSET`). +pub const DEFAULT_IGNORE: &[&str] = &[ + "__init__.py", + "__pycache__", + "._*", + ".angular", + ".cache", + ".coverage", + ".DS_Store", + ".flake8", + ".git", + ".hypothesis", + ".idea", + ".ipynb_checkpoints", + ".pytest_cache", + ".rustc_info.json", + ".vscode", + "*_memmap", + "*:Zone.Identifier", + "*.a", + "*.ai", + "*.bak", + "*.bin", + "*.bz2", + "*.chk", + "*.class", + "*.d", + "*.dat", + "*.dll", + "*.dylib", + "*.ear", + "*.egg-info", + "*.eot", + "*.eps", + "*.flac", + "*.flv", + "*.framework", + "*.img", + "*.ipa", + "*.iso", + "*.jar", + "*.lib", + "*.lock", + "*.log", + "*.nib", + "*.node", + "*.o", + "*.obj", + "*.odg", + "*.pack", + "*.psd", + "*.pyc", + "*.pyd", + "*.pyo", + "*.rar", + "*.rlib", + "*.rmeta", + "*.so", + "*.storyboardc", + "*.swo", + "*.swp", + "*.tar", + "*.tml", + "*.ttf", + "*.war", + "*.woff", + "*.xcarchive", + "*.zip", + "*.zst", + "**/target/debug/**", + "**/tmp/", + "*~", + "babel-webpack", + "build", + "CACHEDIR.TAG", + "Cargo.lock", + "detritus", + "dist", + "env", + "node_modules", + "target", + "venv", +]; + +/// Translate a Python `fnmatch` pattern to an anchored regex string. +/// +/// Supports `*`, `?`, `[seq]`, `[!seq]`; everything else is escaped, matching +/// `fnmatch.translate` semantics closely enough for the ignore pattern set. +pub fn fnmatch_translate(pattern: &str) -> String { + let chars: Vec<char> = pattern.chars().collect(); + let mut out = String::from("(?s)^"); + let mut i = 0; + while i < chars.len() { + let c = chars[i]; + i += 1; + match c { + '*' => out.push_str(".*"), + '?' => out.push('.'), + '[' => { + let mut j = i; + if j < chars.len() && (chars[j] == '!' 
|| chars[j] == ']') { + j += 1; + } + while j < chars.len() && chars[j] != ']' { + j += 1; + } + if j >= chars.len() { + out.push_str("\\["); + } else { + let inner: String = chars[i..j].iter().collect(); + let inner = inner.replace('\\', "\\\\"); + out.push('['); + if let Some(rest) = inner.strip_prefix('!') { + out.push('^'); + out.push_str(rest); + } else if inner.starts_with('^') { + out.push('\\'); + out.push_str(&inner); + } else { + out.push_str(&inner); + } + out.push(']'); + i = j + 1; + } + } + other => { + if "\\.+()|{}^$".contains(other) { + out.push('\\'); + } + out.push(other); + } + } + } + out.push('$'); + out +} + +/// A compiled set of ignore patterns. +#[derive(Debug, Clone)] +pub struct IgnorePatterns { + regexes: Vec<Regex>, +} + +impl IgnorePatterns { + pub fn new<I: IntoIterator<Item = S>, S: AsRef<str>>(patterns: I) -> Self { + let regexes = patterns + .into_iter() + .filter_map(|p| Regex::new(&fnmatch_translate(p.as_ref())).ok()) + .collect(); + IgnorePatterns { regexes } + } + + pub fn is_empty(&self) -> bool { + self.regexes.is_empty() + } + + /// Whether any path component matches any pattern (legacy `should_ignore`). + pub fn matches_path(&self, path: &Path) -> bool { + if self.regexes.is_empty() { + return false; + } + for part in path.iter() { + let Some(part) = part.to_str() else { continue }; + if self.regexes.iter().any(|r| r.is_match(part)) { + return true; + } + } + false + } +} + +/// Build the effective ignore pattern list (legacy `parse_ignore`). +pub fn parse_ignore(user_patterns: &[String], override_ignore: bool) -> Vec<String> { + let mut patterns: Vec<String> = user_patterns.to_vec(); + if !override_ignore { + patterns.extend(DEFAULT_IGNORE.iter().map(|s| s.to_string())); + } + patterns.sort(); + patterns.dedup(); + patterns +} + +/// Pre-resolved glob matches (legacy `AmortizedGlobs`): the set of every +/// matching path plus all ancestor directories. 
+#[derive(Debug, Clone)] +pub struct AmortizedGlobs { + pub matches: HashSet<PathBuf>, +} + +impl AmortizedGlobs { + pub fn contains(&self, path: &Path) -> bool { + self.matches.contains(path) + } +} + +/// Recursively find glob matches under `paths` (legacy `amortize_globs`). +/// +/// Patterns are name globs (like `Path.rglob`), matched against each entry's +/// file name at any depth. Returns `None` when nothing matches. +pub fn amortize_globs(paths: &[PathBuf], globs: &[String]) -> Option<AmortizedGlobs> { + if paths.is_empty() || globs.is_empty() { + return None; + } + let regexes: Vec<Regex> = globs + .iter() + .filter_map(|g| Regex::new(&fnmatch_translate(g)).ok()) + .collect(); + let mut matches: HashSet<PathBuf> = HashSet::new(); + fn walk(dir: &Path, regexes: &[Regex], matches: &mut HashSet<PathBuf>) { + let Ok(entries) = std::fs::read_dir(dir) else { + return; + }; + for entry in entries.flatten() { + let p = entry.path(); + let name_matches = p + .file_name() + .and_then(|n| n.to_str()) + .map(|n| regexes.iter().any(|r| r.is_match(n))) + .unwrap_or(false); + if name_matches && !matches.contains(&p) { + matches.insert(p.clone()); + for parent in p.ancestors().skip(1) { + matches.insert(parent.to_path_buf()); + } + } + if p.is_dir() { + walk(&p, regexes, matches); + } + } + } + for path in paths { + walk(path, &regexes, &mut matches); + } + if matches.is_empty() { + return None; + } + Some(AmortizedGlobs { matches }) +} + +/// Combined skip decision (legacy `should_ignore`). +pub fn should_ignore(path: &Path, ignore: &IgnorePatterns, globs: Option<&AmortizedGlobs>) -> bool { + if let Some(globs) = globs { + if !globs.contains(path) { + return true; + } + } + ignore.matches_path(path) +} + +/// Whether a string looks like a glob (legacy `is_glob`). 
+pub fn is_glob(x: &str) -> bool { + static GLOB_RE: std::sync::LazyLock<Regex> = + std::sync::LazyLock::new(|| Regex::new(r"(\*)?\*|\?|\[(.\-.)+\]").unwrap()); + GLOB_RE.is_match(x) +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn fnmatch_star_and_class() { + let p = IgnorePatterns::new(["*.pyc"]); + assert!(p.matches_path(Path::new("a/b/c.pyc"))); + assert!(!p.matches_path(Path::new("a/b/c.py"))); + let dot = IgnorePatterns::new(["._*"]); + assert!(dot.matches_path(Path::new("x/._hidden"))); + assert!(!dot.matches_path(Path::new("x/h._idden_not_component"))); + } + + #[test] + fn component_matching() { + let p = IgnorePatterns::new(["target"]); + assert!(p.matches_path(Path::new("crates/x/target/debug/foo"))); + assert!(!p.matches_path(Path::new("crates/x/targeted/foo"))); + } + + #[test] + fn is_glob_examples() { + assert!(is_glob("*.rs")); + assert!(is_glob("file?.py")); + assert!(!is_glob("plain.txt")); + } + + #[test] + fn parse_ignore_union() { + let with_default = parse_ignore(&["*.java".to_string()], false); + assert!(with_default.iter().any(|p| p == "*.java")); + assert!(with_default.iter().any(|p| p == ".git")); + let only_user = parse_ignore(&["*.java".to_string()], true); + assert_eq!(only_user, vec!["*.java".to_string()]); + } +} diff --git a/crates/tree_plus_core/src/lib.rs b/crates/tree_plus_core/src/lib.rs new file mode 100644 index 0000000..6b021b9 --- /dev/null +++ b/crates/tree_plus_core/src/lib.rs @@ -0,0 +1,29 @@ +//! tree_plus_core: a `tree` util enhanced with tokens, lines, and components. +//! +//! Rust port of the Python `tree_plus` package (version-1 scope: local +//! filesystem mode). See docs/architecture.md for the design and +//! docs/rust-port-differences.md for intentional differences. 
+ +pub mod config; +pub mod count; +pub mod extract; +pub mod ignore; +pub mod model; +pub mod render; +pub mod sort; +pub mod walk; + +pub use config::TreePlusConfig; +pub use count::{count_tokens_lines, TokenLineCount, TokenizerName}; +pub use extract::extract_components; +pub use ignore::DEFAULT_IGNORE; +pub use model::{Category, TreePlus}; +pub use render::{render_to_string, DEFAULT_WIDTH}; +pub use walk::from_seeds; + +impl TreePlus { + /// Render this tree exactly like the legacy `TreePlus.into_str()`. + pub fn into_str(&self) -> String { + render::render_to_string(self, render::DEFAULT_WIDTH) + } +} diff --git a/crates/tree_plus_core/src/model.rs b/crates/tree_plus_core/src/model.rs new file mode 100644 index 0000000..2e4c10c --- /dev/null +++ b/crates/tree_plus_core/src/model.rs @@ -0,0 +1,125 @@ +//! Core data model: `TreePlus` and `Category`. +//! +//! Mirrors the Python `tree_plus_src.engine.TreePlus` dataclass for the +//! version-1 scope (local filesystem mode: ROOT, GLOB, FOLDER, FILE, +//! COMPONENT categories; URL/TAG are deferred web-mode categories). + +/// Category of a `TreePlus` node. +#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)] +pub enum Category { + Root, + Glob, + Folder, + File, + Component, +} + +/// A node in the tree_plus tree. +/// +/// `components` holds the extracted display labels for FILE nodes +/// (the legacy Python stored these in `subtrees` as `List[str]`). +#[derive(Debug, Clone)] +pub struct TreePlus { + pub category: Category, + pub name: String, + pub line_count: u64, + pub token_count: u64, + /// Child trees (folders/files/globs under folders or root). + pub subtrees: Vec<TreePlus>, + /// Extracted component labels (only for FILE nodes). 
+ pub components: Vec<String>, +} + +impl TreePlus { + pub fn new(category: Category, name: impl Into<String>) -> Self { + TreePlus { + category, + name: name.into(), + line_count: 0, + token_count: 0, + subtrees: Vec::new(), + components: Vec::new(), + } + } + + pub fn is_folder(&self) -> bool { + self.category == Category::Folder + } + + pub fn is_file(&self) -> bool { + self.category == Category::File + } + + /// Total folder count including self (legacy `n_folders`). + pub fn n_folders(&self) -> u64 { + let own = u64::from(self.is_folder()); + self.subtrees.iter().map(TreePlus::n_folders).sum::<u64>() + own + } + + /// Total file count (legacy `n_files`). + pub fn n_files(&self) -> u64 { + let own = u64::from(self.is_file()); + self.subtrees.iter().map(TreePlus::n_files).sum::<u64>() + own + } + + /// Total line count (legacy `n_lines`). + pub fn n_lines(&self) -> u64 { + self.line_count + self.subtrees.iter().map(TreePlus::n_lines).sum::<u64>() + } + + /// Total token count (legacy `n_tokens`). + pub fn n_tokens(&self) -> u64 { + self.token_count + self.subtrees.iter().map(TreePlus::n_tokens).sum::<u64>() + } + + /// Legacy `stats()` string, e.g. `1 folder(s), 6 file(s), 1,234 line(s), 5,678 token(s)`. + pub fn stats(&self) -> String { + format!( + "{} folder(s), {} file(s), {} line(s), {} token(s)", + commafy(self.n_folders()), + commafy(self.n_files()), + commafy(self.n_lines()), + commafy(self.n_tokens()), + ) + } +} + +/// Format an integer with thousands separators like Python's `{:,}`. 
+pub fn commafy(n: u64) -> String { + let digits = n.to_string(); + let mut out = String::with_capacity(digits.len() + digits.len() / 3); + let offset = digits.len() % 3; + for (i, ch) in digits.chars().enumerate() { + if i != 0 && (i + 3 - offset).is_multiple_of(3) { + out.push(','); + } + out.push(ch); + } + out +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn commafy_matches_python_format() { + assert_eq!(commafy(0), "0"); + assert_eq!(commafy(999), "999"); + assert_eq!(commafy(1000), "1,000"); + assert_eq!(commafy(1234567), "1,234,567"); + } + + #[test] + fn counts_roll_up() { + let mut folder = TreePlus::new(Category::Folder, "f"); + let mut file = TreePlus::new(Category::File, "a.py"); + file.line_count = 10; + file.token_count = 100; + folder.subtrees.push(file); + assert_eq!(folder.n_folders(), 1); + assert_eq!(folder.n_files(), 1); + assert_eq!(folder.n_lines(), 10); + assert_eq!(folder.n_tokens(), 100); + } +} diff --git a/crates/tree_plus_core/src/render.rs b/crates/tree_plus_core/src/render.rs new file mode 100644 index 0000000..438276b --- /dev/null +++ b/crates/tree_plus_core/src/render.rs @@ -0,0 +1,347 @@ +//! Deterministic tree rendering byte-compatible with the legacy renderer +//! (rich.tree.Tree printed via a no-color Console at a fixed width). +//! +//! Reimplements the relevant parts of rich: +//! - tree guides: `" "`, `"│ "`, `"├── "`, `"└── "`; +//! - word wrapping (`rich._wrap.divide_line` with fold=True) measured in +//! terminal cells (emoji are 2 cells wide); +//! - trailing-space removal (legacy `remove_trailing_space`). 
+ +use unicode_width::{UnicodeWidthChar, UnicodeWidthStr}; + +use crate::model::{commafy, Category, TreePlus}; + +pub const DEFAULT_WIDTH: usize = 80; + +const FILE_CHAR: &str = "📄"; +const FOLDER_CHAR: &str = "📁"; +const ROOT_CHAR: &str = "🌵"; +const GLOB_CHAR: &str = "🌀"; + +fn plural(n: u64) -> &'static str { + if n == 1 { + "" + } else { + "s" + } +} + +/// Build the display label for a node (legacy `_into_rich_tree`). +pub fn node_label(node: &TreePlus) -> String { + match node.category { + Category::File => { + let tokens = node.n_tokens(); + let lines = node.n_lines(); + format!( + "{FILE_CHAR} {} ({} token{}, {} line{})", + node.name, + commafy(tokens), + plural(tokens), + commafy(lines), + plural(lines), + ) + } + Category::Folder => { + let folders = node.n_folders(); + let files = node.n_files(); + format!( + "{FOLDER_CHAR} {} ({} folder{}, {} file{})", + node.name, + commafy(folders), + plural(folders), + commafy(files), + plural(files), + ) + } + Category::Root => { + let mut label = format!("{ROOT_CHAR} {}", node.name); + if node.n_tokens() > 0 && node.n_lines() > 0 { + let folders = node.n_folders(); + let files = node.n_files(); + label.push_str(&format!( + " ({} folder{}, {} file{})", + commafy(folders), + plural(folders), + commafy(files), + plural(files), + )); + } + label + } + Category::Glob => { + let n = node.subtrees.len() as u64; + format!( + "{GLOB_CHAR} {} ({} match{})", + node.name, + commafy(n), + if n == 1 { "" } else { "es" }, + ) + } + Category::Component => node.name.clone(), + } +} + +/// Iterate "words" like rich's `re.compile(r"\s*\S+\s*")`. 
+fn words(text: &str) -> Vec<(usize, String)> { + let chars: Vec<char> = text.chars().collect(); + let mut out = Vec::new(); + let mut pos = 0; + while pos < chars.len() { + let start = pos; + let mut i = pos; + while i < chars.len() && chars[i].is_whitespace() { + i += 1; + } + if i >= chars.len() { + break; // trailing whitespace w/o a word: no match + } + while i < chars.len() && !chars[i].is_whitespace() { + i += 1; + } + while i < chars.len() && chars[i].is_whitespace() { + i += 1; + } + out.push((start, chars[start..i].iter().collect())); + pos = i; + } + out +} + +fn cell_len(s: &str) -> usize { + UnicodeWidthStr::width(s) +} + +/// Split text into chunks of at most `width` cells (rich `chop_cells`). +fn chop_cells(text: &str, width: usize) -> Vec<String> { + let mut lines: Vec<String> = vec![String::new()]; + let mut total_width = 0; + for ch in text.chars() { + let w = UnicodeWidthChar::width(ch).unwrap_or(0); + if total_width + w > width { + lines.push(ch.to_string()); + total_width = w; + } else { + lines.last_mut().unwrap().push(ch); + total_width += w; + } + } + lines +} + +/// Port of `rich._wrap.divide_line` (fold=True). Returns char offsets. 
+fn divide_line(text: &str, width: usize) -> Vec<usize> { + let mut break_positions: Vec<usize> = Vec::new(); + let mut cell_offset = 0usize; + for (start, word) in words(text) { + let word_length = cell_len(word.trim_end()); + let remaining_space = width.saturating_sub(cell_offset); + if remaining_space >= word_length { + cell_offset += cell_len(&word); + } else if word_length > width { + // fold the word across multiple lines + let folded = chop_cells(&word, width); + let mut start = start; + let n = folded.len(); + for (i, line) in folded.iter().enumerate() { + if start > 0 { + break_positions.push(start); + } + if i + 1 == n { + cell_offset = cell_len(line); + } else { + start += line.chars().count(); + } + } + } else if cell_offset > 0 && start > 0 { + break_positions.push(start); + cell_offset = cell_len(&word); + } + } + break_positions +} + +/// Wrap one logical line into physical lines at `width` cells. +fn wrap_line(line: &str, width: usize) -> Vec<String> { + let width = width.max(1); + let breaks = divide_line(line, width); + if breaks.is_empty() { + return vec![line.to_string()]; + } + let chars: Vec<char> = line.chars().collect(); + let mut out = Vec::new(); + let mut prev = 0; + for &b in &breaks { + out.push(chars[prev..b].iter().collect()); + prev = b; + } + out.push(chars[prev..].iter().collect()); + out +} + +/// Expand tabs to 8-cell stops like rich's `Text.expand_tabs`. +fn expand_tabs(line: &str, tab_size: usize) -> String { + if !line.contains('\t') { + return line.to_string(); + } + let mut out = String::with_capacity(line.len()); + let mut col = 0; + for ch in line.chars() { + if ch == '\t' { + let pad = tab_size - (col % tab_size); + out.extend(std::iter::repeat_n(' ', pad)); + col += pad; + } else { + out.push(ch); + col += UnicodeWidthChar::width(ch).unwrap_or(0); + } + } + out +} + +/// Wrap a (possibly multiline) label into physical lines. 
+fn wrap_label(label: &str, width: usize) -> Vec<String> { + let mut out = Vec::new(); + for logical in label.split('\n') { + out.extend(wrap_line(&expand_tabs(logical, 8), width)); + } + out +} + +struct Renderer { + width: usize, + out: String, +} + +impl Renderer { + fn emit(&mut self, prefix_first: &str, prefix_rest: &str, label: &str) { + let avail = self.width.saturating_sub(cell_len(prefix_first)).max(1); + for (i, line) in wrap_label(label, avail).into_iter().enumerate() { + let prefix = if i == 0 { prefix_first } else { prefix_rest }; + let mut rendered = format!("{prefix}{line}"); + while rendered.ends_with(' ') { + rendered.pop(); + } + self.out.push_str(&rendered); + self.out.push('\n'); + } + } + + fn render_node(&mut self, node: &TreePlus, ancestors: &str, is_root: bool, is_last: bool) { + let (first, rest) = if is_root { + (String::new(), String::new()) + } else { + let fork = if is_last { "└── " } else { "├── " }; + let cont = if is_last { " " } else { "│ " }; + (format!("{ancestors}{fork}"), format!("{ancestors}{cont}")) + }; + self.emit(&first, &rest, &node_label(node)); + + let child_ancestors = if is_root { + String::new() + } else if is_last { + format!("{ancestors} ") + } else { + format!("{ancestors}│ ") + }; + + if node.category == Category::File { + let n = node.components.len(); + for (i, component) in node.components.iter().enumerate() { + let last = i + 1 == n; + let fork = if last { "└── " } else { "├── " }; + let cont = if last { " " } else { "│ " }; + self.emit( + &format!("{child_ancestors}{fork}"), + &format!("{child_ancestors}{cont}"), + component, + ); + } + } else { + let n = node.subtrees.len(); + for (i, sub) in node.subtrees.iter().enumerate() { + self.render_node(sub, &child_ancestors, false, i + 1 == n); + } + } + } +} + +/// Render a tree to a string identical to legacy `TreePlus.into_str()`. 
+pub fn render_to_string(root: &TreePlus, width: usize) -> String { + let mut r = Renderer { + width, + out: String::new(), + }; + r.render_node(root, "", true, true); + r.out +} + +#[cfg(test)] +mod tests { + use super::*; + use crate::model::{Category, TreePlus}; + + #[test] + fn single_file_render() { + let mut file = TreePlus::new(Category::File, "file.py"); + file.token_count = 19; + file.line_count = 3; + file.components.push("def hello_world()".to_string()); + let s = render_to_string(&file, DEFAULT_WIDTH); + assert_eq!( + s, + "📄 file.py (19 tokens, 3 lines)\n└── def hello_world()\n" + ); + } + + #[test] + fn wrap_matches_rich_probe() { + // mirrors a probe of the legacy renderer + let mut root = TreePlus::new(Category::Component, "root"); + root.category = Category::Folder; // unused; emit manually below + let long = "function julia_is_awesome(prob::DiffEqBase.AbstractDAEProblem{uType, duType, tType, isinplace};"; + let mut r = Renderer { + width: 80, + out: String::new(), + }; + r.emit("├── ", "│ ", long); + assert_eq!( + r.out, + "├── function julia_is_awesome(prob::DiffEqBase.AbstractDAEProblem{uType, duType,\n│ tType, isinplace};\n" + ); + } + + #[test] + fn fold_long_word() { + let mut r = Renderer { + width: 80, + out: String::new(), + }; + let label = format!("word {} tail", "x".repeat(100)); + r.emit("├── ", "│ ", &label); + let expected = format!( + "├── word\n│ {}\n│ {} tail\n", + "x".repeat(76), + "x".repeat(24) + ); + assert_eq!(r.out, expected); + } + + #[test] + fn multiline_label_continuation() { + let mut folder = TreePlus::new(Category::Folder, "f"); + let mut file = TreePlus::new(Category::File, "a.py"); + file.components + .push(" @staticmethod\n def m()".to_string()); + file.components.push("class X".to_string()); + folder.subtrees.push(file); + let s = render_to_string(&folder, DEFAULT_WIDTH); + let expected = "\ +📁 f (1 folder, 1 file) +└── 📄 a.py (0 tokens, 0 lines) + ├── @staticmethod + │ def m() + └── class X +"; + assert_eq!(s, 
expected); + } +} diff --git a/crates/tree_plus_core/src/sort.rs b/crates/tree_plus_core/src/sort.rs new file mode 100644 index 0000000..b150c96 --- /dev/null +++ b/crates/tree_plus_core/src/sort.rs @@ -0,0 +1,214 @@ +//! Natural path sorting compatible with `natsort.os_sorted` (POSIX, no PyICU). +//! +//! Empirically verified key structure (natsort 8.4.0): +//! - a path is split into components (by `/`); +//! - the final component is split into (base, suffix...) where at most the +//! last TWO suffixes are split off, a suffix must be `.`-prefixed with +//! total length <= 5 (e.g. `.json` yes, `.jsonl` no), and a purely numeric +//! suffix (e.g. `.12345`) stops suffix splitting entirely; +//! - every piece is chunked into alternating (string, number, string, ...) +//! starting with a (possibly empty) string chunk; digit runs become +//! unsigned integers; text chunks compare case-insensitively (casefold). +//! +//! Ties (identical keys, e.g. names differing only by case) are broken by the +//! raw string to keep the order deterministic; the Python implementation +//! falls back to filesystem enumeration order here (documented difference). + +use std::cmp::Ordering; + +/// One chunk of a natural-sort key piece. +#[derive(Debug, Clone, PartialEq, Eq)] +pub enum Chunk { + Text(String), + Num(u128), +} + +impl Chunk { + fn cmp_chunk(&self, other: &Chunk) -> Ordering { + match (self, other) { + (Chunk::Text(a), Chunk::Text(b)) => a.cmp(b), + (Chunk::Num(a), Chunk::Num(b)) => a.cmp(b), + // alternation makes mixed comparisons rare; numbers first matches + // natsort's empty-string-prefix convention + (Chunk::Num(_), Chunk::Text(_)) => Ordering::Less, + (Chunk::Text(_), Chunk::Num(_)) => Ordering::Greater, + } + } +} + +/// Natural sort key for one path-like string. 
+#[derive(Debug, Clone, PartialEq, Eq)] +pub struct OsSortKey { + pieces: Vec<Vec<Chunk>>, + raw: String, +} + +impl PartialOrd for OsSortKey { + fn partial_cmp(&self, other: &Self) -> Option<Ordering> { + Some(self.cmp(other)) + } +} + +impl Ord for OsSortKey { + fn cmp(&self, other: &Self) -> Ordering { + let n = self.pieces.len().min(other.pieces.len()); + for i in 0..n { + let a = &self.pieces[i]; + let b = &other.pieces[i]; + let m = a.len().min(b.len()); + for j in 0..m { + match a[j].cmp_chunk(&b[j]) { + Ordering::Equal => {} + non_eq => return non_eq, + } + } + match a.len().cmp(&b.len()) { + Ordering::Equal => {} + non_eq => return non_eq, + } + } + match self.pieces.len().cmp(&other.pieces.len()) { + Ordering::Equal => self.raw.cmp(&other.raw), + non_eq => non_eq, + } + } +} + +/// Chunk a string into alternating text/number chunks, starting with text. +fn chunk(s: &str) -> Vec<Chunk> { + let mut chunks = Vec::new(); + let mut text = String::new(); + let mut num: Option<u128> = None; + for ch in s.chars() { + if ch.is_ascii_digit() { + if num.is_none() { + // flush text chunk (natsort keys always start with a string) + chunks.push(Chunk::Text(std::mem::take(&mut text).to_lowercase())); + num = Some(0); + } + let d = u128::from(ch as u8 - b'0'); + num = Some(num.unwrap().saturating_mul(10).saturating_add(d)); + } else { + if let Some(n) = num.take() { + chunks.push(Chunk::Num(n)); + } + text.push(ch); + } + } + if let Some(n) = num { + chunks.push(Chunk::Num(n)); + } else { + chunks.push(Chunk::Text(text.to_lowercase())); + } + chunks +} + +/// Split the last path component into base + up to two short suffixes.
+fn split_suffixes(name: &str) -> Vec<String> { + // collect candidate suffixes from the right + let mut parts: Vec<String> = Vec::new(); + let mut base = name.to_string(); + for _ in 0..2 { + // a hidden-file leading dot is not a suffix boundary + let search_region = &base[1.min(base.len())..]; + let Some(dot_rel) = search_region.rfind('.') else { + break; + }; + let dot = dot_rel + 1; + let suffix = &base[dot..]; + // suffix includes the dot; must be <= 5 chars and not purely numeric + if suffix.len() > 5 || suffix.len() < 2 { + break; + } + if suffix[1..].chars().all(|c| c.is_ascii_digit()) { + break; + } + parts.push(suffix.to_string()); + base.truncate(dot); + if base.ends_with('.') { + base.pop(); + } + } + let mut out = vec![base]; + out.extend(parts.into_iter().rev()); + out +} + +/// Build the natural sort key for a path-like string. +pub fn os_sort_key(path: &str) -> OsSortKey { + let mut pieces: Vec<Vec<Chunk>> = Vec::new(); + let components: Vec<&str> = path.split('/').filter(|c| !c.is_empty()).collect(); + for (i, comp) in components.iter().enumerate() { + if i + 1 == components.len() { + for piece in split_suffixes(comp) { + pieces.push(chunk(&piece)); + } + } else { + pieces.push(chunk(comp)); + } + } + if pieces.is_empty() { + pieces.push(chunk(path)); + } + OsSortKey { + pieces, + raw: path.to_string(), + } +} + +/// Sort strings like `natsort.os_sorted`.
+pub fn os_sorted<T: AsRef<str>>(items: &mut [T]) { + items.sort_by_key(|a| os_sort_key(a.as_ref())); +} + +#[cfg(test)] +mod tests { + use super::*; + + fn sorted_strs(mut v: Vec<&str>) -> Vec<&str> { + v.sort_by_key(|a| os_sort_key(a)); + v + } + + #[test] + fn numeric_runs_sort_naturally() { + assert_eq!( + sorted_strs(vec!["file10.py", "file2.py"]), + vec!["file2.py", "file10.py"] + ); + } + + #[test] + fn case_insensitive() { + assert_eq!( + sorted_strs(vec!["CUSTOMER-INVOICE.CBL", "addamt.cobol"]), + vec!["addamt.cobol", "CUSTOMER-INVOICE.CBL"] + ); + assert_eq!( + sorted_strs(vec!["LuaTest.lua", "lesson.cbl", "KotlinTest.kt"]), + vec!["KotlinTest.kt", "lesson.cbl", "LuaTest.lua"] + ); + } + + #[test] + fn suffix_split_rules() { + // ".py" suffix split means "file.py" < "file2.py" (prefix tuple wins) + assert_eq!( + sorted_strs(vec!["file2.py", "file.py"]), + vec!["file.py", "file2.py"] + ); + // long suffixes are not split: a.json < a.jsonl + assert_eq!( + sorted_strs(vec!["a.jsonl", "a.json"]), + vec!["a.json", "a.jsonl"] + ); + } + + #[test] + fn dotfiles_sort_before_letters() { + assert_eq!( + sorted_strs(vec!["claude.md", ".github", ".env.test"]), + vec![".env.test", ".github", "claude.md"] + ); + } +} diff --git a/crates/tree_plus_core/src/walk.rs b/crates/tree_plus_core/src/walk.rs new file mode 100644 index 0000000..c519409 --- /dev/null +++ b/crates/tree_plus_core/src/walk.rs @@ -0,0 +1,260 @@ +//! Tree construction: the Rust analog of `tree_plus_src/engine.py` +//! `from_seeds` / `_map_seeds` / `_from_folder` / `_from_file` / `_from_glob`.
+ +use std::path::{Path, PathBuf}; + +use rayon::prelude::*; + +use crate::config::TreePlusConfig; +use crate::count::count_tokens_lines; +use crate::extract::extract_components; +use crate::ignore::{ + amortize_globs, is_glob, parse_ignore, should_ignore, AmortizedGlobs, IgnorePatterns, +}; +use crate::model::{Category, TreePlus}; +use crate::sort::os_sort_key; + +/// Build a `TreePlus` from seed path/glob strings (legacy `from_seeds`). +pub fn from_seeds(seeds: &[String], config: &TreePlusConfig) -> TreePlus { + let mut seeds: Vec<String> = if seeds.is_empty() { + vec![std::env::current_dir() + .map(|p| p.to_string_lossy().into_owned()) + .unwrap_or_else(|_| ".".to_string())] + } else { + seeds.to_vec() + }; + // legacy dedupes via set(); order is later fixed by sorting + seeds.sort(); + seeds.dedup(); + + let ignore = IgnorePatterns::new(parse_ignore(&config.ignore, config.override_ignore)); + + // categorize seeds (legacy `_map_seeds`) + let mut folder_paths: Vec<PathBuf> = Vec::new(); + let mut file_paths: Vec<PathBuf> = Vec::new(); + let mut glob_seed_patterns: Vec<String> = Vec::new(); + let mut filter_globs: Vec<String> = config.globs.clone(); + for seed in &seeds { + let path = Path::new(seed); + if path.is_file() { + file_paths.push(path.to_path_buf()); + } else if path.is_dir() { + folder_paths.push(path.to_path_buf()); + } else if is_glob(seed) { + if seed.starts_with("*.") { + // legacy: bare `*.ext` seeds become filter globs + filter_globs.push(seed.clone()); + } else { + glob_seed_patterns.push(seed.clone()); + } + } + // non-existent non-glob seeds are skipped with a warning in legacy + } + + // amortize filter globs (legacy) + let globs: Option<AmortizedGlobs> = if !filter_globs.is_empty() { + let glob_roots: Vec<PathBuf> = if folder_paths.is_empty() { + vec![std::env::current_dir().unwrap_or_else(|_| PathBuf::from("."))] + } else { + folder_paths.clone() + }; + let mut amortized = amortize_globs(&glob_roots, &filter_globs); + if let Some(a) = amortized.as_mut() { + // directly-provided files dodge the glob filter
+ for f in &file_paths { + a.matches.insert(f.clone()); + } + } + amortized + } else { + None + }; + if !filter_globs.is_empty() && folder_paths.is_empty() && glob_seed_patterns.is_empty() { + // legacy: globs with no folder seeds scan the cwd + folder_paths.push(std::env::current_dir().unwrap_or_else(|_| PathBuf::from("."))); + } + + // sort all seeds together by natural path order + enum Seed { + Folder(PathBuf), + File(PathBuf), + GlobPattern(String), + } + let mut parsed: Vec<(String, Seed)> = Vec::new(); + for p in folder_paths { + parsed.push((p.to_string_lossy().into_owned(), Seed::Folder(p))); + } + for p in file_paths { + parsed.push((p.to_string_lossy().into_owned(), Seed::File(p))); + } + for g in glob_seed_patterns { + parsed.push((g.clone(), Seed::GlobPattern(g))); + } + parsed.sort_by_key(|a| os_sort_key(&a.0)); + + let forest: Vec<TreePlus> = parsed + .into_iter() + .map(|(_, seed)| match seed { + Seed::Folder(p) => from_folder(&p, &ignore, globs.as_ref(), config), + Seed::File(p) => from_file(&p, config), + Seed::GlobPattern(g) => from_glob_pattern(&g, &ignore, config), + }) + .collect(); + + match forest.len() { + 0 => TreePlus::new(Category::Root, "No match"), + 1 => forest.into_iter().next().unwrap(), + _ => { + let mut root = TreePlus::new(Category::Root, "Root"); + root.subtrees = forest; + root + } + } +} + +/// Build a tree from a folder (legacy `_from_folder`).
+pub fn from_folder( + folder_path: &Path, + ignore: &IgnorePatterns, + globs: Option<&AmortizedGlobs>, + config: &TreePlusConfig, +) -> TreePlus { + // resolve() fixes the `Path(".").name == ""` case + let name = folder_path + .canonicalize() + .ok() + .and_then(|p| p.file_name().map(|n| n.to_string_lossy().into_owned())) + .unwrap_or_else(|| folder_path.to_string_lossy().into_owned()); + let mut node = TreePlus::new(Category::Folder, name); + + let mut entries: Vec<PathBuf> = std::fs::read_dir(folder_path) + .map(|rd| rd.flatten().map(|e| e.path()).collect()) + .unwrap_or_default(); + entries.sort_by_key(|a| os_sort_key(&a.to_string_lossy())); + + // partition first so files can be processed in parallel deterministically + let kept: Vec<(PathBuf, bool)> = entries + .into_iter() + .filter(|p| !should_ignore(p, ignore, globs)) + .filter_map(|p| { + let is_dir = p.is_dir(); + let is_file = p.is_file(); + if is_dir || is_file { + Some((p, is_dir)) + } else { + None + } + }) + .collect(); + + let subtrees: Vec<TreePlus> = kept + .into_par_iter() + .map(|(p, is_dir)| { + if is_dir { + from_folder(&p, ignore, globs, config) + } else { + from_file(&p, config) + } + }) + .collect(); + node.subtrees = subtrees; + node +} + +/// Build a tree from a file (legacy `_from_file`). +pub fn from_file(file_path: &Path, config: &TreePlusConfig) -> TreePlus { + let counts = count_tokens_lines(file_path).unwrap_or_default(); + let name = file_path + .file_name() + .map(|n| n.to_string_lossy().into_owned()) + .unwrap_or_else(|| file_path.to_string_lossy().into_owned()); + let mut node = TreePlus::new(Category::File, name); + node.token_count = counts.n_tokens; + node.line_count = counts.n_lines; + if !config.concise && counts.n_tokens <= config.max_tokens { + node.components = extract_components(file_path, config.syntax); + } + node +} + +/// Build a tree from a glob-pattern seed like `tree_plus_src/*.py` +/// (legacy `_from_glob`, which uses non-recursive `Path().glob(pattern)`).
+pub fn from_glob_pattern( + pattern: &str, + ignore: &IgnorePatterns, + config: &TreePlusConfig, +) -> TreePlus { + let mut node = TreePlus::new(Category::Glob, pattern); + let mut matches = glob_relative(Path::new(""), pattern); + matches.sort_by_key(|a| os_sort_key(&a.to_string_lossy())); + for m in matches { + if m.is_dir() { + node.subtrees.push(from_folder(&m, ignore, None, config)); + } else if m.is_file() { + node.subtrees.push(from_file(&m, config)); + } + } + node +} + +/// Minimal `pathlib.Path.glob` for relative patterns: segments are fnmatch +/// globs, `**` matches any number of directories. +fn glob_relative(base: &Path, pattern: &str) -> Vec<PathBuf> { + use crate::ignore::fnmatch_translate; + use regex::Regex; + + let segments: Vec<&str> = pattern.split('/').filter(|s| !s.is_empty()).collect(); + let mut current: Vec<PathBuf> = vec![if base.as_os_str().is_empty() { + PathBuf::from(".") + } else { + base.to_path_buf() + }]; + for (i, segment) in segments.iter().enumerate() { + let last = i + 1 == segments.len(); + let mut next: Vec<PathBuf> = Vec::new(); + if *segment == "**" { + for dir in &current { + next.push(dir.clone()); + collect_dirs_recursive(dir, &mut next); + } + } else { + let Ok(re) = Regex::new(&fnmatch_translate(segment)) else { + continue; + }; + for dir in &current { + let Ok(entries) = std::fs::read_dir(dir) else { + continue; + }; + for entry in entries.flatten() { + let name = entry.file_name(); + let Some(name) = name.to_str() else { continue }; + if re.is_match(name) { + let p = entry.path(); + if last || p.is_dir() { + next.push(p); + } + } + } + } + } + current = next; + } + // strip the leading "./" to mirror Path().glob output + current + .into_iter() + .map(|p| p.strip_prefix("./").map(Path::to_path_buf).unwrap_or(p)) + .collect() +} + +fn collect_dirs_recursive(dir: &Path, out: &mut Vec<PathBuf>) { + let Ok(entries) = std::fs::read_dir(dir) else { + return; + }; + for entry in entries.flatten() { + let p = entry.path(); + if p.is_dir() { + out.push(p.clone()); +
collect_dirs_recursive(&p, out); + } + } +} diff --git a/crates/tree_plus_core/tests/golden_parity.rs b/crates/tree_plus_core/tests/golden_parity.rs new file mode 100644 index 0000000..468aac0 --- /dev/null +++ b/crates/tree_plus_core/tests/golden_parity.rs @@ -0,0 +1,243 @@ +//! Golden parity tests against the legacy Python implementation. +//! +//! Goldens are generated by `python tests/golden/generate_legacy_goldens.py` +//! at the repository root and stored under tests/golden/legacy/. + +use std::collections::BTreeMap; +use std::path::{Path, PathBuf}; + +use tree_plus_core::{extract_components, from_seeds, TreePlusConfig}; + +fn repo_root() -> PathBuf { + Path::new(env!("CARGO_MANIFEST_DIR")) + .join("../..") + .canonicalize() + .expect("repo root") +} + +/// Extensions and file names covered by the version-1 scope. +fn in_v1_scope(path: &str) -> bool { + let name = path.rsplit('/').next().unwrap_or(path); + let lower = name.to_lowercase(); + if name == "Makefile" || name == "Justfile" || name.starts_with(".env") { + return true; + } + if name.ends_with("Cargo.toml") || name.ends_with("pyproject.toml") { + return true; + } + const V1_EXTS: &[&str] = &[ + ".py", + ".pyi", + ".rs", + ".js", + ".jsx", + ".ts", + ".tsx", + ".c", + ".h", + ".cc", + ".cpp", + ".cu", + ".cuh", + ".hpp", + ".md", + ".markdown", + ".mdx", + ".mdc", + ".json", + ".jsonl", + ".yml", + ".yaml", + ".csv", + ".txt", + ".db", + ".sqlite", + ]; + V1_EXTS.iter().any(|e| lower.ends_with(e)) +} + +/// Intentional differences from the legacy output, documented in +/// docs/rust-port-differences.md. The expectations are adjusted precisely so +/// any other drift still fails. +fn apply_intentional_differences(rel: &str, expected: Vec<String>) -> Vec<String> { + match rel { + // The TensorFlow flag special case was deliberately dropped; the + // remaining C components must still match exactly.
+ "tests/more_languages/group6/tensorflow_flags.h" => expected + .into_iter() + .filter(|c| { + !(c.starts_with("Flag('") + || c.starts_with("TF_DECLARE_FLAG('") + || c.starts_with("TF_PY_DECLARE_FLAG('")) + }) + .collect(), + // The legacy regex mistook a string literal containing the word + // "function" for a function definition; tree-sitter does not. + "tests/more_languages/group1/test.ts" => expected + .into_iter() + .filter(|c| c != " return(\"Standalone function with parameters\")") + .collect(), + _ => expected, + } +} + +fn parse_golden(path: &Path) -> (String, Vec<String>) { + let raw = std::fs::read_to_string(path).expect("read golden"); + let value: serde_json::Value = serde_json::from_str(&raw).expect("parse golden json"); + let rel = value["path"].as_str().expect("path").to_string(); + let components = value["components"] + .as_array() + .expect("components") + .iter() + .map(|v| v.as_str().expect("component str").to_string()) + .collect(); + (rel, components) +} + +#[test] +fn components_match_legacy_goldens() { + let root = repo_root(); + let goldens_dir = root.join("tests/golden/legacy/components"); + let mut failures: BTreeMap<String, (Vec<String>, Vec<String>)> = BTreeMap::new(); + let mut checked = 0; + let mut entries: Vec<PathBuf> = std::fs::read_dir(&goldens_dir) + .expect("goldens generated?
run python tests/golden/generate_legacy_goldens.py") + .flatten() + .map(|e| e.path()) + .collect(); + entries.sort(); + for golden_path in entries { + let (rel, expected) = parse_golden(&golden_path); + if !in_v1_scope(&rel) { + continue; + } + let expected = apply_intentional_differences(&rel, expected); + let actual = extract_components(&root.join(&rel), false); + checked += 1; + if actual != expected { + failures.insert(rel, (expected, actual)); + } + } + assert!( + checked > 50, + "expected to check many fixtures, got {checked}" + ); + if !failures.is_empty() { + let mut report = format!( + "{} of {} v1-scope fixtures differ from legacy goldens:\n", + failures.len(), + checked + ); + for (rel, (expected, actual)) in &failures { + report.push_str(&format!( + "--- {rel}\n expected ({}): {:?}\n actual ({}): {:?}\n", + expected.len(), + expected, + actual.len(), + actual + )); + } + panic!("{report}"); + } +} + +/// chdir is process-global; serialize the tree tests. +static CWD_LOCK: std::sync::Mutex<()> = std::sync::Mutex::new(()); + +fn tree_parity(name: &str, seeds: &[&str]) { + let root = repo_root(); + // trees_v1: legacy renders with non-v1 extractors stubbed (deferred + // languages keep their counts and markers but lose components) + let golden = + std::fs::read_to_string(root.join(format!("tests/golden/legacy/trees_v1/{name}.txt"))) + .expect("tree golden"); + // legacy goldens were produced with cwd at the repo root + let _guard = CWD_LOCK + .lock() + .unwrap_or_else(std::sync::PoisonError::into_inner); + let original = std::env::current_dir().expect("cwd"); + std::env::set_current_dir(&root).expect("chdir to repo root"); + let seeds: Vec<String> = seeds.iter().map(|s| s.to_string()).collect(); + let config = TreePlusConfig::default(); + let tree = from_seeds(&seeds, &config); + let rendered = tree.into_str(); + std::env::set_current_dir(original).expect("chdir back"); + assert_eq!( + rendered, golden, + "tree render for {name} differs from legacy golden" + );
+} + +#[test] +fn tree_path_to_test_matches() { + tree_parity("path_to_test", &["tests/path_to_test"]); +} + +#[test] +fn tree_dot_dot_matches() { + tree_parity("dot_dot", &["tests/dot_dot"]); +} + +#[test] +fn tree_multi_seed_matches() { + tree_parity( + "multi_seed", + &["tests/path_to_test", "tests/more_languages/group1"], + ); +} + +#[test] +fn tree_group4_matches() { + tree_parity("more_languages_group4", &["tests/more_languages/group4"]); +} + +#[test] +fn tree_group6_matches() { + tree_parity("more_languages_group6", &["tests/more_languages/group6"]); +} + +#[test] +fn tree_more_languages_matches() { + tree_parity("more_languages", &["tests/more_languages"]); +} + +#[test] +fn tree_group1_matches() { + tree_parity("more_languages_group1", &["tests/more_languages/group1"]); +} + +#[test] +fn tree_group2_matches() { + tree_parity("more_languages_group2", &["tests/more_languages/group2"]); +} + +#[test] +fn tree_group3_matches() { + tree_parity("more_languages_group3", &["tests/more_languages/group3"]); +} + +#[test] +fn tree_group5_matches() { + tree_parity("more_languages_group5", &["tests/more_languages/group5"]); +} + +#[test] +fn tree_group7_matches() { + tree_parity("more_languages_group7", &["tests/more_languages/group7"]); +} + +#[test] +fn tree_group_todo_matches() { + tree_parity( + "more_languages_group_todo", + &["tests/more_languages/group_todo"], + ); +} + +#[test] +fn tree_group_lisp_matches() { + tree_parity( + "more_languages_group_lisp", + &["tests/more_languages/group_lisp"], + ); +} diff --git a/crates/tree_plus_core/tests/robustness.rs b/crates/tree_plus_core/tests/robustness.rs new file mode 100644 index 0000000..79bb957 --- /dev/null +++ b/crates/tree_plus_core/tests/robustness.rs @@ -0,0 +1,86 @@ +//! Robustness: arbitrary bytes must never panic, hang, or render +//! nondeterministically. (Deterministic pseudo-random corpus; no fuzz deps.) 
+ +use std::io::Write; +use std::path::PathBuf; + +use tree_plus_core::extract_components; + +/// xorshift64* — deterministic byte stream without external crates. +struct Rng(u64); +impl Rng { + fn next_u64(&mut self) -> u64 { + let mut x = self.0; + x ^= x >> 12; + x ^= x << 25; + x ^= x >> 27; + self.0 = x; + x.wrapping_mul(0x2545F4914F6CDD1D) + } + fn fill(&mut self, buf: &mut [u8]) { + for chunk in buf.chunks_mut(8) { + let bytes = self.next_u64().to_le_bytes(); + let n = chunk.len(); + chunk.copy_from_slice(&bytes[..n]); + } + } +} + +const EXTENSIONS: &[&str] = &[ + "py", "rs", "ts", "tsx", "js", "c", "cpp", "h", "md", "json", "jsonl", "yml", "toml", "csv", + "txt", "env", "rst", +]; + +fn write_temp(name: &str, bytes: &[u8]) -> PathBuf { + let dir = std::env::temp_dir().join("tree_plus_robustness"); + std::fs::create_dir_all(&dir).unwrap(); + let path = dir.join(name); + let mut f = std::fs::File::create(&path).unwrap(); + f.write_all(bytes).unwrap(); + path +} + +#[test] +fn arbitrary_bytes_never_panic_and_are_deterministic() { + let mut rng = Rng(0x5EED_CAFE_F00D_BEEF); + for round in 0..40 { + let size = (rng.next_u64() % 4096) as usize + 1; + let mut bytes = vec![0u8; size]; + rng.fill(&mut bytes); + for ext in EXTENSIONS { + let path = write_temp(&format!("garbage_{round}.{ext}"), &bytes); + let first = extract_components(&path, false); + let second = extract_components(&path, false); + assert_eq!(first, second, "nondeterministic output for .{ext}"); + } + } +} + +#[test] +fn truncated_real_sources_never_panic() { + let root = PathBuf::from(env!("CARGO_MANIFEST_DIR")).join("../.."); + let sources = [ + "tests/more_languages/group4/rust_test.rs", + "tests/path_to_test/class_method_type.py", + "tests/more_languages/group1/test.ts", + "tests/more_languages/group3/cpp_test.cpp", + "tests/more_languages/group2/test.csv", + ]; + for rel in sources { + let bytes = std::fs::read(root.join(rel)).unwrap(); + let ext = rel.rsplit('.').next().unwrap(); + // cut at 
byte boundaries, including mid-UTF-8 and mid-token + for cut in [1, 7, bytes.len() / 3, bytes.len() / 2, bytes.len() - 1] { + let cut = cut.min(bytes.len()); + let path = write_temp(&format!("truncated_{cut}.{ext}"), &bytes[..cut]); + let _ = extract_components(&path, false); // must not panic + } + } +} + +#[test] +fn invalid_utf8_yields_no_components() { + // legacy read_file used strict UTF-8 and returned "" on decode errors + let path = write_temp("invalid.py", &[0xC3, 0x28, b'\n', b'd', b'e', b'f', b' ']); + assert!(extract_components(&path, false).is_empty()); +} From a26c45d3281b5b3200d3a4f97d93af8961c379be Mon Sep 17 00:00:00 2001 From: Bion Howard Date: Tue, 9 Jun 2026 15:48:03 -0400 Subject: [PATCH 4/8] docs: architecture, port differences, language roadmap, performance Plus a Justfile with install (tprs), test, lint, goldens, and bench recipes. Co-Authored-By: Claude Fable 5 --- Justfile | 23 +++++++ docs/architecture.md | 113 ++++++++++++++++++++++++++++++++++ docs/language-roadmap.md | 64 +++++++++++++++++++ docs/performance.md | 64 +++++++++++++++++++ docs/rust-port-differences.md | 78 +++++++++++++++++++++++ 5 files changed, 342 insertions(+) create mode 100644 Justfile create mode 100644 docs/architecture.md create mode 100644 docs/language-roadmap.md create mode 100644 docs/performance.md create mode 100644 docs/rust-port-differences.md diff --git a/Justfile b/Justfile new file mode 100644 index 0000000..797b3fe --- /dev/null +++ b/Justfile @@ -0,0 +1,23 @@ +# tree_plus Rust port + +# install the `tprs` alias binary (avoids PATH collision with Python tree_plus) +install: + cargo install --path crates/tree_plus_cli --bin tprs + +# install both binaries: `tree_plus` and `tprs` +install-all: + cargo install --path crates/tree_plus_cli + +test: + cargo test --workspace --all-features + +lint: + cargo fmt --all -- --check + cargo clippy --all-targets --all-features -- -D warnings + +# regenerate legacy goldens from the Python implementation +goldens: + 
python tests/golden/generate_legacy_goldens.py + +bench: + cargo bench -p tree_plus_core diff --git a/docs/architecture.md b/docs/architecture.md new file mode 100644 index 0000000..00f5351 --- /dev/null +++ b/docs/architecture.md @@ -0,0 +1,113 @@ +# tree_plus Rust Port — Architecture + +## Layout + +``` +crates/tree_plus_core # library + src/model.rs # TreePlus, Category, counts + src/config.rs # TreePlusConfig (CLI flags map here) + src/walk.rs # traversal + tree construction (legacy engine.py) + src/ignore.rs # fnmatch-compatible ignore + amortized globs + src/sort.rs # natural path sort (natsort.os_sorted parity) + src/count.rs # wc-style token/line counting + src/render.rs # rich-Tree-compatible deterministic renderer + src/extract/mod.rs # dispatch (the legacy parse_file, renamed) + src/extract/markers.rs # TODO/BUG/NOTE markers + src/extract/markdown.rs # Markdown, RST, txt checkboxes + src/extract/simple.rs # .env, requirements, Makefile, Angular helpers + src/extract/data.rs # JSON family, JSONL, YAML, TOML, CSV, SQLite + src/extract/treesitter/ # Rust, Python, JS/TS, C/C++ extractors + tests/golden_parity.rs # byte-level parity vs legacy Python goldens + benches/tree_plus_bench.rs +crates/tree_plus_cli # clap binary named `tree_plus` +tests/golden/ # golden generator + legacy goldens +``` + +## Naming correction + +The legacy `parse_file` name was misleading: it never parsed files +generically; it extracted displayable component labels. The Rust analog is +`extract::extract_components(path, syntax)`. + +## Why Tree-sitter for the major languages + +Rust, Python, JavaScript/TypeScript, C and C++ use tree-sitter grammars: + +- real parsing handles strings, comments, and nesting that regexes cannot; +- invalid syntax degrades to partial components (ERROR-node salvage) instead + of catastrophic behavior; +- parsers are reused per thread (`thread_local` parser, immutable grammars). + +The formatters do **not** dump AST text. 
Each language has a formatter that +emits source-text signature slices matching the legacy tree_plus style, with +the legacy regex quirks reproduced deliberately where tests depend on them +(e.g. Python `async def` is skipped, Rust `pub(crate)` items are skipped, +C++ enum variants keep their trailing commas). Where the legacy regex +produced demonstrably wrong output, the difference is documented in +docs/rust-port-differences.md rather than replicated. + +## Why regex stays for the simple extractors + +Markers, Markdown/RST headings, .env, requirements.txt, Makefile targets and +similar line-oriented formats are matched with the `regex` crate, compiled +once via `std::sync::LazyLock`. These patterns are anchored, bounded, and +have no look-around, so the linear-time `regex` engine is sufficient and +safe (no catastrophic backtracking by construction). + +## Why fancy-regex is not used + +An early draft ported the legacy TypeScript/C regexes verbatim with +fancy-regex (the originals need look-around). It was removed in favor of +tree-sitter formatters: backtracking regexes in the hot path are exactly the +failure mode the rewrite exists to remove (see `catastrophic.c`, a fixture +that exists because the legacy regex approach nearly exploded on it). + +## Why nom/pest are not used + +No mainstream language gets a homegrown grammar. Tree-sitter covers the big +languages; everything else in scope is line-oriented (regex) or structured +data with a native parser (serde_json, toml, serde_yaml, rusqlite). + +## Data and config parsers + +- JSON / package.json / JSON-RPC / OpenRPC / JSON Schema: `serde_json` + (with `preserve_order` so output order matches the file). +- TOML (Cargo.toml, pyproject.toml): `toml` (with `preserve_order`). +- YAML (k8s, Ansible, GitHub workflows, OpenAPI): `serde_yaml` multi-doc. +- SQLite (.db/.sqlite): `rusqlite` behind the default-on `sqlite` feature. 
+- CSV: header line split, mirroring the legacy first-3-lines behavior + (the csv crate would change behavior; the legacy reads only the header). + +## Traversal, determinism, parallelism + +- `walk.rs` recursion mirrors the legacy `Path.iterdir()` walk; entries are + sorted with `sort::os_sort_key` before any parallel work, then files are + processed with rayon (`into_par_iter`) and collected in order, so output + is deterministic regardless of thread scheduling. +- Ignores match any path component via fnmatch translation; glob filters are + amortized exactly like the legacy `AmortizedGlobs` (match set + ancestors). +- Binary detection reads at most 1 KiB; counting skips the legacy extension + list; files above 1M tokens skip extraction. + +## Rendering + +`render.rs` reimplements the small slice of `rich` the legacy output +depends on: tree guides, multiline label continuation, cell-width-aware +word wrapping (`divide_line`/`chop_cells` ports, fold semantics), 8-cell tab +expansion, and trailing-space removal. Width is 80 when piped (the legacy +capture width), the terminal width when interactive, and 128 under +`TREE_PLUS_UPDATE_README=YES`. + +## How to add a new language + +1. Add the extension to the dispatch in `extract/mod.rs` (order matters and + is tested; mirror the legacy dispatch position). +2. Tree-sitter route: add the grammar crate, write a formatter in + `extract/treesitter/<lang>.rs` that emits signature slices, and recurse + into bodies for nested items. +3. Regenerate goldens: `python tests/golden/generate_legacy_goldens.py` + (remove the language from `DEFERRED_PARSERS` in the generator so the v1 + tree goldens include its components). +4. The fixture parity test picks the extension up via `in_v1_scope` in + `crates/tree_plus_core/tests/golden_parity.rs`; add it there. +5. Add unit tests for the formatter and at least one pathological input.
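A minimal sketch of the sort-then-collect-in-order idea described under "Traversal, determinism, parallelism", under stated assumptions: `process` and `process_in_order` are hypothetical names, `std::thread::scope` stands in for rayon's `into_par_iter`, and plain `sort` stands in for `os_sort_key` so the example needs only the standard library. It only illustrates why joining workers in spawn order keeps the output independent of thread scheduling; it does not mirror `walk.rs`.

```rust
use std::thread;

/// Hypothetical stand-in for per-file work (the port calls `from_file` here).
fn process(name: &str) -> String {
    format!("processed:{name}")
}

/// Sort first, fan out to threads, then join in spawn order, so the result
/// order depends only on the sorted input and never on thread scheduling.
fn process_in_order(mut names: Vec<String>) -> Vec<String> {
    // Assumption: plain lexicographic sort replaces the natural path sort.
    names.sort();
    thread::scope(|scope| {
        // Spawn every worker before joining any of them.
        let handles: Vec<_> = names
            .iter()
            .map(|name| scope.spawn(move || process(name)))
            .collect();
        // Joining in spawn order fixes the output order.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker panicked"))
            .collect()
    })
}

fn main() {
    let out = process_in_order(vec!["b.rs".into(), "a.py".into(), "c.md".into()]);
    println!("{out:?}");
}
```

Workers may finish in any order, but each result is read back through the handle at its own position in the sorted list, so nothing downstream needs a second sort.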
diff --git a/docs/language-roadmap.md b/docs/language-roadmap.md new file mode 100644 index 0000000..5f2cb7a --- /dev/null +++ b/docs/language-roadmap.md @@ -0,0 +1,64 @@ +# Language Roadmap (Rust Port) + +Version-1 implements: Rust, Python, JavaScript, TypeScript, C, C++, Markdown +(+ RST), JSON (package.json / schema / RPC / OpenRPC), JSONL, YAML, TOML +(Cargo/pyproject), CSV, Makefile/Justfile, .env, requirements.txt, SQLite, +and TODO/BUG/NOTE markers everywhere except `.md`/`.txt` (legacy rule). + +Everything below is recognized by the legacy Python implementation and +**deferred**: files keep counts and markers, but emit no components. Legacy +goldens for every deferred language are checked in under +`tests/golden/legacy/components/` and serve as the acceptance contract. +For each: legacy extension(s) → legacy extractor → tree-sitter grammar +availability → suggested path → missing tests. + +| Language | Extensions | Legacy extractor | TS grammar? | Suggested path | Missing tests | +|---|---|---|---|---|---| +| Go | .go | parse_go | yes (tree-sitter-go) | tree-sitter formatter | port golden `go_test.go` | +| Java | .java | parse_java | yes | tree-sitter formatter | port golden `JavaTest.java` | +| Kotlin | .kt | parse_kt | yes (community) | tree-sitter formatter | port golden `KotlinTest.kt` | +| Swift | .swift | parse_swift | yes (community) | tree-sitter formatter | port golden `swift_test.swift` | +| C# | .cs | parse_cs | yes | tree-sitter formatter | port golden `csharp_test.cs` | +| PHP | .php | parse_php | yes | tree-sitter formatter | port golden `php_test.php` | +| Ruby | .rb | parse_rb | yes | tree-sitter formatter | port golden `ruby_test.rb` | +| Scala | .scala | parse_scala | yes (community) | tree-sitter formatter | port golden `scala_test.scala` | +| Julia | .jl | parse_jl | yes (community) | tree-sitter formatter | port golden `JuliaTest.jl` | +| Lua | .lua | parse_lua | yes | tree-sitter formatter | port golden `LuaTest.lua` | +| Haskell | .hs 
| parse_hs | yes (community) | tree-sitter formatter | port golden `haskell_test.hs` | +| OCaml | .ml | parse_ocaml | yes | tree-sitter formatter | port golden `OcamlTest.ml` | +| F# | .fs | parse_fsharp | partial | regex port (line-oriented) | port golden `fsharp_test.fs` | +| Erlang | .erl .hrl | parse_erl | yes (community) | regex port (attribute lines) | port golden `erl_test.erl` | +| Lisp family | .lisp .clj .scm .el .rkt | parse_lisp | per-dialect | regex port (defun-ish lines) | port group_lisp goldens | +| Perl | .pl | parse_perl | yes (community) | regex port | port golden `perl_test.pl` | +| PowerShell | .ps1 | parse_ps1 | no maintained | regex port | port golden `powershell_test.ps1` | +| Bash | .sh | parse_bash | yes | tree-sitter formatter | port golden `bash_test.sh` | +| Zig | .zig | parse_zig | yes (community) | tree-sitter formatter | port golden `zig_test.zig` | +| R | .r .R | parse_r | yes (community) | regex port | port golden `r_test.r` | +| MATLAB | .m .matlab | parse_matlab | no maintained | regex port + content sniff | port golden `matlab_test.m` | +| Objective-C | .m | parse_objective_c | yes (community) | tree-sitter + content sniff | port golden `objective_c_test.m` | +| Mathematica | .nb .wl | parse_mathematica | no | regex port | port golden `mathematica_test.nb` | +| COBOL | .cbl .cobol | parse_cbl | no maintained | regex port (division/para lines) | port group1 goldens | +| Fortran | .f .f90 ... 
| parse_fortran | yes (community) | regex port | port golden `fortran_test.f90` | +| APL | .apl | parse_apl | no | regex port | port golden `apl_test.apl` | +| SQL | .sql | parse_sql | yes (community) | regex port (CREATE TABLE) | port golden `sql_test.sql` | +| GraphQL | .graphql | parse_graphql | yes (community) | line port (trivial) | port golden `graphql_test.graphql` | +| Protobuf | .proto | parse_grpc | yes (community) | line/regex port | port golden `proto_test.proto` | +| Cap'n Proto | .capnp | parse_capnp | no | regex port | port golden `capnp_test.capnp` | +| LaTeX | .tex | parse_tex | yes (community) | regex port (sections) | port golden `tex_test.tex` | +| Lean | .lean | parse_lean | community | regex port | port golden `lean_test.lean` | +| Isabelle | .thy | parse_isabelle | no | regex port + symbol table | port golden `isabelle_test.thy` | +| Terraform | .tf | parse_tf | yes (hcl) | regex port | port golden `terraform_test.tf` | +| TCL | .tcl | parse_tcl | community | regex port (proc lines) | port golden `tcl_test.tcl` | +| Metal | .metal | parse_metal | no | regex port | port golden `metal_test.metal` | +| WGSL | .wgsl | parse_wgsl | community | regex port | port golden `wgsl_test.wgsl` | +| HTML | .html | parse_html (returns []) | yes | decide scope first (legacy emitted nothing) | n/a | +| TensorFlow flags | flags-in-tensorflow .cc/.h | parse_tensorflow_flags | n/a | intentionally removed (see differences doc) | n/a | + +Also deferred (not languages): web mode (`URL seeds`, `--yc`), syntax +highlighting, tiktoken tokenizers — see docs/rust-port-differences.md. + +## Suggested order of attack + +1. Go, Java, Kotlin, C#, Ruby, Bash (mature grammars, heavily used). +2. SQL/GraphQL/Protobuf/requirements-style line formats (cheap regex ports). +3. The long tail, prioritized by user demand. 
diff --git a/docs/performance.md b/docs/performance.md new file mode 100644 index 0000000..b273e28 --- /dev/null +++ b/docs/performance.md @@ -0,0 +1,64 @@ +# Performance Report (Rust Port) + +Measured 2026-06-09 on Linux (WSL2), release build, default features. +Comparisons via hyperfine (warmup 1) against the legacy Python CLI +(Python 3.10, tree_plus 1.0.79). "Full" = components extracted; "concise" += traversal + counting only. + +## End-to-end CLI, this repository (473 files, ~50.7k lines, ~540k tokens) + +| Invocation | Rust | Python | Speedup | +|---|---|---|---| +| `tree_plus -c .` (traversal+counting) | 11.8 ms ± 1.3 | 782.8 ms ± 3.5 | **66×** | +| `tree_plus .` (full extraction) | 28.0 ms ± 1.9 | 2.385 s ± 0.022 | **85×** | + +Throughput at the full-extraction figure: ~17k files/s; ~75 MB/s of +repository text (2.1 MB of counted text; 7.5 MB on disk incl. binaries). + +## Medium repositories (full extraction) + +| Repository | Kind | Stats (Rust footer) | Rust wall | Speedup vs Python | +|---|---|---|---|---| +| Gymnasium | Python | 684 files, 63k lines, 704k tokens | ~20 ms | **84×** | +| diamond-types | Rust | 290 files, 75k lines, 1.27M tokens | ~40 ms | **70×** | +| ramda | JavaScript | 700 files, 46k lines, 389k tokens | ~30 ms | **55×** | + +Reproduce with any local tree: + +``` +cargo build --release +hyperfine './target/release/tree_plus /path/to/repo > /dev/null' \ + 'python tree_plus_cli.py /path/to/repo > /dev/null' +``` + +A Linux-kernel-sized tree was not available locally; the command above is +the reproducible benchmark for one (also: +`TREE_PLUS_BENCH_PATH=/path/to/linux cargo bench -p tree_plus_core`). 
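The speedup and throughput figures quoted above follow directly from the table means. As a sanity check, here is a small sketch of the arithmetic; the helper names `speedup` and `per_second` are illustrative only and are not part of the benchmark harness.

```rust
/// Ratio of the slower wall time to the faster one (both in milliseconds).
fn speedup(slow_ms: f64, fast_ms: f64) -> f64 {
    slow_ms / fast_ms
}

/// Units processed per second, given a count and a wall time in milliseconds.
fn per_second(count: f64, wall_ms: f64) -> f64 {
    count / (wall_ms / 1000.0)
}
```

With the numbers from the report, `speedup(782.8, 11.8)` is about 66.3 and `speedup(2385.0, 28.0)` is about 85.2, while `per_second(473.0, 28.0)` is roughly 16.9k files/s and `per_second(2.1, 28.0)` is 75 MB/s of counted text.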
+ +## Single-file extraction latency (criterion, mean) + +| Fixture | Extractor | Mean | +|---|---|---| +| class_method_type.py (101 lines) | tree-sitter Python | 217 µs | +| test.ts (165 lines) | tree-sitter TypeScript | 386 µs | +| rust_test.rs (~190 lines) | tree-sitter Rust | 670 µs | +| catastrophic.c (754 lines, pathological) | tree-sitter C | 1.81 ms | + +Criterion reports p50≈mean here with tight bounds (±1%); the pathological +C file that motivated the rewrite extracts in under 2 ms (the legacy regex +needed a timeout guard for files like it). + +Run: `cargo bench -p tree_plus_core` (whole-repo benches accept +`TREE_PLUS_BENCH_PATH`). + +## Where the time goes / tuning applied + +- Files are processed in parallel with rayon after a deterministic sort, so + output order never depends on scheduling; no locks in the hot path. +- One tree-sitter parser per thread (`thread_local`), grammars shared. +- All regexes compiled once (`LazyLock`); only the linear-time `regex` + crate is used (no backtracking engine anywhere). +- Binary sniffing is capped at 1 KiB; CSV/JSONL read only the first lines; + files over 1M tokens skip extraction (legacy behavior). +- Peak memory (full run on this repo): 43 MB RSS (`/usr/bin/time -v`), + dominated by per-file contents held during parallel extraction. diff --git a/docs/rust-port-differences.md b/docs/rust-port-differences.md new file mode 100644 index 0000000..eb7e7f9 --- /dev/null +++ b/docs/rust-port-differences.md @@ -0,0 +1,78 @@ +# Intentional Differences from the Python Implementation + +Everything not listed here is byte-identical for the version-1 scope; the +golden parity suite (`cargo test -p tree_plus_core --test golden_parity`) +enforces it against outputs captured from the legacy implementation. + +## Output differences (version-1 languages) + +1. 
**TensorFlow flag special case removed.** Legacy `parse_file` special-cased + `.cc`/`.h` files whose path contained "tensorflow" and whose name contained + "flags", emitting `Flag('...')` components. Deliberately dropped (decision: + too project-specific for a general tool). `tensorflow_flags.h` still gets + full C/C++ extraction. The parity test pins the exact remaining components. + +2. **String literals are no longer mistaken for code.** The legacy TS regex + emitted ` return("Standalone function with parameters")` because the + string contains the word "function". Tree-sitter does not parse string + contents, so this artifact is gone. + +3. **A few regex-noise lines are no longer emitted** for TS/JS and C/C++: + the legacy "method" pattern matched some bare call statements. The common + ones (`super(...)`, simple `name(args);` calls, `while((...))`, indented + `asm(...)`, pybind `.def("...` chains) are still emitted for parity, but + awaited calls and calls with complex callees that the legacy regex happened + to match in other codebases may differ. Fixture-covered cases all match. + +4. **Type predicates are no longer truncated mid-token in new code paths.** + Where the legacy lazily stopped a TS return type at the first ` {` + (e.g. `): ticket is`), the port reproduces the truncation for parity; this + is noted as a known legacy wart to revisit in version 2. + +## Binary name / migration + +The Rust CLI builds two identical binaries: `tree_plus` (legacy-compatible +name) and `tprs` (alias that avoids PATH collisions with the Python +`tree_plus` entry point, e.g. from a conda env). Install one or both: + +``` +cargo install --path crates/tree_plus_cli --bin tprs # alias only +cargo install --path crates/tree_plus_cli # both binaries +``` + +## Behavioral differences + +5. **Tokenizers.** Only the default word-count tokenizer (`wc`: tokens = + characters / 4) is implemented. `-t` / `-T gpt4o` produce an explicit + error instead of importing tiktoken. 
(`-T wc` is accepted.) + +6. **Web modes are deferred.** URL seeds, `--yc`/`--hn`, `-n`, `-m`, and + `-l` are not implemented; the CLI does not accept the HN flags. Local + filesystem mode is complete. + +7. **Syntax highlighting is deferred.** Output is plain text (identical to + the legacy `into_str()` capture). `-s` is accepted for compatibility. + Note the legacy enum markup escaping (e.g. `#\[default]`) is preserved so + plain-text output matches the legacy renderer byte-for-byte. + +8. **`--timeout` is accepted and ignored.** The Rust extractors are + linear-time; there is no regex timeout to configure. The legacy behavior + of returning zero components when extraction fails is preserved. + +9. **Sort tie-breaking.** Names that differ only by case (identical natural + sort keys) fall back to raw byte order in Rust; Python fell back to + filesystem enumeration order (nondeterministic). Deterministic on purpose. + +10. **`COLUMNS`/TTY width.** When piped, output wraps at width 80 exactly like + the legacy `into_str()`. Interactive terminals use the real width, like + the legacy CLI did; `TREE_PLUS_UPDATE_README=YES` forces 128. The legacy + interactive renderer also applied `tab_size=2` and rich markup; the Rust + CLI always renders the deterministic plain-text form (tab size 8). + +11. **`read_file` caching.** The legacy cached file contents process-wide + (`lru_cache`); the port reads files once per run with no global cache. + +12. **Deferred-language files** (see docs/language-roadmap.md) still appear + in trees with correct counts and TODO/BUG/NOTE markers, but no language + components. The legacy goldens for those languages are kept as the + contract for future ports (`tests/golden/legacy/components/`). 
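The sort tie-breaking rule (item 9) can be sketched as a two-level comparison. This is an illustration under stated assumptions, not the code in `sort.rs`: a lowercased string stands in for the natural sort key, and `compare_names` and `sort_names` are hypothetical names.

```rust
use std::cmp::Ordering;

/// Compare by a case-insensitive key first; when the keys are identical
/// (names that differ only by case), fall back to raw byte order so the
/// result never depends on filesystem enumeration order.
fn compare_names(a: &str, b: &str) -> Ordering {
    // Assumption: lowercasing approximates the real natural sort key.
    a.to_lowercase()
        .cmp(&b.to_lowercase())
        .then_with(|| a.as_bytes().cmp(b.as_bytes()))
}

/// Sort names in place with the deterministic comparator above.
fn sort_names(names: &mut [String]) {
    names.sort_by(|a, b| compare_names(a, b));
}
```

Because uppercase ASCII bytes sort before lowercase ones, `Makefile` lands before `makefile` whatever order the directory listing produced them in.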
From 840145c37c356057389a2e34d3f77be25b2d43d7 Mon Sep 17 00:00:00 2001 From: Bion Howard Date: Tue, 9 Jun 2026 15:51:16 -0400 Subject: [PATCH 5/8] feat: Go component extraction (tree-sitter) type struct/interface headers and column-0 func signatures, matching legacy parse_go exactly (golden-parity tested, including the multiline method signature fixture). Legacy quirks preserved: `func f() {}` and generic type headers are skipped, raw tabs kept in signatures. Co-Authored-By: Claude Fable 5 --- Cargo.lock | 11 ++ Cargo.toml | 1 + crates/tree_plus_core/Cargo.toml | 1 + crates/tree_plus_core/src/extract/mod.rs | 3 +- .../src/extract/treesitter/go.rs | 152 ++++++++++++++++++ .../src/extract/treesitter/mod.rs | 1 + crates/tree_plus_core/tests/golden_parity.rs | 1 + docs/language-roadmap.md | 12 +- tests/golden/generate_legacy_goldens.py | 2 +- tests/golden/legacy/trees/repo_concise.txt | 58 +++---- .../golden/legacy/trees_v1/more_languages.txt | 13 ++ .../legacy/trees_v1/more_languages_group2.txt | 13 ++ 12 files changed, 233 insertions(+), 35 deletions(-) create mode 100644 crates/tree_plus_core/src/extract/treesitter/go.rs diff --git a/Cargo.lock b/Cargo.lock index 1510d6f..700cc68 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -857,6 +857,16 @@ dependencies = [ "tree-sitter-language", ] +[[package]] +name = "tree-sitter-go" +version = "0.25.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "c8560a4d2f835cc0d4d2c2e03cbd0dde2f6114b43bc491164238d333e28b16ea" +dependencies = [ + "cc", + "tree-sitter-language", +] + [[package]] name = "tree-sitter-javascript" version = "0.25.0" @@ -928,6 +938,7 @@ dependencies = [ "tree-sitter", "tree-sitter-c", "tree-sitter-cpp", + "tree-sitter-go", "tree-sitter-javascript", "tree-sitter-python", "tree-sitter-rust", diff --git a/Cargo.toml b/Cargo.toml index afff7ec..528074f 100644 --- a/Cargo.toml +++ b/Cargo.toml @@ -25,5 +25,6 @@ tree-sitter-javascript = "0.25" tree-sitter-typescript = "0.23" tree-sitter-c 
= "0.24" tree-sitter-cpp = "0.23" +tree-sitter-go = "0.25" rusqlite = { version = "0.32", features = ["bundled"] } criterion = "0.5" diff --git a/crates/tree_plus_core/Cargo.toml b/crates/tree_plus_core/Cargo.toml index 89fead3..58bf87b 100644 --- a/crates/tree_plus_core/Cargo.toml +++ b/crates/tree_plus_core/Cargo.toml @@ -27,6 +27,7 @@ tree-sitter-typescript.workspace = true tree-sitter-c.workspace = true tree-sitter-cpp.workspace = true rusqlite = { workspace = true, optional = true } +tree-sitter-go.workspace = true [dev-dependencies] criterion.workspace = true diff --git a/crates/tree_plus_core/src/extract/mod.rs b/crates/tree_plus_core/src/extract/mod.rs index c3572ff..5e28bd4 100644 --- a/crates/tree_plus_core/src/extract/mod.rs +++ b/crates/tree_plus_core/src/extract/mod.rs @@ -109,7 +109,7 @@ const MARKDOWN_EXTENSIONS: &[&str] = &[".md", ".markdown", ".mdx", ".mdc"]; /// Rust port version 1. Files still get TODO/BUG/NOTE markers; component /// extraction is tracked in docs/language-roadmap.md. 
const DEFERRED_EXTENSIONS: &[&str] = &[ - ".php", ".kt", ".swift", ".go", ".sh", ".ps1", ".zig", ".rb", ".sql", ".graphql", ".cs", ".jl", + ".php", ".kt", ".swift", ".sh", ".ps1", ".zig", ".rb", ".sql", ".graphql", ".cs", ".jl", ".scala", ".java", ".pl", ".hs", ".fs", ".lisp", ".clj", ".scm", ".el", ".rkt", ".erl", ".hrl", ".capnp", ".proto", ".tex", ".lean", ".f", ".for", ".f77", ".f90", ".f95", ".f03", ".f08", ".tf", ".thy", ".lua", ".tcl", ".m", ".r", ".nb", ".wl", ".matlab", ".ml", ".cbl", ".cobol", @@ -206,6 +206,7 @@ fn try_extract_components(path: &Path, syntax: bool) -> ExtractResult { ".yml" | ".yaml" => data::extract_yml(&content)?, e if C_EXTENSIONS.contains(&e) => treesitter::c_cpp::extract(&content, e)?, ".rs" => treesitter::rust::extract(&content, syntax)?, + ".go" => treesitter::go::extract(&content)?, ".jsonl" => data::extract_jsonl(&content)?, ".env" => simple::dot_env(&content), ".txt" => { diff --git a/crates/tree_plus_core/src/extract/treesitter/go.rs b/crates/tree_plus_core/src/extract/treesitter/go.rs new file mode 100644 index 0000000..9efde45 --- /dev/null +++ b/crates/tree_plus_core/src/extract/treesitter/go.rs @@ -0,0 +1,152 @@ +//! Go component extraction (tree-sitter), matching legacy `parse_go`. +//! +//! Legacy semantics, reproduced deliberately: +//! - `type Name struct` / `type Name interface` headers (no generics: the +//! legacy pattern required exactly `type \w+ (struct|interface)`); +//! - column-0 `func ...` signatures (declarations and methods), sliced up +//! to the body brace, which must be surrounded by whitespace (the legacy +//! lookahead was `(?=\s\{\s)`, so `func f() {}` never matched); +//! - signatures keep raw source text (tabs included; the legacy extractor +//! did not strip comments for Go). 
+ +use std::sync::LazyLock; + +use regex::Regex; +use tree_sitter::Node; + +use super::{parse, ExtractResult}; + +static TYPE_HEADER_RE: LazyLock<Regex> = + LazyLock::new(|| Regex::new(r"^ *type \w+ (struct|interface)$").unwrap()); + +/// Extract Go components: type struct/interface headers and func signatures. +pub fn extract(content: &str) -> ExtractResult { + let tree = parse(content, &tree_sitter_go::LANGUAGE.into())?; + let mut components: Vec<String> = Vec::new(); + visit(tree.root_node(), content, &mut components); + Ok(components) +} + +fn line_start(content: &str, byte: usize) -> usize { + content[..byte].rfind('\n').map(|i| i + 1).unwrap_or(0) +} + +fn visit(node: Node<'_>, content: &str, out: &mut Vec<String>) { + let mut cursor = node.walk(); + for child in node.named_children(&mut cursor) { + match child.kind() { + "type_declaration" => { + let mut tcursor = child.walk(); + for spec in child.named_children(&mut tcursor) { + if spec.kind() == "type_spec" { + emit_type_spec(spec, content, out); + } + } + } + "function_declaration" | "method_declaration" => { + emit_func(child, content, out); + if let Some(body) = child.child_by_field_name("body") { + visit(body, content, out); + } + } + _ => visit(child, content, out), + } + } +} + +/// `type Name struct` / `type Name interface` when the body brace follows.
+fn emit_type_spec(spec: Node<'_>, content: &str, out: &mut Vec<String>) { + let Some(type_node) = spec.child_by_field_name("type") else { + return; + }; + let keyword_len = match type_node.kind() { + "struct_type" => "struct".len(), + "interface_type" => "interface".len(), + _ => return, + }; + // slice from the `type` keyword line through the struct/interface keyword + let decl_start = spec + .parent() + .map(|p| p.start_byte()) + .unwrap_or_else(|| spec.start_byte()); + let start = line_start(content, decl_start); + let end = type_node.start_byte() + keyword_len; + let header = &content[start..end]; + if !TYPE_HEADER_RE.is_match(header) { + return; // legacy pattern: single-space, no generics, no aliases + } + // legacy lookahead `(?=\s*\{)` + if !content[end..].trim_start().starts_with('{') { + return; + } + out.push(header.to_string()); +} + +/// Column-0 `func` signature up to (not including) the body brace. +fn emit_func(node: Node<'_>, content: &str, out: &mut Vec<String>) { + let start = node.start_byte(); + // legacy `^func`: column 0 only + if line_start(content, start) != start { + return; + } + let Some(body) = node.child_by_field_name("body") else { + return; + }; + // legacy lookahead `(?=\s\{\s)`: whitespace on both sides of the brace + let brace = body.start_byte(); + let before_ok = content[..brace] + .chars() + .next_back() + .is_some_and(char::is_whitespace); + let after_ok = content[brace + 1..]
+ .chars() + .next() + .is_some_and(char::is_whitespace); + if !before_ok || !after_ok { + return; + } + let component = content[start..brace].trim_end(); + out.push(component.to_string()); +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn types_and_funcs() { + let content = "type Greeting struct {\n\tmessage string\n}\n\nfunc (g Greeting) sayHello() {\n\tfmt.Println(g.message)\n}\n\nfunc createGreeting(m string) Greeting {\n\treturn Greeting{message: m}\n}\n"; + assert_eq!( + extract(content).unwrap(), + vec![ + "type Greeting struct", + "func (g Greeting) sayHello()", + "func createGreeting(m string) Greeting", + ] + ); + } + + #[test] + fn multiline_signature_keeps_tabs() { + let content = "func WithAReasonableName(\n\tctx context.Context,\n\tparam1 string,\n) (resultType, error) {\n\treturn resultType{}, nil\n}\n"; + assert_eq!( + extract(content).unwrap(), + vec!["func WithAReasonableName(\n\tctx context.Context,\n\tparam1 string,\n) (resultType, error)"] + ); + } + + #[test] + fn legacy_quirks_preserved() { + // empty body `{}`: the brace is not whitespace-surrounded -> skipped + assert!(extract("func f() {}\n").unwrap().is_empty()); + // generic types don't match the legacy header pattern + assert!(extract("type Box[T any] struct {\n\tv T\n}\n") + .unwrap() + .is_empty()); + // interfaces are headers too + assert_eq!( + extract("type Animal interface {\n\tSpeak() string\n}\n").unwrap(), + vec!["type Animal interface"] + ); + } +} diff --git a/crates/tree_plus_core/src/extract/treesitter/mod.rs b/crates/tree_plus_core/src/extract/treesitter/mod.rs index 652ad05..feaefe7 100644 --- a/crates/tree_plus_core/src/extract/treesitter/mod.rs +++ b/crates/tree_plus_core/src/extract/treesitter/mod.rs @@ -5,6 +5,7 @@ //! regex-based components; golden parity tests in tests/ enforce this. 
pub mod c_cpp; +pub mod go; pub mod python; pub mod rust; pub mod typescript; diff --git a/crates/tree_plus_core/tests/golden_parity.rs b/crates/tree_plus_core/tests/golden_parity.rs index 468aac0..8ec8f33 100644 --- a/crates/tree_plus_core/tests/golden_parity.rs +++ b/crates/tree_plus_core/tests/golden_parity.rs @@ -29,6 +29,7 @@ fn in_v1_scope(path: &str) -> bool { ".py", ".pyi", ".rs", + ".go", ".js", ".jsx", ".ts", diff --git a/docs/language-roadmap.md b/docs/language-roadmap.md index 5f2cb7a..a571141 100644 --- a/docs/language-roadmap.md +++ b/docs/language-roadmap.md @@ -1,9 +1,10 @@ # Language Roadmap (Rust Port) -Version-1 implements: Rust, Python, JavaScript, TypeScript, C, C++, Markdown -(+ RST), JSON (package.json / schema / RPC / OpenRPC), JSONL, YAML, TOML -(Cargo/pyproject), CSV, Makefile/Justfile, .env, requirements.txt, SQLite, -and TODO/BUG/NOTE markers everywhere except `.md`/`.txt` (legacy rule). +Version-1 implements: Rust, Python, JavaScript, TypeScript, C, C++, Go, +Markdown (+ RST), JSON (package.json / schema / RPC / OpenRPC), JSONL, YAML, +TOML (Cargo/pyproject), CSV, Makefile/Justfile, .env, requirements.txt, +SQLite, and TODO/BUG/NOTE markers everywhere except `.md`/`.txt` (legacy +rule). Everything below is recognized by the legacy Python implementation and **deferred**: files keep counts and markers, but emit no components. Legacy @@ -14,7 +15,6 @@ availability → suggested path → missing tests. | Language | Extensions | Legacy extractor | TS grammar? 
| Suggested path | Missing tests | |---|---|---|---|---|---| -| Go | .go | parse_go | yes (tree-sitter-go) | tree-sitter formatter | port golden `go_test.go` | | Java | .java | parse_java | yes | tree-sitter formatter | port golden `JavaTest.java` | | Kotlin | .kt | parse_kt | yes (community) | tree-sitter formatter | port golden `KotlinTest.kt` | | Swift | .swift | parse_swift | yes (community) | tree-sitter formatter | port golden `swift_test.swift` | @@ -59,6 +59,6 @@ highlighting, tiktoken tokenizers — see docs/rust-port-differences.md. ## Suggested order of attack -1. Go, Java, Kotlin, C#, Ruby, Bash (mature grammars, heavily used). +1. Java, Kotlin, C#, Ruby, Bash (mature grammars, heavily used). 2. SQL/GraphQL/Protobuf/requirements-style line formats (cheap regex ports). 3. The long tail, prioritized by user demand. diff --git a/tests/golden/generate_legacy_goldens.py b/tests/golden/generate_legacy_goldens.py index d50bd79..6976136 100644 --- a/tests/golden/generate_legacy_goldens.py +++ b/tests/golden/generate_legacy_goldens.py @@ -50,7 +50,7 @@ def sanitize(p: Path) -> str: DEFERRED_PARSERS = [ - "parse_php", "parse_kt", "parse_swift", "parse_go", "parse_bash", + "parse_php", "parse_kt", "parse_swift", "parse_bash", "parse_ps1", "parse_zig", "parse_rb", "parse_sql", "parse_graphql", "parse_cs", "parse_jl", "parse_scala", "parse_java", "parse_perl", "parse_hs", "parse_fsharp", "parse_lisp", "parse_erl", "parse_capnp", diff --git a/tests/golden/legacy/trees/repo_concise.txt b/tests/golden/legacy/trees/repo_concise.txt index 17182cb..e2786e5 100644 --- a/tests/golden/legacy/trees/repo_concise.txt +++ b/tests/golden/legacy/trees/repo_concise.txt @@ -1,42 +1,44 @@ -📁 tree_plus (44 folders, 439 files) +📁 tree_plus (44 folders, 443 files) ├── 📄 .env.test (4 tokens, 0 lines) -├── 📁 .github (2 folders, 3 files) +├── 📁 .github (2 folders, 4 files) │ ├── 📄 dependabot.yml (128 tokens, 11 lines) -│ └── 📁 workflows (1 folder, 2 files) -│ ├── 📄 microsoft.yml (284 tokens, 
40 lines) -│ └── 📄 unix.yml (713 tokens, 92 lines) -├── 📄 .gitignore (219 tokens, 57 lines) +│ └── 📁 workflows (1 folder, 3 files) +│ ├── 📄 microsoft.yml (361 tokens, 54 lines) +│ ├── 📄 rust.yml (292 tokens, 44 lines) +│ └── 📄 unix.yml (790 tokens, 106 lines) +├── 📄 .gitignore (226 tokens, 60 lines) ├── 📄 .mcp_server.pid (2 tokens, 1 line) -├── 📄 Cargo.toml (206 tokens, 29 lines) +├── 📄 Cargo.toml (212 tokens, 30 lines) ├── 📄 claude-fable-5-rust-rewrite-goal.md (3,394 tokens, 434 lines) ├── 📁 coverage (1 folder, 1 file) │ └── 📄 lcov.info (17,359 tokens, 2,180 lines) -├── 📁 crates (11 folders, 27 files) +├── 📁 crates (11 folders, 28 files) │ ├── 📁 tree_plus_cli (3 folders, 3 files) -│ │ ├── 📄 Cargo.toml (86 tokens, 15 lines) +│ │ ├── 📄 Cargo.toml (120 tokens, 21 lines) │ │ ├── 📁 src (1 folder, 1 file) -│ │ │ └── 📄 main.rs (1,332 tokens, 174 lines) +│ │ │ └── 📄 main.rs (1,339 tokens, 173 lines) │ │ └── 📁 tests (1 folder, 1 file) │ │ └── 📄 cli.rs (701 tokens, 92 lines) -│ └── 📁 tree_plus_core (7 folders, 24 files) +│ └── 📁 tree_plus_core (7 folders, 25 files) │ ├── 📁 benches (1 folder, 1 file) │ │ └── 📄 tree_plus_bench.rs (608 tokens, 78 lines) -│ ├── 📄 Cargo.toml (228 tokens, 36 lines) +│ ├── 📄 Cargo.toml (236 tokens, 37 lines) │ ├── 📁 examples (1 folder, 2 files) │ │ ├── 📄 dump_ast.rs (516 tokens, 55 lines) │ │ └── 📄 extract.rs (129 tokens, 16 lines) -│ ├── 📁 src (3 folders, 18 files) +│ ├── 📁 src (3 folders, 19 files) │ │ ├── 📄 config.rs (304 tokens, 39 lines) │ │ ├── 📄 count.rs (1,346 tokens, 203 lines) -│ │ ├── 📁 extract (2 folders, 10 files) +│ │ ├── 📁 extract (2 folders, 11 files) │ │ │ ├── 📄 data.rs (5,115 tokens, 582 lines) │ │ │ ├── 📄 markdown.rs (1,531 tokens, 180 lines) │ │ │ ├── 📄 markers.rs (438 tokens, 60 lines) -│ │ │ ├── 📄 mod.rs (2,520 tokens, 277 lines) +│ │ │ ├── 📄 mod.rs (2,532 tokens, 278 lines) │ │ │ ├── 📄 simple.rs (1,629 tokens, 216 lines) -│ │ │ └── 📁 treesitter (1 folder, 5 files) +│ │ │ └── 📁 treesitter (1 folder, 6 files) │ │ │ ├── 📄 
c_cpp.rs (5,979 tokens, 591 lines) -│ │ │ ├── 📄 mod.rs (488 tokens, 66 lines) +│ │ │ ├── 📄 go.rs (1,364 tokens, 152 lines) +│ │ │ ├── 📄 mod.rs (491 tokens, 67 lines) │ │ │ ├── 📄 python.rs (3,487 tokens, 346 lines) │ │ │ ├── 📄 rust.rs (2,785 tokens, 312 lines) │ │ │ └── 📄 typescript.rs (3,897 tokens, 420 lines) @@ -47,20 +49,21 @@ │ │ ├── 📄 sort.rs (1,693 tokens, 214 lines) │ │ └── 📄 walk.rs (2,245 tokens, 260 lines) │ └── 📁 tests (1 folder, 2 files) -│ ├── 📄 golden_parity.rs (1,809 tokens, 243 lines) +│ ├── 📄 golden_parity.rs (1,812 tokens, 244 lines) │ └── 📄 robustness.rs (743 tokens, 86 lines) ├── 📁 docs (1 folder, 4 files) │ ├── 📄 architecture.md (1,392 tokens, 113 lines) │ ├── 📄 language-roadmap.md (1,262 tokens, 64 lines) │ ├── 📄 performance.md (690 tokens, 64 lines) -│ └── 📄 rust-port-differences.md (922 tokens, 67 lines) +│ └── 📄 rust-port-differences.md (1,022 tokens, 78 lines) +├── 📄 Justfile (141 tokens, 23 lines) ├── 📄 LICENSE (2,744 tokens, 81 lines) ├── 📄 Makefile (801 tokens, 121 lines) ├── 📄 nodemon.json (112 tokens, 24 lines) ├── 📄 pyproject.toml (366 tokens, 51 lines) ├── 📄 pytest.ini (20 tokens, 4 lines) ├── 📄 README.md (99,851 tokens, 3,708 lines) -├── 📁 tests (25 folders, 378 files) +├── 📁 tests (25 folders, 379 files) │ ├── 📄 .env.test (4 tokens, 0 lines) │ ├── 📄 build_absurdly_huge_jsonl.py (506 tokens, 65 lines) │ ├── 📁 dot_dot (2 folders, 4 files) @@ -73,10 +76,10 @@ │ │ └── 📁 is_empty (1 folder, 0 files) │ ├── 📁 folder_with_evil_logging (1 folder, 1 file) │ │ └── 📄 logging.py (11 tokens, 0 lines) -│ ├── 📁 golden (6 folders, 246 files) +│ ├── 📁 golden (6 folders, 247 files) │ │ ├── 📄 diff_components.py (417 tokens, 59 lines) -│ │ ├── 📄 generate_legacy_goldens.py (1,499 tokens, 156 lines) -│ │ └── 📁 legacy (5 folders, 244 files) +│ │ ├── 📄 generate_legacy_goldens.py (1,496 tokens, 156 lines) +│ │ └── 📁 legacy (5 folders, 245 files) │ │ ├── 📁 components (1 folder, 109 files) │ │ │ ├── 📄 tests__dot_dot__my_test_file.py.json (21 tokens, 5 lines) 
│ │ │ ├── 📄 tests__dot_dot__nested_dir__.env.test.json (22 tokens, 5 @@ -513,7 +516,7 @@ │ │ │ ├── 📄 tests__path_to_test__file.py.json (19 tokens, 0 lines) │ │ │ ├── 📄 tests__path_to_test__file.txt.json (20 tokens, 0 lines) │ │ │ └── 📄 tests__path_to_test__version.py.json (20 tokens, 0 lines) -│ │ ├── 📁 trees (1 folder, 13 files) +│ │ ├── 📁 trees (1 folder, 14 files) │ │ │ ├── 📄 dot_dot.txt (99 tokens, 10 lines) │ │ │ ├── 📄 more_languages.txt (22,977 tokens, 2,225 lines) │ │ │ ├── 📄 more_languages_group1.txt (2,909 tokens, 374 lines) @@ -526,12 +529,13 @@ │ │ │ ├── 📄 more_languages_group_lisp.txt (155 tokens, 22 lines) │ │ │ ├── 📄 more_languages_group_todo.txt (1,491 tokens, 111 lines) │ │ │ ├── 📄 multi_seed.txt (3,789 tokens, 427 lines) -│ │ │ └── 📄 path_to_test.txt (445 tokens, 51 lines) +│ │ │ ├── 📄 path_to_test.txt (445 tokens, 51 lines) +│ │ │ └── 📄 repo_concise.txt (9,776 tokens, 699 lines) │ │ └── 📁 trees_v1 (1 folder, 13 files) │ │ ├── 📄 dot_dot.txt (99 tokens, 10 lines) -│ │ ├── 📄 more_languages.txt (10,782 tokens, 1,122 lines) +│ │ ├── 📄 more_languages.txt (10,912 tokens, 1,135 lines) │ │ ├── 📄 more_languages_group1.txt (1,130 tokens, 146 lines) -│ │ ├── 📄 more_languages_group2.txt (415 tokens, 54 lines) +│ │ ├── 📄 more_languages_group2.txt (531 tokens, 67 lines) │ │ ├── 📄 more_languages_group3.txt (1,240 tokens, 149 lines) │ │ ├── 📄 more_languages_group4.txt (1,020 tokens, 123 lines) │ │ ├── 📄 more_languages_group5.txt (2,099 tokens, 235 lines) diff --git a/tests/golden/legacy/trees_v1/more_languages.txt b/tests/golden/legacy/trees_v1/more_languages.txt index a1de0a9..897f9fb 100644 --- a/tests/golden/legacy/trees_v1/more_languages.txt +++ b/tests/golden/legacy/trees_v1/more_languages.txt @@ -190,6 +190,19 @@ │ │ ├── struct cliSSLconfig sslconfig; │ │ └── } config; │ ├── 📄 go_test.go (179 tokens, 46 lines) +│ │ ├── type Greeting struct +│ │ ├── func (g Greeting) sayHello() +│ │ ├── func createGreeting(m string) Greeting +│ │ ├── type SomethingLong struct 
+│ │ ├── func (s *SomethingLong) WithAReasonableName( +│ │ │ ctx context.Context, +│ │ │ param1 string, +│ │ │ param2 int, +│ │ │ param3 map[string]interface{}, +│ │ │ callback func(int) error, +│ │ │ ) (resultType, error) +│ │ ├── type resultType struct +│ │ └── func main() │ ├── 📄 PerlTest.pl (63 tokens, 20 lines) │ ├── 📄 PhpTest.php (70 tokens, 19 lines) │ ├── 📄 PowershellTest.ps1 (459 tokens, 89 lines) diff --git a/tests/golden/legacy/trees_v1/more_languages_group2.txt b/tests/golden/legacy/trees_v1/more_languages_group2.txt index 854ea5d..ffbf591 100644 --- a/tests/golden/legacy/trees_v1/more_languages_group2.txt +++ b/tests/golden/legacy/trees_v1/more_languages_group2.txt @@ -42,6 +42,19 @@ │ ├── struct cliSSLconfig sslconfig; │ └── } config; ├── 📄 go_test.go (179 tokens, 46 lines) +│ ├── type Greeting struct +│ ├── func (g Greeting) sayHello() +│ ├── func createGreeting(m string) Greeting +│ ├── type SomethingLong struct +│ ├── func (s *SomethingLong) WithAReasonableName( +│ │ ctx context.Context, +│ │ param1 string, +│ │ param2 int, +│ │ param3 map[string]interface{}, +│ │ callback func(int) error, +│ │ ) (resultType, error) +│ ├── type resultType struct +│ └── func main() ├── 📄 PerlTest.pl (63 tokens, 20 lines) ├── 📄 PhpTest.php (70 tokens, 19 lines) ├── 📄 PowershellTest.ps1 (459 tokens, 89 lines) From 0cc4d789f76b217f3a32e9bea692b2409ddae8b9 Mon Sep 17 00:00:00 2001 From: Bion Howard Date: Tue, 9 Jun 2026 15:55:38 -0400 Subject: [PATCH 6/8] fix: collapse match guards for stable clippy; park goal prompt in detritus CI runs stable clippy, which flags collapsible_match patterns the local nightly toolchain accepted. 
Co-Authored-By: Claude Fable 5 --- .../src/extract/treesitter/c_cpp.rs | 38 +++++++++---------- 1 file changed, 17 insertions(+), 21 deletions(-) diff --git a/crates/tree_plus_core/src/extract/treesitter/c_cpp.rs b/crates/tree_plus_core/src/extract/treesitter/c_cpp.rs index 58a4839..e7ea179 100644 --- a/crates/tree_plus_core/src/extract/treesitter/c_cpp.rs +++ b/crates/tree_plus_core/src/extract/treesitter/c_cpp.rs @@ -220,17 +220,15 @@ impl<'a> CExtractor<'a> { return; }; match type_node.kind() { - "struct_specifier" | "union_specifier" | "class_specifier" => { - if type_node.child_by_field_name("body").is_some() { - self.emit_record(type_node, Some(node.start_byte())); - self.emit_closer(type_node, node); - } + "struct_specifier" | "union_specifier" | "class_specifier" + if type_node.child_by_field_name("body").is_some() => + { + self.emit_record(type_node, Some(node.start_byte())); + self.emit_closer(type_node, node); } - "enum_specifier" => { - if type_node.child_by_field_name("body").is_some() { - self.emit_enum(type_node); - self.emit_closer(type_node, node); - } + "enum_specifier" if type_node.child_by_field_name("body").is_some() => { + self.emit_enum(type_node); + self.emit_closer(type_node, node); } _ => {} } @@ -256,18 +254,16 @@ impl<'a> CExtractor<'a> { } } match type_node.kind() { - "struct_specifier" | "union_specifier" | "class_specifier" => { - if type_node.child_by_field_name("body").is_some() { - let start = line_start(self.content, node.start_byte()); - self.emit_record(type_node, Some(start)); - self.emit_closer(type_node, node); - } + "struct_specifier" | "union_specifier" | "class_specifier" + if type_node.child_by_field_name("body").is_some() => + { + let start = line_start(self.content, node.start_byte()); + self.emit_record(type_node, Some(start)); + self.emit_closer(type_node, node); } - "enum_specifier" => { - if type_node.child_by_field_name("body").is_some() { - self.emit_enum(type_node); - self.emit_closer(type_node, node); - } + 
"enum_specifier" if type_node.child_by_field_name("body").is_some() => { + self.emit_enum(type_node); + self.emit_closer(type_node, node); } _ => {} } From 6d99a73135b75b4b0bb1434b718002d0a55edf30 Mon Sep 17 00:00:00 2001 From: Bion Howard Date: Tue, 9 Jun 2026 18:15:06 -0400 Subject: [PATCH 7/8] fix: stack overflows on huge trees (torvalds/linux) Two distinct overflows surfaced by running tprs on the Linux kernel: 1. Tree-sitter extractor visitors (C/C++, Rust, Python, TypeScript, Go) recursed on AST depth. arch/x86/kernel/cpu/microcode/intel-ucode-defs.h is a headerless initializer-list fragment that parses as a deeply nested ERROR tree and blew rayon's 2 MiB worker stacks. All visitors now traverse with explicit heap stacks; emission order is unchanged (golden parity suite still passes byte-for-byte). The C/C++ extractor threads deferred work (record closers, template field suppression, ERROR salvage) through a Work queue to preserve legacy ordering. 2. Rayon work-stealing nests from_folder frames on a worker while it waits in join, leaving too little headroom for tree-sitter's C frames on big trees (drivers/ segfaulted in ts_subtree_retain). from_seeds now runs in a dedicated pool with 16 MiB worker stacks. Adds a regression test extracting deeply nested inputs for all six tree-sitter languages on a 512 KiB thread, and records Linux-kernel benchmarks in docs/performance.md (101k files: concise 0.50 s, full 12.4 s; linux/kernel subdir ~56x faster than the Python CLI). 
Co-Authored-By: Claude Fable 5 --- .../src/extract/treesitter/c_cpp.rs | 148 ++++++++++++------ .../src/extract/treesitter/go.rs | 26 +-- .../src/extract/treesitter/python.rs | 63 ++++---- .../src/extract/treesitter/rust.rs | 45 +++--- .../src/extract/treesitter/typescript.rs | 43 +++-- crates/tree_plus_core/src/walk.rs | 18 +++ crates/tree_plus_core/tests/robustness.rs | 57 +++++++ docs/performance.md | 32 +++- 8 files changed, 304 insertions(+), 128 deletions(-) diff --git a/crates/tree_plus_core/src/extract/treesitter/c_cpp.rs b/crates/tree_plus_core/src/extract/treesitter/c_cpp.rs index e7ea179..f042c33 100644 --- a/crates/tree_plus_core/src/extract/treesitter/c_cpp.rs +++ b/crates/tree_plus_core/src/extract/treesitter/c_cpp.rs @@ -46,7 +46,7 @@ pub fn extract(content: &str, ext: &str) -> ExtractResult { components: Vec::new(), suppress_plain_fields: 0, }; - extractor.visit(tree.root_node()); + extractor.run(tree.root_node()); Ok(extractor.components) } @@ -70,6 +70,25 @@ fn line_start(content: &str, byte: usize) -> usize { content[..byte].rfind('\n').map(|i| i + 1).unwrap_or(0) } +/// Deferred traversal work. AST depth is input-controlled (deeply nested +/// ERROR trees from headerless initializer fragments overflow the small +/// rayon worker stacks), so descents go through an explicit heap stack +/// instead of recursion. Ordering is LIFO: push follow-up work (closers, +/// suppression ends) before the subtree visit that must precede it. +enum Work<'a> { + Visit(Node<'a>), + Closer { type_node: Node<'a>, decl: Node<'a> }, + SalvageFnDecl(Node<'a>), + EndSuppress, +} + +/// Push `node`'s named children so they pop in source order. 
+fn push_children_rev<'a>(node: Node<'a>, stack: &mut Vec>) { + let mut cursor = node.walk(); + let children: Vec> = node.named_children(&mut cursor).collect(); + stack.extend(children.into_iter().rev().map(Work::Visit)); +} + impl<'a> CExtractor<'a> { fn slice_clean(&self, from: usize, to: usize) -> String { // legacy parse_c ran on comment-stripped content @@ -79,12 +98,24 @@ impl<'a> CExtractor<'a> { .to_string() } - fn visit(&mut self, node: Node<'a>) { + fn run(&mut self, root: Node<'a>) { + let mut stack = vec![Work::Visit(root)]; + while let Some(work) = stack.pop() { + match work { + Work::Visit(node) => self.visit(node, &mut stack), + Work::Closer { type_node, decl } => self.emit_closer(type_node, decl), + Work::SalvageFnDecl(child) => self.salvage_fn_declarator(child), + Work::EndSuppress => self.suppress_plain_fields -= 1, + } + } + } + + fn visit(&mut self, node: Node<'a>, stack: &mut Vec>) { match node.kind() { "function_definition" => { if let Some(body) = node.child_by_field_name("body") { self.emit_function_definition(node, body); - self.visit(body); + stack.push(Work::Visit(body)); } else if let Some(declarator) = find_function_declarator(node) { // `= default;` / `= delete;` members have no body node let start = line_start(self.content, node.start_byte()); @@ -96,22 +127,22 @@ impl<'a> CExtractor<'a> { } } "ERROR" => { - self.salvage_error_region(node); + self.salvage_error_region(node, stack); } "struct_specifier" | "class_specifier" | "union_specifier" => { - self.emit_record(node, None); + self.emit_record(node, None, stack); } "enum_specifier" => { self.emit_enum(node); } "type_definition" => { - self.emit_type_definition(node); + self.emit_type_definition(node, stack); } "declaration" => { - self.emit_declaration(node); + self.emit_declaration(node, stack); } "field_declaration" => { - self.emit_field(node); + self.emit_field(node, stack); } "access_specifier" => { // includes the trailing ':' in the slice @@ -136,7 +167,7 @@ impl<'a> 
CExtractor<'a> { } } "template_declaration" => { - self.emit_template(node); + self.emit_template(node, stack); } "expression_statement" => { // legacy pybind noise: ` *\w+.def("...` captured to first `;` @@ -153,30 +184,26 @@ impl<'a> CExtractor<'a> { } } } - let mut cursor = node.walk(); - for child in node.named_children(&mut cursor) { - self.visit(child); - } + push_children_rev(node, stack); } "while_statement" | "for_statement" | "if_statement" | "switch_statement" => { self.maybe_emit_control_noise(node); - let mut cursor = node.walk(); - for child in node.named_children(&mut cursor) { - self.visit(child); - } + push_children_rev(node, stack); } _ => { - let mut cursor = node.walk(); - for child in node.named_children(&mut cursor) { - self.visit(child); - } + push_children_rev(node, stack); } } } /// struct/class/union with a body: label + fields (+ optional closer). /// `prefix_start` overrides the slice start (e.g. `typedef `/`static `). - fn emit_record(&mut self, node: Node<'a>, prefix_start: Option) { + fn emit_record( + &mut self, + node: Node<'a>, + prefix_start: Option, + stack: &mut Vec>, + ) { let Some(body) = node.child_by_field_name("body") else { return; // bare type references are not components }; @@ -185,7 +212,7 @@ impl<'a> CExtractor<'a> { if !label.is_empty() { self.components.push(label); } - self.visit(body); + stack.push(Work::Visit(body)); } fn emit_enum(&mut self, node: Node<'a>) { @@ -215,7 +242,7 @@ impl<'a> CExtractor<'a> { } /// `typedef struct {...} Name;` -> "typedef struct", fields, "} Name;". 
- fn emit_type_definition(&mut self, node: Node<'a>) { + fn emit_type_definition(&mut self, node: Node<'a>, stack: &mut Vec>) { let Some(type_node) = node.child_by_field_name("type") else { return; }; @@ -223,8 +250,12 @@ impl<'a> CExtractor<'a> { "struct_specifier" | "union_specifier" | "class_specifier" if type_node.child_by_field_name("body").is_some() => { - self.emit_record(type_node, Some(node.start_byte())); - self.emit_closer(type_node, node); + // closer pops after the record's deferred body visit + stack.push(Work::Closer { + type_node, + decl: node, + }); + self.emit_record(type_node, Some(node.start_byte()), stack); } "enum_specifier" if type_node.child_by_field_name("body").is_some() => { self.emit_enum(type_node); @@ -235,7 +266,7 @@ impl<'a> CExtractor<'a> { } /// `static struct config {...} config;` and similar declarations. - fn emit_declaration(&mut self, node: Node<'a>) { + fn emit_declaration(&mut self, node: Node<'a>, stack: &mut Vec>) { let Some(type_node) = node.child_by_field_name("type") else { return; }; @@ -258,8 +289,11 @@ impl<'a> CExtractor<'a> { if type_node.child_by_field_name("body").is_some() => { let start = line_start(self.content, node.start_byte()); - self.emit_record(type_node, Some(start)); - self.emit_closer(type_node, node); + stack.push(Work::Closer { + type_node, + decl: node, + }); + self.emit_record(type_node, Some(start), stack); } "enum_specifier" if type_node.child_by_field_name("body").is_some() => { self.emit_enum(type_node); @@ -329,7 +363,7 @@ impl<'a> CExtractor<'a> { Some(text.trim_end_matches(';').trim_end().to_string()) } - fn emit_field(&mut self, node: Node<'a>) { + fn emit_field(&mut self, node: Node<'a>, stack: &mut Vec>) { // legacy field patterns require an indented member starting its line let fstart = line_start(self.content, node.start_byte()); let indent = &self.content[fstart..node.start_byte()]; @@ -343,8 +377,12 @@ impl<'a> CExtractor<'a> { "struct_specifier" | "union_specifier" | 
"enum_specifier" | "class_specifier" ) && type_node.child_by_field_name("body").is_some() { - self.visit(type_node); - self.emit_closer(type_node, node); // e.g. `} inner;` + // closer (e.g. `} inner;`) pops after the nested record + stack.push(Work::Closer { + type_node, + decl: node, + }); + stack.push(Work::Visit(type_node)); return; } } @@ -378,7 +416,7 @@ impl<'a> CExtractor<'a> { /// Partial recovery inside ERROR regions: invalid syntax must still /// yield the obvious components instead of nothing. - fn salvage_error_region(&mut self, node: Node<'a>) { + fn salvage_error_region(&mut self, node: Node<'a>, stack: &mut Vec>) { static RECORD_HEADER_RE: LazyLock = LazyLock::new(|| Regex::new(r"^[ \t]*((?:class|struct) \w+[^\n{;]*)").unwrap()); let text = &self.content[node.byte_range()]; @@ -387,21 +425,30 @@ impl<'a> CExtractor<'a> { self.components.push(m.as_str().trim_end().to_string()); } } - // ordered pass: salvage loose function declarators, recurse the rest + // ordered pass: salvage loose function declarators, descend the rest let mut cursor = node.walk(); - for child in node.named_children(&mut cursor) { - if child.kind() == "function_declarator" { - let start = line_start(self.content, child.start_byte()); - let candidate = self.slice_clean(start, child.end_byte()); - if FUNCTION_FORM_RE.is_match(&candidate) { - let after = &self.content[child.end_byte()..]; - let mut chars = after.chars(); - if chars.next().is_some_and(char::is_whitespace) && chars.next() == Some('{') { - self.components.push(candidate); - } + let work: Vec> = node + .named_children(&mut cursor) + .map(|child| { + if child.kind() == "function_declarator" { + Work::SalvageFnDecl(child) + } else { + Work::Visit(child) } - } else { - self.visit(child); + }) + .collect(); + stack.extend(work.into_iter().rev()); + } + + /// A loose `function_declarator` found inside an ERROR region. 
+ fn salvage_fn_declarator(&mut self, child: Node<'a>) { + let start = line_start(self.content, child.start_byte()); + let candidate = self.slice_clean(start, child.end_byte()); + if FUNCTION_FORM_RE.is_match(&candidate) { + let after = &self.content[child.end_byte()..]; + let mut chars = after.chars(); + if chars.next().is_some_and(char::is_whitespace) && chars.next() == Some('{') { + self.components.push(candidate); } } } @@ -454,7 +501,7 @@ impl<'a> CExtractor<'a> { .push(self.slice_clean(start, cond.end_byte()).replace('\t', " ")); } - fn emit_template(&mut self, node: Node<'a>) { + fn emit_template(&mut self, node: Node<'a>, stack: &mut Vec>) { // slice from `template` through the inner declaration's signature let start = line_start(self.content, node.start_byte()); let mut cursor = node.walk(); @@ -464,7 +511,7 @@ impl<'a> CExtractor<'a> { if let Some(body) = child.child_by_field_name("body") { let text = self.slice_clean(start, body.start_byte()); self.components.push(text); - self.visit(body); + stack.push(Work::Visit(body)); } return; } @@ -483,9 +530,10 @@ impl<'a> CExtractor<'a> { if let Some(body) = child.child_by_field_name("body") { let text = self.slice_clean(start, body.start_byte()); self.components.push(text); + // suppression spans exactly the deferred body visit self.suppress_plain_fields += 1; - self.visit(body); - self.suppress_plain_fields -= 1; + stack.push(Work::EndSuppress); + stack.push(Work::Visit(body)); } else { let text = self.slice_clean(start, child.end_byte()); self.components.push(text); diff --git a/crates/tree_plus_core/src/extract/treesitter/go.rs b/crates/tree_plus_core/src/extract/treesitter/go.rs index 9efde45..8fe9bc8 100644 --- a/crates/tree_plus_core/src/extract/treesitter/go.rs +++ b/crates/tree_plus_core/src/extract/treesitter/go.rs @@ -31,25 +31,31 @@ fn line_start(content: &str, byte: usize) -> usize { content[..byte].rfind('\n').map(|i| i + 1).unwrap_or(0) } -fn visit(node: Node<'_>, content: &str, out: &mut Vec) { - 
let mut cursor = node.walk(); - for child in node.named_children(&mut cursor) { - match child.kind() { +/// Depth-first via an explicit stack: AST depth is input-controlled, and +/// deep expression nesting must not overflow small worker-thread stacks. +fn visit(root: Node<'_>, content: &str, out: &mut Vec) { + let mut stack = vec![root]; + while let Some(node) = stack.pop() { + match node.kind() { "type_declaration" => { - let mut tcursor = child.walk(); - for spec in child.named_children(&mut tcursor) { + let mut tcursor = node.walk(); + for spec in node.named_children(&mut tcursor) { if spec.kind() == "type_spec" { emit_type_spec(spec, content, out); } } } "function_declaration" | "method_declaration" => { - emit_func(child, content, out); - if let Some(body) = child.child_by_field_name("body") { - visit(body, content, out); + emit_func(node, content, out); + if let Some(body) = node.child_by_field_name("body") { + stack.push(body); } } - _ => visit(child, content, out), + _ => { + let mut cursor = node.walk(); + let children: Vec> = node.named_children(&mut cursor).collect(); + stack.extend(children.into_iter().rev()); + } } } } diff --git a/crates/tree_plus_core/src/extract/treesitter/python.rs b/crates/tree_plus_core/src/extract/treesitter/python.rs index 57c39fe..9416412 100644 --- a/crates/tree_plus_core/src/extract/treesitter/python.rs +++ b/crates/tree_plus_core/src/extract/treesitter/python.rs @@ -57,7 +57,7 @@ pub fn extract(content: &str) -> ExtractResult { pending_decorators: Vec::new(), in_class: false, }; - extractor.visit(tree.root_node()); + extractor.run(tree.root_node()); Ok(extractor.components) } @@ -99,33 +99,40 @@ impl<'a> PyExtractor<'a> { } } - fn visit(&mut self, node: Node<'a>) { - match node.kind() { - "decorated_definition" => { - let mut cursor = node.walk(); - for child in node.named_children(&mut cursor) { - if child.kind() == "decorator" { - self.collect_decorator(child); - } else { - self.visit(child); + /// Depth-first via an 
explicit stack: AST depth is input-controlled + /// (deeply nested expressions), and extraction runs on small + /// worker-thread stacks. + fn run(&mut self, root: Node<'a>) { + let mut stack = vec![root]; + while let Some(node) = stack.pop() { + match node.kind() { + "decorated_definition" => { + // decorators collect now; the definition pops next + let mut cursor = node.walk(); + let mut deferred: Vec> = Vec::new(); + for child in node.named_children(&mut cursor) { + if child.kind() == "decorator" { + self.collect_decorator(child); + } else { + deferred.push(child); + } } + stack.extend(deferred.into_iter().rev()); } - } - "function_definition" => self.handle_function(node), - "class_definition" => self.handle_class(node), - "expression_statement" => { - let mut cursor = node.walk(); - for child in node.named_children(&mut cursor) { - if child.kind() == "assignment" { - self.handle_assignment(child); + "function_definition" => self.handle_function(node, &mut stack), + "class_definition" => self.handle_class(node, &mut stack), + "expression_statement" => { + let mut cursor = node.walk(); + for child in node.named_children(&mut cursor) { + if child.kind() == "assignment" { + self.handle_assignment(child); + } } } - } - // expressions cannot contain statements; recursion elsewhere is safe - _ => { - let mut cursor = node.walk(); - for child in node.named_children(&mut cursor) { - self.visit(child); + _ => { + let mut cursor = node.walk(); + let children: Vec> = node.named_children(&mut cursor).collect(); + stack.extend(children.into_iter().rev()); } } } @@ -153,7 +160,7 @@ impl<'a> PyExtractor<'a> { .push(self.content[indent_start..node.end_byte()].to_string()); } - fn handle_function(&mut self, node: Node<'a>) { + fn handle_function(&mut self, node: Node<'a>, stack: &mut Vec>) { let body = node.child_by_field_name("body"); let is_async = node.child(0).map(|c| c.kind() == "async").unwrap_or(false); if !is_async { @@ -163,7 +170,7 @@ impl<'a> PyExtractor<'a> { } } if 
let Some(body) = body { - self.visit(body); + stack.push(body); } } @@ -208,14 +215,14 @@ impl<'a> PyExtractor<'a> { Some(cleaned) } - fn handle_class(&mut self, node: Node<'a>) { + fn handle_class(&mut self, node: Node<'a>, stack: &mut Vec>) { let body = node.child_by_field_name("body"); if let Some(component) = self.build_class_signature(node) { self.emit_with_decorators(component); self.in_class = true; } if let Some(body) = body { - self.visit(body); + stack.push(body); } } diff --git a/crates/tree_plus_core/src/extract/treesitter/rust.rs b/crates/tree_plus_core/src/extract/treesitter/rust.rs index 3befc1d..d83006a 100644 --- a/crates/tree_plus_core/src/extract/treesitter/rust.rs +++ b/crates/tree_plus_core/src/extract/treesitter/rust.rs @@ -100,55 +100,58 @@ pub fn extract(content: &str, syntax: bool) -> ExtractResult { Ok(components) } -fn visit(node: Node<'_>, content: &str, syntax: bool, out: &mut Vec) { - let mut cursor = node.walk(); - for child in node.named_children(&mut cursor) { - match child.kind() { +/// Depth-first via an explicit stack: AST depth is input-controlled, and +/// deep expression nesting must not overflow small worker-thread stacks. 
+fn visit(root: Node<'_>, content: &str, syntax: bool, out: &mut Vec) { + let mut stack = vec![root]; + while let Some(node) = stack.pop() { + match node.kind() { "function_item" | "function_signature_item" => { - if let Some(c) = format_fn(child, content) { + if let Some(c) = format_fn(node, content) { out.push(c); } // nested items inside fn bodies still match - visit(child, content, syntax, out); + descend(node, &mut stack); } "struct_item" => { - if let Some(c) = format_struct_impl(child, content, false) { + if let Some(c) = format_struct_impl(node, content, false) { out.push(c); } } "impl_item" => { - if let Some(c) = format_struct_impl(child, content, true) { + if let Some(c) = format_struct_impl(node, content, true) { out.push(c); } - visit(child, content, syntax, out); + descend(node, &mut stack); } "enum_item" => { - if let Some(c) = format_enum(child, content, syntax) { - out.push(c); - } - } - "trait_item" => { - if let Some(c) = format_trait_mod(child, content) { + if let Some(c) = format_enum(node, content, syntax) { out.push(c); } - visit(child, content, syntax, out); } - "mod_item" => { - if let Some(c) = format_trait_mod(child, content) { + "trait_item" | "mod_item" => { + if let Some(c) = format_trait_mod(node, content) { out.push(c); } - visit(child, content, syntax, out); + descend(node, &mut stack); } "macro_definition" => { - if let Some(c) = format_macro(child, content) { + if let Some(c) = format_macro(node, content) { out.push(c); } } - _ => visit(child, content, syntax, out), + _ => descend(node, &mut stack), } } } +/// Push `node`'s named children so they pop in source order. +fn descend<'t>(node: Node<'t>, stack: &mut Vec>) { + let mut cursor = node.walk(); + let children: Vec> = node.named_children(&mut cursor).collect(); + stack.extend(children.into_iter().rev()); +} + /// Start of the line containing `byte` (offset just after the previous `\n`). 
fn line_start(content: &str, byte: usize) -> usize { content[..byte].rfind('\n').map(|i| i + 1).unwrap_or(0) diff --git a/crates/tree_plus_core/src/extract/treesitter/typescript.rs b/crates/tree_plus_core/src/extract/treesitter/typescript.rs index 55dd887..dfd76ee 100644 --- a/crates/tree_plus_core/src/extract/treesitter/typescript.rs +++ b/crates/tree_plus_core/src/extract/treesitter/typescript.rs @@ -51,7 +51,7 @@ pub fn extract(content: &str, tsx: bool) -> ExtractResult { content, components: Vec::new(), }; - extractor.visit(tree.root_node()); + extractor.run(tree.root_node()); Ok(extractor.components) } @@ -60,6 +60,13 @@ struct TsExtractor<'a> { components: Vec, } +/// Push `node`'s named children so they pop in source order. +fn push_children_rev<'t>(node: Node<'t>, stack: &mut Vec>) { + let mut cursor = node.walk(); + let children: Vec> = node.named_children(&mut cursor).collect(); + stack.extend(children.into_iter().rev()); +} + fn line_start(content: &str, byte: usize) -> usize { content[..byte].rfind('\n').map(|i| i + 1).unwrap_or(0) } @@ -121,12 +128,22 @@ impl<'a> TsExtractor<'a> { .to_string() } - fn visit(&mut self, node: Node<'a>) { + /// Depth-first via an explicit stack: AST depth is input-controlled + /// (deeply nested expressions), and extraction runs on small + /// worker-thread stacks. 
+ fn run(&mut self, root: Node<'a>) { + let mut stack = vec![root]; + while let Some(node) = stack.pop() { + self.visit(node, &mut stack); + } + } + + fn visit(&mut self, node: Node<'a>, stack: &mut Vec>) { match node.kind() { "class_declaration" | "abstract_class_declaration" => { self.emit_class_like(node); if let Some(body) = node.child_by_field_name("body") { - self.visit(body); + stack.push(body); } } "interface_declaration" => { @@ -144,7 +161,7 @@ impl<'a> TsExtractor<'a> { "function_declaration" | "generator_function_declaration" => { self.emit_function(node); if let Some(body) = node.child_by_field_name("body") { - self.visit(body); + stack.push(body); } } "function_expression" | "generator_function" => { @@ -152,13 +169,13 @@ impl<'a> TsExtractor<'a> { self.emit_function(node); } if let Some(body) = node.child_by_field_name("body") { - self.visit(body); + stack.push(body); } } "method_definition" | "method_signature" | "abstract_method_signature" => { self.emit_signature_to_params_or_return(node); if let Some(body) = node.child_by_field_name("body") { - self.visit(body); + stack.push(body); } } "arrow_function" => { @@ -166,7 +183,7 @@ impl<'a> TsExtractor<'a> { self.emit_arrow(node); } if let Some(body) = node.child_by_field_name("body") { - self.visit(body); + stack.push(body); } } "expression_statement" => { @@ -177,10 +194,7 @@ impl<'a> TsExtractor<'a> { { self.maybe_emit_bare_call(node, call); } - let mut cursor = node.walk(); - for child in node.named_children(&mut cursor) { - self.visit(child); - } + push_children_rev(node, stack); } "variable_declarator" => { // object scope: `const name = {` when the object holds functions @@ -192,14 +206,11 @@ impl<'a> TsExtractor<'a> { self.components .push(self.content[start..=brace].trim_end().to_string()); } - self.visit(value); + stack.push(value); } } _ => { - let mut cursor = node.walk(); - for child in node.named_children(&mut cursor) { - self.visit(child); - } + push_children_rev(node, stack); } } } 
diff --git a/crates/tree_plus_core/src/walk.rs b/crates/tree_plus_core/src/walk.rs index c519409..c4d4db7 100644 --- a/crates/tree_plus_core/src/walk.rs +++ b/crates/tree_plus_core/src/walk.rs @@ -14,8 +14,26 @@ use crate::ignore::{ use crate::model::{Category, TreePlus}; use crate::sort::os_sort_key; +/// Dedicated rayon pool with large worker stacks. Work-stealing nests +/// `from_folder` frames on a single worker while it waits in `join`, and +/// tree-sitter's C parser needs headroom below that; rayon's default 2 MiB +/// worker stacks segfault on big trees (e.g. torvalds/linux `drivers/`). +fn pool() -> &'static rayon::ThreadPool { + static POOL: std::sync::LazyLock = std::sync::LazyLock::new(|| { + rayon::ThreadPoolBuilder::new() + .stack_size(16 * 1024 * 1024) + .build() + .expect("build worker pool") + }); + &POOL +} + /// Build a `TreePlus` from seed path/glob strings (legacy `from_seeds`). pub fn from_seeds(seeds: &[String], config: &TreePlusConfig) -> TreePlus { + pool().install(|| from_seeds_inner(seeds, config)) +} + +fn from_seeds_inner(seeds: &[String], config: &TreePlusConfig) -> TreePlus { let mut seeds: Vec = if seeds.is_empty() { vec![std::env::current_dir() .map(|p| p.to_string_lossy().into_owned()) diff --git a/crates/tree_plus_core/tests/robustness.rs b/crates/tree_plus_core/tests/robustness.rs index 79bb957..64e42dd 100644 --- a/crates/tree_plus_core/tests/robustness.rs +++ b/crates/tree_plus_core/tests/robustness.rs @@ -84,3 +84,60 @@ fn invalid_utf8_yields_no_components() { let path = write_temp("invalid.py", &[0xC3, 0x28, b'\n', b'd', b'e', b'f', b' ']); assert!(extract_components(&path, false).is_empty()); } + +/// Deep ASTs must not overflow small (rayon-sized) thread stacks. 
+/// +/// Regression: arch/x86/kernel/cpu/microcode/intel-ucode-defs.h in +/// torvalds/linux is a headerless initializer-list fragment; tree-sitter +/// parses it as a deeply nested ERROR tree, and the recursive extractor +/// walk aborted with a stack overflow inside the rayon pool. +#[test] +fn deep_nesting_never_overflows_worker_stacks() { + let cases: Vec<(&str, String)> = vec![ + ( + "h", + "{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = 1, .family = 0x6 },\n".repeat(5_000), + ), + ( + "c", + format!("int x = {}1{};\n", "(".repeat(20_000), ")".repeat(20_000)), + ), + ( + "py", + format!("x = {}{}\n", "[".repeat(20_000), "]".repeat(20_000)), + ), + ( + "rs", + format!( + "fn f() {{ let x = {}1{}; }}\n", + "(".repeat(20_000), + ")".repeat(20_000) + ), + ), + ( + "ts", + format!("const x = {}{};\n", "[".repeat(20_000), "]".repeat(20_000)), + ), + ( + "go", + format!( + "func f() {{\n\tx := {}1{}\n}}\n", + "(".repeat(20_000), + ")".repeat(20_000) + ), + ), + ]; + for (ext, content) in cases { + let path = write_temp(&format!("deep.{ext}"), content.as_bytes()); + // rayon workers default to 2 MiB stacks; use a harsher 512 KiB + let handle = std::thread::Builder::new() + .stack_size(512 * 1024) + .spawn(move || { + let _ = extract_components(&path, false); // must not abort + }) + .unwrap(); + handle + .join() + .unwrap_or_else(|_| panic!("deep .{ext} input crashed the extractor")); + } +} diff --git a/docs/performance.md b/docs/performance.md index b273e28..623574f 100644 --- a/docs/performance.md +++ b/docs/performance.md @@ -31,9 +31,35 @@ hyperfine './target/release/tree_plus /path/to/repo > /dev/null' \ 'python tree_plus_cli.py /path/to/repo > /dev/null' ``` -A Linux-kernel-sized tree was not available locally; the command above is -the reproducible benchmark for one (also: -`TREE_PLUS_BENCH_PATH=/path/to/linux cargo bench -p tree_plus_core`). 
+## Linux kernel (torvalds/linux @ ~6.15, 101,136 files, 44.77M lines) + +Single runs (`/usr/bin/time -v`), page cache warm: + +| Invocation | Rust wall | Peak RSS | Notes | +|---|---|---|---| +| `tprs -c linux` | 0.50 s | 178 MB | 6,285 folders, 101,136 files, 44.77M lines, 409.2M tokens | +| `tprs linux` | 12.4 s | 1.5 GB | full extraction; 8.42M-line render (598 MB of output) | +| `tprs linux/kernel` | 0.23 s | — | 711 files, 553,855 lines; Python: 12.6 s → **~56×** | + +Counts (lines, tokens) agree byte-for-byte with the legacy Python CLI on +`linux/kernel`. Peak RSS on the full run is dominated by the final render +string (~600 MB) plus the assembled tree, not by extraction. + +The kernel run also surfaced two stack-overflow bugs, both fixed: + +1. Extractor visitors recursed on AST depth. + `arch/x86/kernel/cpu/microcode/intel-ucode-defs.h` is a headerless + initializer-list fragment that tree-sitter parses as a deeply nested + ERROR tree; the recursive walk blew rayon's 2 MiB worker stacks. All + tree-sitter extractors now traverse with explicit heap stacks + (regression test: `deep_nesting_never_overflows_worker_stacks`). +2. Rayon work-stealing nests `from_folder` frames on a worker while it + waits in `join`, leaving too little headroom for tree-sitter's C + frames on big trees (`drivers/` segfaulted). Extraction now runs in a + dedicated pool with 16 MiB worker stacks. + +Also reproducible via +`TREE_PLUS_BENCH_PATH=/path/to/linux cargo bench -p tree_plus_core`. 
 ## Single-file extraction latency (criterion, mean)

From 99c3117968c89c8ac141c22d2674f24eca3c07e9 Mon Sep 17 00:00:00 2001
From: Bion Howard
Date: Tue, 9 Jun 2026 18:25:13 -0400
Subject: [PATCH 8/8] docs: record full Linux-kernel Python baseline (10m02s / 6.7GB vs 12.4s / 1.5GB)

Co-Authored-By: Claude Fable 5
---
 docs/performance.md | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/docs/performance.md b/docs/performance.md
index 623574f..4348648 100644
--- a/docs/performance.md
+++ b/docs/performance.md
@@ -41,9 +41,17 @@ Single runs (`/usr/bin/time -v`), page cache warm:
 | `tprs linux` | 12.4 s | 1.5 GB | full extraction; 8.42M-line render (598 MB of output) |
 | `tprs linux/kernel` | 0.23 s | — | 711 files, 553,855 lines; Python: 12.6 s → **~56×** |

-Counts (lines, tokens) agree byte-for-byte with the legacy Python CLI on
-`linux/kernel`. Peak RSS on the full run is dominated by the final render
-string (~600 MB) plus the assembled tree, not by extraction.
+The legacy Python CLI on the full tree: **10 min 2 s, 6.7 GB RSS** — and
+its render timed out internally, so it printed only the stats footer,
+while the Rust run includes the full 8.42M-line render. End-to-end that
+is **~48×** faster with ~4.5× less memory.
+
+Counts agree byte-for-byte with the legacy CLI on `linux/kernel`. On the
+full tree they drift by 0.0005% (Python +208 lines, Rust +6,373 tokens of
+409M) from a handful of invalid-UTF-8 kernel fixtures, consistent with
+the documented decoding differences. Peak RSS on the Rust run is
+dominated by the final render string (~600 MB) plus the assembled tree,
+not by extraction.

 The kernel run also surfaced two stack-overflow bugs, both fixed: