CHANGELOG.md: +10 (10 additions & 0 deletions)
@@ -2,6 +2,16 @@

 All changes that impact users of this module are documented in this file, in the [Common Changelog](https://common-changelog.org) format with some additional specifications defined in the CONTRIBUTING file. This codebase adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

+## Unreleased [minor]
+
+> Development of this release was supported by the [European Commission](https://commission.europa.eu/) for its [VLOPs/VLOSEs instance](https://code.europa.eu/dsa/terms-and-conditions-database/vlops-and-vloses/).
+
+### Added
+
+- Add proxy support for fetching documents behind firewalls or restricted networks; configure using `HTTP_PROXY` and `HTTPS_PROXY` (or `http_proxy` and `https_proxy`) environment variables
+- Add debugging options to disable headless mode for visual troubleshooting during development; set `OTA_ENGINE_FETCHER_NO_HEADLESS=1` to show browser window
+- Add sandbox control for improved compatibility with Docker and containerized environments; set `OTA_ENGINE_FETCHER_NO_SANDBOX=1` when running in containers
+
 ## 9.1.2 - 2025-10-30

 _Full changeset and discussions: [#1199](https://github.com/OpenTermsArchive/engine/pull/1199)._
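
The three environment variables introduced in this changelog entry could plausibly be mapped to browser launch options as sketched below. This is an illustration only, not the engine's actual code: `buildLaunchOptions` is a hypothetical helper, the strict comparison with `'1'` and the lookup order of the proxy variables are assumptions, and collapsing the HTTP and HTTPS proxies into a single Chromium `--proxy-server` switch is a simplification.

```javascript
// Hypothetical helper (not part of the engine): derives launch options
// from the environment variables named in the changelog entry.
function buildLaunchOptions(env = process.env) {
  const args = [];

  // Assumption: the first defined variable wins, HTTPS before HTTP,
  // uppercase before lowercase.
  const proxy = env.HTTPS_PROXY || env.https_proxy || env.HTTP_PROXY || env.http_proxy;

  if (proxy) {
    args.push(`--proxy-server=${proxy}`); // Chromium command-line switch
  }

  // Assumption: only the literal value '1' enables the flag.
  if (env.OTA_ENGINE_FETCHER_NO_SANDBOX === '1') {
    args.push('--no-sandbox', '--disable-setuid-sandbox');
  }

  return {
    headless: env.OTA_ENGINE_FETCHER_NO_HEADLESS !== '1', // show the window when set to 1
    args,
  };
}

// Possible usage, assuming Puppeteer is installed:
// const browser = await puppeteer.launch(buildLaunchOptions());
```

Keeping the mapping in a pure function that takes `env` as a parameter makes it possible to check the behaviour without launching a browser.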
CONTRIBUTING.md: +36 -1 (36 additions & 1 deletion)
@@ -8,6 +8,7 @@ First of all, thanks for taking the time to contribute! 🎉👍
 - [Commit messages](#commit-messages)
 - [Changelog](#changelog)
 - [Development](#development)
+- [Configuration and environment variables](#configuration-and-environment-variables)
 - [Documentation](#documentation)
 - [Naming](#naming)
 - [Instances and repositories](#instances-and-repositories)
@@ -75,7 +76,7 @@ Changes that require an adjustment in the infrastructure, they are considered as

 4. Since each release is produced automatically from a single pull request, the [notice](https://common-changelog.org/#23-notice) links to the source pull request rather than [references](https://common-changelog.org/#242-references), which would always reference the same pull request. References can link to relevant parts of an RFC, decision record, or diff. **This notice is automatically generated by the CI during the release process and should not be added manually.**

-5. The [notice](https://common-changelog.org/#23-notice) is also used to present sponsor information and it is required. Since the development of this project is funded by different actors, and following discussions with sponsors, financial contributions are acknowledged in the changelog itself. The format of the notice thus diverges from the Common Changelog specification in that it is not “a single-sentence paragraph”. Sponsor information is in quote format, starts with “Development of this release was supported by <funding_from>”, and provides the name and link to the sponsor, as well as information on the specific funding instrument, as specified by the sponsor itself or as required by law. A short message from the sponsor might also be added, as long as it abides by the community’s [Code of Conduct](./CODE_OF_CONDUCT.md) and aligns with the project’s goals. For volunteer contributions, the sentence should start with: “Development of this release was made on a volunteer basis by <contributor_name>”
+5. The [notice](https://common-changelog.org/#23-notice) is also used to present sponsor information and it is required. Since the development of this project is funded by different actors, as a matter of transparency and recognition, financial contributions and contributions supported by employers are acknowledged in the changelog itself. The format of the notice thus diverges from the Common Changelog specification in that it is not “a single-sentence paragraph”. Sponsor information is in quote format, starts with “Development of this release was supported by <funding_from>”, and provides the name and link to the sponsor, as well as information on the specific funding instrument, as specified by the sponsor itself or as required by law. A short message from the sponsor might also be added, as long as it abides by the community’s [Code of Conduct](./CODE_OF_CONDUCT.md) and aligns with the project’s goals. For volunteer contributions, the sentence should start with: “Development of this release was made on a volunteer basis by <contributor_name>”

 #### Changes that do not impact users

@@ -91,6 +92,40 @@ This content will be automatically deleted by the CI after merging.

 ## Development

+### Configuration and environment variables
+
+The choice between environment variables and configuration files should be made based on the nature of the data and how it will be used.
+
+**Use environment variables for:**
+
+- Secrets: API keys, passwords, tokens, or any sensitive data that should not be committed to version control. Examples:
+  - `OTA_ENGINE_GITHUB_TOKEN`: GitHub API token for creating issues and managing repositories
+  - `OTA_ENGINE_SMTP_PASSWORD`: password for SMTP server authentication
+- Debugging flags: toggles for development features. Examples:
+  - `OTA_ENGINE_FETCHER_NO_HEADLESS`: disables headless mode in Puppeteer to show the browser window during fetching
+- Unix standards: system-level settings following Unix conventions. Examples:
+  - `HTTP_PROXY`, `HTTPS_PROXY`, `http_proxy`, `https_proxy`: proxy server configuration for HTTP/HTTPS requests
+- Runtime overrides: container-specific or deployment-specific settings that vary between environments. Examples:
+  - `OTA_ENGINE_FETCHER_NO_SANDBOX`: disables Chrome sandbox (required in some Docker environments)
+
+**Use configuration files for:**
+
+- Engine behavior: Core functionality settings that define how the application operates. Examples:
+  - `trackingSchedule`: Cron expression defining when to track terms (e.g., `"30 */12 * * *"` for every 12 hours)
+  - `fetcher.language`: Language code for Accept-Language header in HTTP requests
+- Service settings: External service endpoints and integration parameters. Examples:
+  - `versionsRepositoryURL`: URL of the GitHub repository storing document versions
+  - `logger.smtp.host`: SMTP server hostname for sending error notifications
+- Static infrastructure: Deployment-independent paths and identifiers. Examples:
+  - `recorder.versions.storage.git.path`: File system path where Git repository for versions is stored
+  - `recorder.versions.storage.git.author`: Git commit author name and email for automated commits
+
+When uncertain whether to use an environment variable or a configuration file, consider:
+
+- Does it contain sensitive information? → Environment variable
+- Should it be version-controlled and reviewed? → Configuration file
+- Is it a stable setting that defines application behavior? → Configuration file
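
The split described in this guideline can be illustrated with a short sketch in which the SMTP host comes from version-controlled configuration and the SMTP password comes from the environment. Everything here is hypothetical: the `config` object stands in for a parsed configuration file, its values are invented for illustration, and `getSetting` and `buildSmtpSettings` are not functions of the engine.

```javascript
// Hypothetical stand-in for a parsed configuration file; values are illustrative.
const config = {
  trackingSchedule: '30 */12 * * *', // cron: minute 30 of hours 0 and 12
  fetcher: { language: 'en' },
  logger: { smtp: { host: 'smtp.example.org' } },
};

// Resolve a dotted key such as 'logger.smtp.host' against a nested object.
// Returns undefined when any segment of the path is missing.
function getSetting(source, path) {
  return path
    .split('.')
    .reduce((node, key) => (node == null ? undefined : node[key]), source);
}

// Combine both sources: stable settings from configuration, secrets from the environment.
function buildSmtpSettings(configuration, env = process.env) {
  return {
    host: getSetting(configuration, 'logger.smtp.host'), // reviewed and version-controlled
    password: env.OTA_ENGINE_SMTP_PASSWORD, // secret, never committed
  };
}
```

With this shape, the configuration object can be committed and reviewed as a whole, while the secret only ever exists in the process environment.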
await client.send('Network.clearBrowserCookies'); // Clear cookies to ensure clean state between fetches and prevent session persistence across different URLs
response = await page.goto(url, { waitUntil: 'load' }); // Using `load` instead of `networkidle0` as it's more reliable and faster. The 'load' event fires when the page and all its resources (stylesheets, scripts, images) have finished loading. `networkidle0` can be problematic as it waits for 500ms of network inactivity, which may never occur on dynamic pages and then triggers a navigation timeout.

if (!response) {
@@ -86,7 +92,34 @@ export async function launchHeadlessBrowser() {