- Composer update.
- Add identifier blacklist for NGR source catalog.
- Remove Nijmegen source catalog from DonlSync.
- Fix Stelselcatalogus source catalog so that data schemas are not stored as a new distribution when synced.
- Downgraded the `composer.lock` file to satisfy the PHP 7.4 minimum requirement.
- Updated the API endpoint for the Nijmegen source catalog.
- Dropped the unique index on the UnmappedValues table, as text columns cannot be part of an index in some RDBMSs. This constraint was already enforced throughout the codebase, so it introduces no functional difference.
- The `dataset.relatedResource` property harvested from the Eindhoven source catalog is now processed by the `StringHelper::repairURL` method before it is offered to the dataset builder.
- If a string given to the `StringHelper::repairURL` method contains a `|` character, then everything up to and including that character is removed from the string before performing any other repairs.
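
The pipe handling described above could look roughly like the sketch below. This is not the actual `StringHelper::repairURL` implementation: the function name `stripUpToPipe` is hypothetical, and cutting at the first `|` (rather than the last) is an assumption, since the entry does not say which one is used when a string contains several.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-in for the pipe-handling step of a URL repair routine.
 * Assumption: the cut is made at the FIRST pipe character.
 */
function stripUpToPipe(string $url): string
{
    $position = strpos($url, '|');

    // No pipe present: leave the string untouched.
    if (false === $position) {
        return $url;
    }

    // Drop everything up to and including the pipe.
    return (string) substr($url, $position + 1);
}
```

For example, `stripUpToPipe('Download|https://example.org/data.csv')` yields the URL without the leading label, after which any other repairs would run.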
- The Stelselcatalogus (SC) source catalog is now included in the manual and scheduled executions of DonlSync.
- Updated the Stelselcatalogus source catalog to only harvest datasets which have at least 1 distribution.
- Updated the Nijmegen and Stelselcatalogus source catalogs to repair any harvested `distribution.accessURL` properties before the datasets are offered to the DatasetBuilder.
- Updated the `StringHelper::repairURL` method so that it can repair several new cases of bad URLs.
- Updated the `.env.dist` file to include the CKAN credentials for the Stelselcatalogus source catalog. This appears to have been omitted in version `4.5.0`.
- DonlSync will now register missing mappings during the harvesting process. A mapping is considered missing when the harvested value is equal to the effective value after applying all the mappings and the effective value is not valid according to the DCAT validation model. The missing mappings are included in the `ZIP` archive of the scheduled execution in a `{source catalog}__unmapped__{date}.log` file. Entries in this file are formatted as `{object}, {attribute}, {value}`.
Note: This version requires an update to the DonlSync database. Run `php DonlSync InstallDatabase` to perform these database updates.
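
The missing-mapping rule and log entry format described above can be sketched as follows. Both function names are hypothetical and do not come from the DonlSync codebase. The DCAT validation itself is out of scope here, so its result is passed in as a boolean.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical check: a mapping counts as missing when no mapping changed
 * the harvested value AND the resulting value failed DCAT validation.
 */
function isMissingMapping(string $harvested, string $effective, bool $isValid): bool
{
    return $harvested === $effective && !$isValid;
}

/**
 * Formats one log entry as "{object}, {attribute}, {value}".
 */
function formatUnmappedEntry(string $object, string $attribute, string $value): string
{
    return sprintf('%s, %s, %s', $object, $attribute, $value);
}
```

A value that a mapping has changed, or that is already valid, would not be registered under this rule.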
- Fix for Eindhoven source catalog for when a dataschema distribution was being created for a dataset without explicit language information.
- Expanded the CKAN keyword transformer to additionally strip any forward slashes from the keyword.
- Changed the restart policy of the local `docker-compose.yml` to "no".
- Convert HTML to Markdown in `description` fields (in datasets and distributions) of Eindhoven datasets.
- Change implementation of determination of `dataschema` distributions.
- Let Stelsel Catalogus (SC) use its own defaults mapping.
- Add the "pseudo-harvesting" of Stelsel Catalogus (SC) datasets. This means that existing SC datasets are harvested and corresponding begrippen (concepts) and gegevenselementen (data schemas) are harvested/updated while doing so.
- Harvesting the bounding box metadata from NGR can now be enabled/disabled via a key in the `catalog_NGR.json` configuration file.
- Updated the endpoint of the Nijmegen catalog API and updated the query for retrieving the Nijmegen datasets from the API.
- Added value list table for attributes in `FeatureCatalogue` of NGR datasets.
- Added support for `Datasetschema` of Eindhoven datasets. A `Datasetschema` of an Eindhoven dataset describes the content of a dataset. It is transformed to a distribution.
- Added support for `FeatureCatalogue` of NGR datasets. A `FeatureCatalogue` of an NGR dataset describes the content of a dataset. It is transformed to a distribution.
- Disabled PersistentProperties of the `DONLTargetCatalog`. This feature is currently bugged and prevents certain resources from being sent to CKAN.
- The NGR source catalog will now harvest graphics as DCAT Distributions with type Visualization.
- Moved the NGR method to repair URLs to an application-wide class so that it can be reused for other source catalogs. The new method includes several new edge-cases to repair and is backed by several unit tests.
- Support storing the computed metadata checksum in CKAN for later retrieval.
- Minimum PHP version raised to `7.4`. Several parts of the codebase were updated to use the new features introduced in this PHP version, such as class property type-hinting.
- Introduced support for `PHP 8.0`.
- Included a new source catalog 'Eindhoven'. This source catalog harvests the data.eindhoven.nl catalog.
- Refactored all database interactions such that different RDBMSs can be used. There is no longer a hard requirement on MySQL.
- The PHPDoc of class properties was expanded to include a description of the property.
- Updated the various Composer dependencies.
- Included a `docker-compose.yml` file for local development.
- All shell scripts were moved to the `./bin` directory.
- The `Application` god-object now has an accompanying interface `ApplicationInterface`.
- Optimized the procedure for comparing the harvested dataset to the dataset on the catalog. Only a single `package_show` API call will now be used for multiple comparisons (`persistent_properties` and `resource.id` checks).
- The `NGRSourceCatalog` is now capable of harvesting geo and temporal metadata from the NGR source catalog.
- The fallback license has been updated to `licentieonbekend` to better describe the understanding of the license metadata.
- Added Docker support.
- Updated several Composer dependencies.
- Set executable bits on the `*.sh` files in the `shell/` directory.
- Updated Gitlab CI pipeline to perform level 6 static analysis. All code is now expected to pass level 6.
- Updated typehints of the various `*BuildRule` classes to ensure proper type-hinting throughout the application.
- Updated several `array` typehints to more accurately describe the contents of said `array`.
- Reduced code duplication in the various `*BuildRule` implementations.
- Renamed `DCATDistributionBuildRule` to `DONLDistributionBuildRule` so as to properly indicate which type of `DCATEntity` is being built by the `BuildRule`.
- Explicit `int` to `string` conversion while writing output.
- Fixed a bug that prevented the recognition of pre-existing resources.
- Updated `Composer` dependencies.
- Increased UnitTest coverage of:
  - `DonlSync\Catalog\Target\DONL`
- The output of `Application::version()` is now only computed once per execution.
- `SendLogsCommand` now throws a `DonlSyncRuntimeException` when adding a recipient fails.
- `DateTimer` now throws a `DonlSyncRuntimeException` in the following cases:
  - When trying to end a timer that hasn't started yet.
  - When the `DateTimeFormat` is invalid.
  - When the given configuration is missing keys.
- Several typehints have been updated to indicate that they can contain `null` values.
- `DONLTargetCatalog` now throws a `DonlSyncRuntimeException` when not all credentials are provided to methods that require credentials.
- Updated `Composer` dependencies.
- Updated Gitlab CI pipeline to include several analysis tools.
- Increased UnitTest coverage of:
  - `DonlSync\Helper`
  - `DonlSync`
- Removed unnecessary `file_exists()` call in `Configuration::createFromJSONFile()` as `is_readable()` already covers that case.
- `DateTimer` now throws `DonlSyncRuntimeException`s on any `DateTime` related error.
- Increased UnitTest coverage of:
  - `DonlSync`
- Fixed a bug that prevented an empty JSON list from being accepted by `MappingLoader::loadJSONContentsFromURL()`.
- `BuilderConfiguration::getDefaults()` can now properly return `null` when no `DefaultMapper` is assigned.
- Updated several `composer` dependencies.
- Updated PHP-CS-Fixer config to include `test/`.
- Updated `phpunit.xml.dist` to PHPUnit 9.3's XSD.
- Increased UnitTest coverage of:
  - `DonlSync\Command`
  - `DonlSync\Dataset`
  - `DonlSync\Dataset\Builder`
  - `DonlSync\Dataset\Mapping`
  - `DonlSync\Helper`
- Added Gitlab CI integration.
- The `Summarizer` now maintains and stores a summary on a per-catalog basis.
- The email template will now display the import summary per catalog rather than only the total summary.
- The SendLogs command will now throw a `DonlSyncRuntimeException` if the summary file cannot be found.
- Slight tuning to the API call used to retrieve all the dataset IDs from the NGR catalog. Results are now explicitly sorted and facets are no longer included in the response.
- Updated README.md to include requirements for the configured MySQL user.
- Fixed shell scripts for systems which do not have `TMPDIR` defined as an environment variable.
- Removed catalog URI from ExecutionMessage format.
- Dataset Identifier conflicts are now registered as an ExecutionMessage of the import. They will now be shown as part of the daily email summary as a result.
- Introduced database patch file to upgrade from version `2.0` to version `3.x`.
- Added shell script `log_cleanup.sh`, which periodically cleans up old logs from the scheduled executions, and updated the installation instructions to include this new script.
- Introduced `CHANGELOG.md` to track changes between versions.
- Introduced `CONTRIBUTING.md` to provide guidelines on how to contribute.
- Dataset rejections by the target catalog are included in the daily email summary.
- Increased minimum PHP version to `7.3`.
- Updated several `composer` dependencies to newer versions.
- Updated codestyle guidelines and enforced them throughout the entire codebase.
- Introduced shell scripts to execute DonlSync manually and in a scheduled manner. The old shell scripts specific for each environment were removed as they are now obsolete.
- Moved all 'sensitive' configuration values to `.env` and ensured Git will not track this file. A `.env.dist` file is tracked which contains the placeholder values for the `.env` file.
- Introduced a `post-install-cmd` script to generate all the files (and directories) which are not tracked by Git.
- Updated email summary to include more data about the daily execution. It now includes the total number of processed datasets.
- Email summary updated to an HTML message. The message body is now generated using the PHP Blade engine (https://github.com/jenssegers/blade).
- Removed support for multiple environments. The `.env` file now dictates the environment used. The environment dictated in the `.env` file represents the environment of DonlSync itself and the environment of the target catalog. These environments should always be identical. In order to target more than 1 environment it is now necessary to install this project multiple times (1 for each environment).
- Merged the codebases for the CBS and CBSDerden catalogs. These catalogs are run by the same software. As such, a single `ISourceCatalog` implementation called `ODataCatalog` now represents both catalogs. This implementation now serves either CBS or CBSDerden based on the configuration injected.
- CBS and CBSDerden `Distribution.AccessURL` properties are now based on a custom buildrule which generates the appropriate value based on the mappings provided by CBS. The 'old' `AccessURL` has been moved to the `DownloadURL` property of the same `Distribution`.
- Moved all ODataCatalog XPath selectors to the configuration file of the source catalog. It is now possible to define multiple XPath selectors for a single property. The `ODataMetadataExtractor` will try these selectors one by one until it encounters a non-empty value.
- Moved all NGR XPath selectors to the configuration file of the NGR source catalog. It is now possible to define multiple XPath selectors for a single property. The `NGXMLMetadataExtractor` will try these selectors one by one until it encounters a non-empty value.
- Updated and introduced several XPath selectors for source catalog NGR to better support the several metadata profiles currently in use on https://nationaalgeoregister.nl.
- Moved all mapping files to a Github repository (https://github.com/dataoverheid/donlsync-mappings) and updated all references to these files.
- Moved all default value settings to an online mapping file which is hosted on Github (https://github.com/dataoverheid/donlsync-mappings).
- Default values are now managed via the `DefaultMapper` which is another implementation of the `AbstractMapper`. All DCAT buildrules are updated to use this new class.
- Removed the `AppGlobals` class and moved its functionality into several configuration files.
- DonlSync now enforces a maximum number of Distributions which may be present for any given dataset. This maximum is defined in the `.env` file. If a dataset has more than the configured maximum only the first N will be synchronized, where N is the configured maximum amount of Distributions.
- Introduced a global application container that acts as a dependency repository for the rest of the application.
- Removed all `Controllers`. All command logic is now part of the `Command` being executed.
- Updated the database table schema of `ProcessedDataset` and `ExecutionMessage`. The `environment` column is no longer applicable and has been removed. Several SQL indices were introduced for these tables.
- The database table repository classes now have a `createTable` method used to create the database tables they represent.
- Introduced a command `php DonlSync InstallDatabase` to create all required database tables.
- Refactored several buildrule implementations to reduce code duplication.
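
The "try each XPath selector until a non-empty value is found" behaviour described above can be illustrated with the sketch below. It is not the code of the actual metadata extractors: the function name `firstNonEmptyValue`, the sample XML and the selectors are all made up for illustration. Only the first matching node per selector is inspected, which is also an assumption.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: returns the trimmed text of the first selector that
 * yields a non-empty value, or null when none of the selectors do.
 *
 * @param string[] $selectors XPath expressions, tried in the given order
 */
function firstNonEmptyValue(DOMXPath $xpath, array $selectors): ?string
{
    foreach ($selectors as $selector) {
        $nodes = $xpath->query($selector);

        // Invalid expression or no match: move on to the next selector.
        if (false === $nodes || 0 === $nodes->length) {
            continue;
        }

        $value = trim($nodes->item(0)->textContent);

        if ('' !== $value) {
            return $value;
        }
    }

    return null;
}

// Illustrative input: the first selector matches but holds only whitespace.
$document = new DOMDocument();
$document->loadXML('<record><title> </title><name>Example dataset</name></record>');
$xpath = new DOMXPath($document);

$title = firstNonEmptyValue($xpath, ['/record/title', '/record/name']);
```

Listing selectors in order of preference lets one configuration cover several metadata profiles without changing code.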