 # Research Software MetaCheck (a Pitfall/Warning Detection Tool)
 
 This project provides an automated tool for detecting common metadata quality issues (pitfalls & Warnings)
-in software repositories. The tool analyzes SoMEF (Software Metadata Extraction Framework) output
+in software repositories. The tool analyzes SoMEF (Software Metadata Extraction Framework) output
 files to identify various problems in repository metadata
 files such as `codemeta.json`, `package.json`, `setup.py`, `DESCRIPTION`, and others.
 
 ## Overview
 
-MetaCheck identifies **27 different types of metadata quality issues** across multiple programming languages
-(Python, Java, C++, C, R, Rust). These pitfalls range from version mismatches and
+MetaCheck identifies **29 different types of metadata quality issues** across multiple programming languages
+(Python, Java, C++, C, R, Rust). These pitfalls range from version mismatches and
 license template placeholders to broken URLs and improperly formatted metadata fields.
 
 ### Supported Pitfall Types
@@ -37,12 +37,14 @@ The tool detects the following categories of issues:
 ### Using Poetry (Recommended)
 
 1. **Clone the repository**:
+
    ```bash
    git clone https://github.com/Anas-Elhounsri/RsMetaCheck.git
    cd RsMetaCheck
    ```
 
 2. **Install with Poetry**:
+
    ```bash
    poetry install
    ```
@@ -56,6 +58,7 @@ The tool detects the following categories of issues:
 ### Using pip
 
 Alternatively, you can install directly from GitHub:
+
 ```bash
 pip install git+https://github.com/Anas-Elhounsri/RsMetaCheck.git
 ```
@@ -75,7 +78,7 @@ poetry run rsmetacheck --input https://github.com/tidyverse/tidyverse
 ```bash
 poetry run rsmetacheck --input repositories.json
 ```
-
+
 The `repositories.json` file should be structured as follows:
 
 ```json
@@ -109,28 +112,30 @@ Or for multiple paths:
 ```bash
 poetry run rsmetacheck --skip-somef --input my_somef_outputs_1/*.json my_somef_outputs_2/*.json
 ```
+
 ### Output
 
 The tool will:
+
 - Process all JSON files in the `somef_outputs` (by default created by the tool) directory
 - Display progress messages showing detected pitfalls
-- Generate JSON-LD files of detailed Pitfalls and Warnings detected by the tool in `output_1_pitfalls.jsonld`,
-`output_2_pitfalls.jsonld`, etc... in `pitfalls` (by default created by the tool) directory
+- Generate JSON-LD files of detailed Pitfalls and Warnings detected by the tool in `output_1_pitfalls.jsonld`,
+`output_2_pitfalls.jsonld`, etc... in `pitfalls` (by default created by the tool) directory
 - Generate a comprehensive report in `all_pitfalls_results.json`
 
 The output file contains:
+
 - EVERSE standardized JSON-LD output of each repository
 - Summary statistics of analyzed repositories
 - Count and percentage for each pitfall type
 - Language-specific breakdown for repositories with target languages
 
-
 ## Troubleshooting
 
 ### Common Issues
 
-1. **"There is no valid repository URL" error**: Ensure the JSON file that contains the repositories
-has a valid structure and that you are inputing the correct path
+1. **"There is no valid repository URL" error**: Ensure the JSON file that contains the repositories
+has a valid structure and that you are inputing the correct path
 2. **Network timeouts**: Some pitfalls validate URLs and may time out this is normal behavior
 
 ### Performance Notes
@@ -141,6 +146,6 @@ has a valid structure and that you are inputing the correct path
 
 ## Contributing
 
-The system is designed with modularity in mind. Each pitfall detector is implemented as a
-separate module in the `scripts/` directory, making it easy to add new pitfall types or modify
+The system is designed with modularity in mind. Each pitfall detector is implemented as a
+separate module in the `scripts/` directory, making it easy to add new pitfall types or modify
 existing detection logic.
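
The per-repository JSON-LD reports listed under Output can be gathered with a few lines of Python. This is a minimal sketch, assuming the default `pitfalls` directory and the `output_N_pitfalls.jsonld` naming quoted in the README; `load_pitfall_reports` is a hypothetical helper, not part of RsMetaCheck, and the code assumes nothing about the JSON-LD schema beyond each file being valid JSON.

```python
import json
from pathlib import Path


def load_pitfall_reports(directory="pitfalls"):
    """Load every per-repository JSON-LD report found in `directory`.

    Hypothetical helper (not part of RsMetaCheck). Returns a dict that maps
    each file name to its parsed JSON content.
    """
    reports = {}
    # File naming taken from the README: output_1_pitfalls.jsonld, output_2_pitfalls.jsonld, ...
    # Files are visited in lexicographic order of their names.
    for path in sorted(Path(directory).glob("output_*_pitfalls.jsonld")):
        with path.open(encoding="utf-8") as handle:
            reports[path.name] = json.load(handle)
    return reports


if __name__ == "__main__":
    reports = load_pitfall_reports()
    print(f"Loaded {len(reports)} report(s)")
    for name in reports:
        print(name)
```

Pass another directory as the argument if the reports live somewhere other than `pitfalls`.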