Commit 21e2d6e

Updated number of repos in README, version of tool and removed dummy data
1 parent a27027b commit 21e2d6e

4 files changed

Lines changed: 18 additions & 366 deletions

README.md

Lines changed: 16 additions & 11 deletions
@@ -1,14 +1,14 @@
 # Research Software MetaCheck (a Pitfall/Warning Detection Tool)
 
 This project provides an automated tool for detecting common metadata quality issues (pitfalls & Warnings)
-in software repositories. The tool analyzes SoMEF (Software Metadata Extraction Framework) output
+in software repositories. The tool analyzes SoMEF (Software Metadata Extraction Framework) output
 files to identify various problems in repository metadata
 files such as `codemeta.json`, `package.json`, `setup.py`, `DESCRIPTION`, and others.
 
 ## Overview
 
-MetaCheck identifies **27 different types of metadata quality issues** across multiple programming languages
-(Python, Java, C++, C, R, Rust). These pitfalls range from version mismatches and
+MetaCheck identifies **29 different types of metadata quality issues** across multiple programming languages
+(Python, Java, C++, C, R, Rust). These pitfalls range from version mismatches and
 license template placeholders to broken URLs and improperly formatted metadata fields.
 
 ### Supported Pitfall Types
@@ -37,12 +37,14 @@ The tool detects the following categories of issues:
 ### Using Poetry (Recommended)
 
 1. **Clone the repository**:
+
 ```bash
 git clone https://github.com/Anas-Elhounsri/RsMetaCheck.git
 cd RsMetaCheck
 ```
 
 2. **Install with Poetry**:
+
 ```bash
 poetry install
 ```
@@ -56,6 +58,7 @@ The tool detects the following categories of issues:
 ### Using pip
 
 Alternatively, you can install directly from GitHub:
+
 ```bash
 pip install git+https://github.com/Anas-Elhounsri/RsMetaCheck.git
 ```
@@ -75,7 +78,7 @@ poetry run rsmetacheck --input https://github.com/tidyverse/tidyverse
 ```bash
 poetry run rsmetacheck --input repositories.json
 ```
-
+
 The `repositories.json` file should be structured as follows:
 
 ```json
@@ -109,28 +112,30 @@ Or for multiple paths:
 ```bash
 poetry run rsmetacheck --skip-somef --input my_somef_outputs_1/*.json my_somef_outputs_2/*.json
 ```
+
 ### Output
 
 The tool will:
+
 - Process all JSON files in the `somef_outputs` (by default created by the tool) directory
 - Display progress messages showing detected pitfalls
-- Generate JSON-LD files of detailed Pitfalls and Warnings detected by the tool in `output_1_pitfalls.jsonld`,
-`output_2_pitfalls.jsonld`, etc... in `pitfalls` (by default created by the tool) directory
+- Generate JSON-LD files of detailed Pitfalls and Warnings detected by the tool in `output_1_pitfalls.jsonld`,
+`output_2_pitfalls.jsonld`, etc... in `pitfalls` (by default created by the tool) directory
 - Generate a comprehensive report in `all_pitfalls_results.json`
 
 The output file contains:
+
 - EVERSE standardized JSON-LD output of each repository
 - Summary statistics of analyzed repositories
 - Count and percentage for each pitfall type
 - Language-specific breakdown for repositories with target languages
 
-
 ## Troubleshooting
 
 ### Common Issues
 
-1. **"There is no valid repository URL" error**: Ensure the JSON file that contains the repositories
-has a valid structure and that you are inputing the correct path
+1. **"There is no valid repository URL" error**: Ensure the JSON file that contains the repositories
+has a valid structure and that you are inputing the correct path
 2. **Network timeouts**: Some pitfalls validate URLs and may time out this is normal behavior
 
 ### Performance Notes
@@ -141,6 +146,6 @@ has a valid structure and that you are inputing the correct path
 
 ## Contributing
 
-The system is designed with modularity in mind. Each pitfall detector is implemented as a
-separate module in the `scripts/` directory, making it easy to add new pitfall types or modify
+The system is designed with modularity in mind. Each pitfall detector is implemented as a
+separate module in the `scripts/` directory, making it easy to add new pitfall types or modify
 existing detection logic.
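
Note on the Output section of the README in this diff: it says the tool writes per-repository JSON-LD files named `output_1_pitfalls.jsonld`, `output_2_pitfalls.jsonld`, and so on into a `pitfalls` directory. A minimal Python sketch for loading those files in numeric order might look like the following. The function name `load_pitfall_reports` is hypothetical and not part of RsMetaCheck. The sketch assumes nothing about the JSON-LD schema beyond each file being valid JSON.

```python
import json
import re
from pathlib import Path

# File-name pattern taken from the README text; the capture group is the index N.
_REPORT_NAME = re.compile(r"output_(\d+)_pitfalls\.jsonld")


def load_pitfall_reports(pitfalls_dir):
    """Return {file name: parsed JSON-LD document}, ordered by the numeric index.

    Hypothetical helper for illustration only; not part of RsMetaCheck.
    """
    candidates = []
    for path in Path(pitfalls_dir).glob("output_*_pitfalls.jsonld"):
        match = _REPORT_NAME.fullmatch(path.name)
        if match:
            candidates.append((int(match.group(1)), path))

    reports = {}
    # Sort on the integer index so output_10 comes after output_2.
    for _, path in sorted(candidates, key=lambda item: item[0]):
        with path.open(encoding="utf-8") as handle:
            reports[path.name] = json.load(handle)
    return reports
```

Calling `load_pitfall_reports("pitfalls")` from the directory where the tool was run would return one entry per generated file. The path may differ if the output location was changed.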

results/pitfalls/output_1_pitfalls.jsonld

Lines changed: 0 additions & 121 deletions
This file was deleted.
