I just upgraded to v1.2.0 of this library. With the BagIt v1.0 changes in #118, a new bug seems to have entered the library:
When using the `BagLinter`, the warning `different_case` is always shown. Given its description:

> The bag contains two files that differ only in case. This can cause problems on a filesystem like the one used by apple (HFS).

I would not expect this to happen on a bag whose file names are all clearly distinct.
I think the bug is as follows: in the code below, a manifest file is read and, for each line, the path is converted to lowercase and added to the Set `paths`.

    try(final BufferedReader reader = Files.newBufferedReader(manifestFile, encoding)){
      final Set<String> paths = new HashSet<>();

      String line = reader.readLine();
      while(line != null){
        String path = parsePath(line);

        path = checkForManifestCreatedWithMD5SumTools(path, warnings, warningsToIgnore);
        paths.add(path.toLowerCase());

        checkForDifferentCase(path, paths, manifestFile, warnings, warningsToIgnore);
        if(encoding.name().startsWith("UTF")){
          checkNormalization(path, manifestFile.getParent(), warnings, warningsToIgnore);
        }
        checkForBagWithinBag(line, warnings, warningsToIgnore, isPayloadManifest);
        checkForRelativePaths(line, warnings, warningsToIgnore, manifestFile);
        checkForOSSpecificFiles(line, warnings, warningsToIgnore, manifestFile);

        line = reader.readLine();
      }
    }

Immediately after that, `checkForDifferentCase` is called, which checks whether `paths.contains(path.toLowerCase())`. Of course, this is always true, since the lowercased `path` was added to `paths` just before this check.

    /*
     * Check that the same line doesn't already exist in the set of paths
     */
    private static void checkForDifferentCase(final String path, final Set<String> paths, final Path manifestFile,
        final Set<BagitWarning> warnings, final Collection<BagitWarning> warningsToIgnore){
      if(!warningsToIgnore.contains(BagitWarning.DIFFERENT_CASE) && paths.contains(path.toLowerCase())){
        logger.warn(messages.getString("different_case_warning"), manifestFile, path);
        warnings.add(BagitWarning.DIFFERENT_CASE);
      }
    }

For reference, the two quoted snippets are from bagit-java/src/main/java/gov/loc/repository/bagit/conformance/ManifestChecker.java at commit 97a6770: lines 119 to 139 (the manifest read loop) and lines 174 to 183 (`checkForDifferentCase`).