With every zip file uploaded, CurseForge opens the zip, edits the lua, saves a new zip, and serves that new zip to users. This is unacceptable. Do not modify the files we send to you. Every byte is there for a reason. I understand you want to open the zip and look around, and that's fine. But if it passes your filters, you should post the file we sent you, unedited.
This was already noticed in this thread, but I have further details that go beyond line endings.
Specifically, CurseForge edits lua and toc files, changing line endings from 0x0A to 0x0D 0x0A, and replacing all bytes over 0x7F with the Unicode Replacement Character (0xEF 0xBF 0xBD).
News flash: not all strings are Unicode text! My addon stores data in lua strings in binary format to save space. A price for every item on every realm in the US and EU is already a lot of data; encoding it in non-binary strings would be very slow and inefficient.
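For anyone who wants to see how this kind of damage can arise, here is a minimal Python sketch. It assumes (this is a guess, not something Curse has confirmed) that the file was decoded as ASCII text with a "replace" error policy and written back as UTF-8 with CRLF line endings. The function name is made up for illustration.

```python
def mangle_like_text_pipeline(raw: bytes) -> bytes:
    """Approximate the reported damage (assumed mechanism, not Curse's code)."""
    # Every byte above 0x7F is not valid ASCII, so it becomes U+FFFD.
    text = raw.decode("ascii", errors="replace")
    # Assumes the input uses bare LF (0x0A) line endings, as in the report.
    text = text.replace("\n", "\r\n")
    # U+FFFD encodes to the three bytes EF BF BD in UTF-8.
    return text.encode("utf-8")


original = b'local data = "\x80\xfe\xff"\n'
mangled = mangle_like_text_pipeline(original)
print(original)
print(mangled)
```

Three bytes of binary data turn into nine bytes of replacement characters, and the original values cannot be recovered from the result.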
We completely understand your frustration on this and will be investigating why this is happening.
Files should only be edited when you, the author, use some of our advanced features, like localizations or replacement tags, and only those files should be touched.
Building on that, rather than trying to guess when you should modify our manually uploaded files, you should just add a setting for the project that we can turn off. I do use the Curse localization system, but I don't use the Curse packager. The ZIP files I upload already have the localization data inserted, so there's no need for the Curse packager to do anything.
Furthermore, there's not really any excuse for making modifications (like changing line endings or making Unicode replacements) to existing code. If there's an "insert localization here" tag in a file, just insert the localization in place of the tag. If there's a block of code wrapped in "debug/end-debug" tags, just replace each tag with the appropriate symbols to start and end a block comment. If there's a "do-not-package" tag block, just delete the lines in between. No reason to touch any other lines in the file.
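To illustrate the point about only touching tagged lines, here is a rough Python sketch that works on raw bytes, so untagged lines come out byte for byte identical, including their line endings and any binary data. The tag spellings (`--@debug@`, `--@end-debug@`, `--@do-not-package@`, `--@end-do-not-package@`, `--@localization(locale="...")@`) and the `translations` lookup are assumptions for illustration; this is not the actual packager code.

```python
import re

# Assumed token shape; the real localization token accepts more options.
LOCALE_TOKEN = re.compile(rb'--@localization\(locale="(\w+)"\)@')


def process_tags(raw: bytes, translations: dict[bytes, bytes]) -> bytes:
    out = []
    skipping = False
    # keepends=True keeps each line's original terminator untouched.
    for line in raw.splitlines(keepends=True):
        tag = line.strip()
        if tag == b"--@do-not-package@":
            skipping = True
            continue
        if tag == b"--@end-do-not-package@":
            skipping = False
            continue
        if skipping:
            continue  # drop everything inside the do-not-package block
        if tag == b"--@debug@":
            # Open a Lua block comment so the debug code is disabled.
            line = line.replace(b"--@debug@", b"--[===[@debug@", 1)
        elif tag == b"--@end-debug@":
            line = line.replace(b"--@end-debug@", b"--@end-debug@]===]", 1)
        else:
            # Replace a localization token only if we have data for it.
            line = LOCALE_TOKEN.sub(
                lambda m: translations.get(m.group(1), m.group(0)), line
            )
        out.append(line)
    return b"".join(out)
```

A file with no tags at all goes through this unchanged, which is the property being asked for.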
That's not how file handling works, however.
To modify or delete lines in a file, you need to read at least everything after those lines and rewrite it. For the sake of sanity, everyone would just read the entire file (instead of half of it), modify it in memory, and write it out again. It's easy enough to imagine someone unintentionally causing such encoding changes by using some generic text reader.
I wasn't talking about the implementation details --- those are irrelevant and I don't care in the slightest what happens between "reading" and "rewriting" --- but about the end results. There's no good reason why the file that eventually gets distributed should have any differences from the originally uploaded version of the same file that aren't the direct result of text being added, removed, or commented based on packager tags.
It doesn't really matter why the Curse system was implemented the way it was, who implemented it, what they were thinking about when they implemented it, whether or not it was tested, etc. --- at the end of the day, it's still a problem that needs to be fixed.
The old packager was able to make insertions/changes to files without mangling the encoding or making other unrelated changes to the file contents, so if Curse is having trouble figuring out how to do that, they should just look at their own older code.
With every zip file uploaded, CurseForge opens the zip, edits the lua, saves a new zip, and serves that new zip to users. This is unacceptable. Do not modify the files we send to you. Every byte is there for a reason. I understand you want to open the zip and look around, and that's fine. But if it passes your filters, you should post the file we sent you, unedited.
This was already noticed in this thread, but I have further details that go beyond line endings.
Specifically, CurseForge edits lua and toc files, changing line endings from 0x0A to 0x0D 0x0A, and replacing all bytes over 0x7F with the Unicode Replacement Character (0xEF 0xBF 0xBD).
News flash: not all strings are Unicode text! My addon stores data in lua strings in binary format to save space. A price for every item on every realm in the US and EU is already a lot of data; encoding it in non-binary strings would be very slow and inefficient.
Here is a valid-for-WoW lua file with every 8-bit byte: https://addon.theunderminejournal.com/curse/good.lua
Here is what it looks like after CurseForge edits it: https://addon.theunderminejournal.com/curse/bad.lua
Please post our zip files unedited as we sent them to you. My addon cannot be hosted on Curse until this is resolved.
We completely understand your frustration on this and will be investigating why this is happening.
Files should only be edited when you, the author, use some of our advanced features, like localizations or replacement tags, and only those files should be touched.
@Erorus: That would explain why I haven't seen an update for TUJ for a week.
@Kaelten: Thank you for looking into this, Bryan. I know we've had issues in the past and I know we'll work this one out as well. I won't go into details as others have pretty much summed it up quite well.
@Phanx: Well said.
from what i saw on the TUJ / Stormspire forums, until Curse fixes their hungry addon chewer the updates for The Undermine data can be downloaded directly at https://addon.theunderminejournal.com
(updates are only published twice a week as usual, so don't download more than that... there isn't newer data published outside the update cycle)
I think this new file handling has a conceptual error.
What it does:
gets upload, unzip, read, rezip, host
What it should do:
gets upload, copies it
copy1: unzip, read, delete
copy2: host
I'm sure it's much more complicated, but am I anywhere close to what Curse's addon upload is doing?
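The "what it should do" flow above could look something like this in Python. This is only a sketch: the function name and the ".toc must be present" filter are made up for illustration, and nothing here reflects how Curse's upload pipeline is really built.

```python
import io
import zipfile


def accept_upload(uploaded: bytes) -> bytes:
    """Inspect the archive read-only, then return the bytes to host."""
    # "copy1": open an in-memory view of the upload purely for inspection.
    with zipfile.ZipFile(io.BytesIO(uploaded)) as archive:
        corrupt = archive.testzip()  # name of first bad member, or None
        if corrupt is not None:
            raise ValueError(f"corrupt member: {corrupt}")
        # Hypothetical filter: an addon archive should contain a .toc file.
        if not any(name.endswith(".toc") for name in archive.namelist()):
            raise ValueError("no .toc file found")
    # "copy2": the hosted file is the uploaded bytes, never a re-zipped copy.
    return uploaded
```

Since inspection never writes anything, the hosted archive has the same checksum as the one the author uploaded.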
This has been fixed - files will no longer be modified by the localization processor unless the project has localization enabled, file encoding will no longer be affected, and neither will line endings.
Thank you. I have confirmed that the zip file I uploaded is the same one offered for download.
It still seems to recompress the uploaded ZIP rather than serving the original (even if no files inside the ZIP are changed), which results in mismatched archive MD5 across different download sources. This rather increases the amount of effort required to make sure that Curse is still serving the actual code uploaded.
Please try again.
they look the same to me.. probably your test file was created/uploaded before the fix was applied.
https://addons-origin.cursecdn.com/files/2351/941/TheUndermineJournal.zip
vs
https://addon.theunderminejournal.com/TheUndermineJournal.zip (the 2016-12-03 update)
File: TheUndermineJournal-tuj.zip
CRC-32: 9b44e194
MD5: 8e794a839e9e6f7ee5e03383f725822b
SHA-1: 3c6b52dbdac69b12866a283dd902a0a87c00fa72
SHA-256: 2650e68e5ee9c345042bf7f142f306aa972dbc95509a901b7e95c278ff5f628b
File: TheUndermineJournal-curse.zip
CRC-32: 9b44e194
MD5: 8e794a839e9e6f7ee5e03383f725822b
SHA-1: 3c6b52dbdac69b12866a283dd902a0a87c00fa72
SHA-256: 2650e68e5ee9c345042bf7f142f306aa972dbc95509a901b7e95c278ff5f628b
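If you want to reproduce a listing like the one above for your own downloads, a small Python sketch along these lines should do. The helper name is made up, and it only uses the standard `hashlib` and `zlib` modules.

```python
import hashlib
import zlib


def file_digests(data: bytes) -> dict[str, str]:
    """Return the four checksums shown above as lowercase hex strings."""
    return {
        "CRC-32": format(zlib.crc32(data) & 0xFFFFFFFF, "08x"),
        "MD5": hashlib.md5(data).hexdigest(),
        "SHA-1": hashlib.sha1(data).hexdigest(),
        "SHA-256": hashlib.sha256(data).hexdigest(),
    }


def print_digests(path: str) -> None:
    # Read in binary mode so no newline or encoding translation happens.
    with open(path, "rb") as handle:
        digests = file_digests(handle.read())
    print(f"File: {path}")
    for name, value in digests.items():
        print(f"{name}: {value}")
```

Run `print_digests` on both downloads; if every line matches, the two archives are byte for byte the same file.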
It's a different project, with localization enabled (but not actually used in a way that allows Curse to substitute anything): I'm comparing https://wow.curseforge.com/projects/opie/files/2351980/download (md5: f3f3...) vs https://www.townlong-yak.com/opie/get/umber5 (md5: 1505...).
i see.. you're right.
even though the zip files are different, after extracting them both and doing a recursive sha-256 sum compare... the files from both archives are identical.
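For reference, a comparison like that can also be done without extracting to disk. The following Python sketch hashes every file inside each archive and compares the results; the function names are made up for illustration.

```python
import hashlib
import io
import zipfile


def member_hashes(zip_bytes: bytes) -> dict[str, str]:
    """Map each file inside the archive to the SHA-256 of its contents."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return {
            info.filename: hashlib.sha256(archive.read(info)).hexdigest()
            for info in archive.infolist()
            if not info.is_dir()
        }


def same_contents(zip_a: bytes, zip_b: bytes) -> bool:
    # True when both archives hold the same file names with the same bytes,
    # even if compression settings or timestamps make the zips differ.
    return member_hashes(zip_a) == member_hashes(zip_b)
```

This only tells you the extracted files match. It is more work than comparing one archive checksum, which is the complaint above.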
The localization system was tied to the packager under the old system, but is now a file processor task so that every project can now take advantage of it. If your project has localization enabled, the token replacements will be attempted and the results re-zipped.
I understand what it's doing, but almost every part of it is undesirable to me.
I don't actually want to use localization injection, ever: if performed on uploaded ZIPs, it essentially produces an untested addon release. Even if the localization code itself is correct (and for \n-containing keys, it recently wasn't), bad localization data can break string.format calls within the addon. In practice, this means that if I used localization injection, I'd have to upload a version as alpha, wait for a modified archive, download that, test it, and reflag the version as release. This seems more annoying than my current approach of exporting and verifying localization data prior to generating an archive locally.
It would be nice if the token replacements could be turned off on a per-project basis. Failing that, it would be nice if the file processor task realized that it made no actual changes, and kept the original ZIP file instead of producing another version of it -- at least then the archive MD5 would be a clear indication of whether or not it changed anything.
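The fallback suggested above (keep the original ZIP when nothing changed) could be sketched roughly like this in Python. `process_upload` and the `transform` callback are hypothetical names, directory entries are ignored for brevity, and this is not a description of the real file processor.

```python
import io
import zipfile
from typing import Callable


def process_upload(
    uploaded: bytes, transform: Callable[[str, bytes], bytes]
) -> bytes:
    """Apply transform to each file; re-zip only if something changed."""
    with zipfile.ZipFile(io.BytesIO(uploaded)) as src:
        originals = {
            info.filename: src.read(info)
            for info in src.infolist()
            if not info.is_dir()
        }
    processed = {
        name: transform(name, data) for name, data in originals.items()
    }
    if processed == originals:
        # No member changed: serve the uploaded archive byte for byte.
        return uploaded
    out = io.BytesIO()
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for name, data in processed.items():
            dst.writestr(name, data)
    return out.getvalue()
```

With this shape, a matching archive MD5 means nothing was touched, and a mismatch means at least one file really was modified.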