MDEV-38010: Master/Relay Log Info files ignore trailing garbage in numeric lines #4752
ayush-jha123 wants to merge 1 commit into MariaDB:main
…meric lines This patch fixes an issue where Int_IO_CACHE::from_chars stops parsing at the first invalid character but fails to consume the remainder of the line. This caused trailing garbage on a numeric field (like Master_Port) to be interpreted as the value for the subsequent field. The fix adds a loop to consume the buffer up to the newline character or EOF if my_strtoll10 returns early.
By stopping early, #4430 has already fixed this bug (or rather, what MDEV-38010 actually described) on the main branch as part of MDEV-37530.
Please retarget to 10.11, and add a test (refer to the bug report’s examples) while you’re there.
(I dearly hope this is not an AI being confident about a nonexistent bug.)
gkodinov left a comment
Thank you for your contribution! This is a preliminary review.
First of all: this is a bug fix. So please re-base on the first affected version. Jira says this is 10.11.
Secondly: please add a test.
    {
      int c;
      do {
        c= my_b_get(file);
The whole line, including the newline, has already been read from the file into a buffer. Let's say it's "123b c\n". Before your change, the code would have read "123" from the buffer and errored out, because the end of the line does not immediately follow the number in the buffer.
With your change, after reading 123 from the buffer, the code will start reading from the file until it reaches a new line symbol. This will result in completely skipping the next line instead of parsing it.
Frankly, the whole idea of "ignore unrecognized (e.g., non-numeric) content at the end of the line" seems a bit exotic to me. I'd rather return an error, since garbage after the expected value could indicate a garbled file.
But, feel free to ignore the last paragraph and leave it to the final reviewer to decide.
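To make the line-skipping problem concrete, here is a minimal sketch that mimics it with the standard library only. It is not the MariaDB code: `parse_then_drain_stream` and `next_line_after_parse` are hypothetical names, `std::istream` stands in for the `IO_CACHE` file, and `std::strtol` stands in for the server's own number parser.

```cpp
#include <cstdlib>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the patched behaviour (not the MariaDB code):
// the whole line is already in `line`, yet after an early stop the function
// keeps reading from the stream until it sees a newline.
long parse_then_drain_stream(std::istream &in)
{
  std::string line;
  std::getline(in, line);              // consumes the line AND its newline
  char *end= nullptr;
  long value= std::strtol(line.c_str(), &end, 10);
  if (*end != '\0')                    // parser stopped before the end of the line
  {
    // Flawed step: the garbage lives in `line`, not in the stream,
    // so this loop eats the NEXT line instead.
    int c;
    do
    {
      c= in.get();
    } while (c != '\n' && c != std::char_traits<char>::eof());
  }
  return value;
}

// Hypothetical helper: returns the line a caller would read next
// after the numeric field has been parsed.
std::string next_line_after_parse(const std::string &contents)
{
  std::istringstream in(contents);
  parse_then_drain_stream(in);
  std::string next;
  std::getline(in, next);
  return next;
}
```

With the input "123b c\nuser\npassword\n", the next field read after the number is "password" rather than "user", which is the skipped line described above. With the clean input "123\nuser\npassword\n", the next field is "user" as expected.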
Thank you for the feedback and for catching that logic flaw! You are completely right: because the current line was already in the buffer, my previous approach of continuing to read from the file would inadvertently skip the next valid configuration line.
I also agree with your point about erroring out rather than trying to gracefully ignore corrupted data. If a master.info file has trailing garbage on a numeric field, it is safer to reject it as garbled rather than risk loading incorrect replication parameters.
I have retargeted this PR to the 10.11 branch as requested, and I've updated the patch. The parsing logic (using strtol / sscanf) now checks if there are any non-whitespace characters left after reading the number. If trailing garbage is detected, it instantly returns an error. I have also added a test case to verify this behavior.
Thanks again for the guidance!
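The strict check described in this comment could look roughly like the sketch below. It is an illustration under stated assumptions, not the code in the updated patch: `parse_numeric_line` is a hypothetical helper, it works on a line that is already in memory, and it uses `std::strtol` with a manual scan for leftover non-whitespace characters.

```cpp
#include <cctype>
#include <cerrno>
#include <cstdlib>
#include <string>

// Hypothetical helper (not the MariaDB implementation): parse one buffered
// line as a base-10 integer and accept only whitespace after the number.
// Returns true on success and stores the value in `out`; returns false and
// leaves `out` untouched if the line is empty, out of range, or has garbage.
bool parse_numeric_line(const std::string &line, long &out)
{
  const char *begin= line.c_str();
  char *end= nullptr;
  errno= 0;
  long value= std::strtol(begin, &end, 10);
  if (end == begin || errno == ERANGE)
    return false;                      // no digits at all, or value out of range
  for (const char *p= end; *p != '\0'; ++p)
  {
    if (!std::isspace(static_cast<unsigned char>(*p)))
      return false;                    // trailing garbage: treat the file as garbled
  }
  out= value;
  return true;
}
```

Under this sketch, "3306\n" parses to 3306, while "3306abc\n" and "123b c\n" are rejected instead of leaking the leftover text into the next field.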
Thanks for the review and suggestions. I'm currently working on rebasing the patch onto the 10.11 branch and revisiting the parsing logic based on your comments. I will also add a test case derived from the examples in the bug report. I'll push an updated version shortly.
Fixes https://jira.mariadb.org/browse/MDEV-38010
Description:
This patch fixes an issue where `Int_IO_CACHE::from_chars` stops parsing at the first invalid character but fails to consume the remainder of the line.
Previously, this caused trailing garbage on a numeric field (e.g., `Master_Port`) to be interpreted as the value for the subsequent field, leading to corrupted configurations. The fix adds a loop to consume the buffer up to the newline character or EOF if `my_strtoll10` returns early, safely discarding the trailing text.