New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
proc: Bad value for 'hidepid' #16896
Comments
You can ignore those warnings. Story goes like this: since 4e39995 we'll make use of the hidepid option, but this is only safe on kernel >= 5.8 where hidepid is a true per-mount option (previously it was per-userns, but weirdly exposed as mount option). The kernel developer behind this work recommended way to detect whether hidpid is per mount option or per userns is to use the new textual argument to hidepid, i.e. hidepid=invisible instead of hidepid=2. The same kernels that understand the former are also the ones that support the new-style per-mount hidepid. And that's what we do here. If the mount fails, we'll gracefully fall back, since the feature cannot be supported on older kernels. Yes, it's unfortunate to see the log spew, but you'll only see it when using new systemd on old kernels. And it doesn't mean much, except that you have an older kernel. i don't think we can fix or improve anything here. Hence closing. Hope tht makes sense. |
Please don't enable it then on kernels < 5.8 |
We don't "enable" kernel log messages. We generally check for features, not kernel versions. Because people backport like there was no tomorrow. Kernel version checks are only last resort. And upstream kernel peeps recommended to check for "hidepid=invisible" support, and that's what we are doing hence. Again, this is cosmetic only. And only when new systemd is combined with old kernel. |
What I meant is, that |
As mentioned, we generally try hard to avoid kernel version checks because people backport. Like in this case: we do a feature check with a graceful fallback instead of a version check. Its more robust and prettier, except that on old kernels you sometimes see a kernel log message. |
hi @poettering, fwiw I completely agree with you that it's better to check for features instead for version numbers. We should't bee too nonchalant with generating (unnecessary) error messages when we can avoid them. It's just noise that confuses and worries users. And trust me, users will see those messages. |
It is attempted to be used. And if it doesn't work there's a graceful fallback in place. And it's not an error, it's a log message you can safely ignore.
I am sorry, but this is purely cosmetic, and doesn't matter on newer kernels anyway. Let's not make our code worse just to for superficial cosmetics on old kernels people are unlikely to use with brand-new systemd anyway... |
Unfortunately this is a response I hear too often. How are users supposed to know which error messages are safe to ignore or not? If we are not more careful, log messages become useless because of all the noise. |
Needless to say I don't agree with your take on this issue. |
PID 1 could (and, IMO, should) log a message stating to the effect of “Checking if kernel supports |
I think we should do two things:
|
Caching is nasty, since this code is run in short-lived children forked off PID 1, i.e. the processes that eventually become the daemons after one more execve(): i.e. it's not as easy as maintaining a cache in a global variable, since they already are in forked children with their own copied memory. |
Logging before the mounting is ugly, since this would mean we'd also log on current kernels. and that's just noisy. We could however log a message when we notice the hidepid= thing fails, i.e. only after the fact, on old kernels. |
we really need a way to filter messages in journald.... @poettering I know you don't like the idea, but we should be able to reach a stage where there is no log at all in normal operations... Logs fill up disk space, cause wear of flash devices and embedded devices can be isolated for decades. Disk filling up or wearing is a deathclock ticking on the device. We need to log errors, but we must be able to not log any "normal log" they are the slow killers of embedded devices (kinda offtopic, sorry) |
@boucman the offending log msg this issues is about is done by the kernel, not userspace. |
yes I know, but it will still end up in the journal and there is very little we can do about it. Anyway, that's off topic and I don't want to highjack the original thread... |
@boucman can journald be told to just log only high-severity messages? Also if your device is going for decades without firmware updates you have a major security problem, unless it has no persistent mutable state and no network access. @poettering we could have the child report its result to PID 1 over a socketpair. |
we could also use a touch file or so, but i'd rather not have such complexity in place for a cosmetic issue. i think adding a log message is fine, but adding non-trivial caching infra that requires IPC or otherwise leaks out of process memory is a bit over the top for the fact that this is a very temporary issue only, i.e. when people mix new userspace with old kernels. |
How about an option in some conf (e.g. /etc/systemd/system.conf) that allows user to disable the cause/feature manually? |
There is still
It is an error message (level 3), appears red in I do not agree with @boucman that no logs are a goal in any way. There are many reasons why one needs or wants to go though informational or even debug logs. And
I know it's not always easily possible, probably we have to live with this "expected" error message, but it should be taken a reasonable and serious request, worth to spend some time, to avoid it. |
No way. Every option comes at a cost. And it's totally not worth adding one this is a purely cosmetic issue, visible only if you use old kernels with new systemd versions, which very soon is a thing of the past anyway. Quite frankly, I think we should close this, it's just not important. It's cosmetic and will fix itself as people upgrade kernels. |
Can you ensure this log message will never occur in the situation, in which it requires attention? Including systems not using systemd? If no, how can one quickly determine if this occurence can be ignored or not? If there is no simple way to make sure for every single occurence of this error message that it can be ignored, there should be for sure something done about it. |
We issue a mount() call with hidpid=, which works fine on recent kernels, and fails + logs on old kernels. The failure we see and then retry without hidepid=, as a fallback compat to old kernels. The log you still see though, there's no way to suppress that in the Linux kernel. My recommendation, if this cosmetic issue is too much for you to handle: please update systemd and kernel reasonably in sync: run new systemd on new kernels and old systemd on old kernels. It's the mix and match where this happens: new systemd on old kernels. where we try to make use of new kernel features, but because you run legacy kernels we need to fall back. |
It's not answering my question. Let me explain it a bit more: Let's assume that something else, that is mission critical, is triggering the same log message from kernel in my system - is this even possible when using systemd? Can something in my system trigger mount directly? If yes, how can I easily determine the origin of the specific message in my kernel log? Is there a way to quickly check if this specific message comes from systemd trying to probe my kernel or if this comes from some other software that may fail to operate properly because of that error? I assume that for other systems that do not run systemd I should treat this exact message as an error. |
Let's hope that whatever that "mission critical" thing is you are talking about that it checks errors from system calls. Because exactly like in systemd, if they try the mount option and it's not supported the syscall fails + the log is generated. If your "mission critical" thing ignores errors like this it probably shouldn't be something you deploy as "mission critical". You know, tons of syscalls can fail, and they do all the time. 99.999% don#t log about it though. And things are fine, because apps handle the error and do something smart, either handle things in some other way, or log about it or so. It's how this is done: fail on error so that the caller can do something about it. Now, in this very specific case some operation fails, and this is correctly returned as failure code to the app, but then also logged about. And that last part is ugly and unnecessary, but can't be addressed, since it's in old software. Anyway, if you really think that this cosmetic issue is worth discussing in so much detail, I really don#t want to see the email threads you open about all the rest of the log messages you see in dmesg... anyway, please upgrade your kernel or downgrade your systemd if this log message is really so terrible to endure... |
Just for the record, Yocto Project is looking like it will have to carry a patch adding a version check to systemd as the kernel warnings are creating too many problems for our developers getting reports from users about "new errors" in logs. |
OK, I don't think we should change anything here. It's purely cosmetic, and sooner or later will go away anyway once people upgrade their kernels (or downgrade systemd). |
The thing is it is not purely cosmetic. The reports of these errors takes up serious amounts of time for developers having to handle the reports from concerned users who don't know better so we're carrying yet more systemd patches which can never be upstream. |
I don't agree. It's just lazy ignoring issues like this. |
Once the patch is ready, please submit it as PR here. If it's not too ugly, we can merge it. And even if not, it'll make it easy to cherry-pick into distros that want it. |
Please no version checks for cosmetic stuff. Feature checks are ok. Version checks suck and are last resort stuff. People backport all the time. Not worth doing this for cosmetic issues. My recommendation: just turn of the sandboxing option on your distro, i.e. return to the status quo ante. You dont need the sandboxing, the stuff will work without. If cosmetic issues are that much of a priority to you then maybe just drop the sandboxing. |
The distros using newer systemd with old kernels could patch the old kernel to not emit this error message. |
As of commit 4e39995 ("core: introduce ProtectProc= and ProcSubset= to expose hidepid= and subset= procfs mount options") kernels older than v5.8 generate multple warnings at boot, as seen in this Yocto build from today: qemux86-64 login: root [ 65.829009] proc: Bad value for 'hidepid' root@qemux86-64:~# dmesg|grep proc: [ 16.990706] proc: Bad value for 'hidepid' [ 28.060178] proc: Bad value for 'hidepid' [ 28.874229] proc: Bad value for 'hidepid' [ 32.685107] proc: Bad value for 'hidepid' [ 65.829009] proc: Bad value for 'hidepid' root@qemux86-64:~# We see reports of the issue as in general its hard to someone to tell the difference between an error in dmesg which they should worry about and one that is harmless. This adds support burden to developers so Yocto Project has added this patch. The commit that triggers this is systemd v247-rc1~378^2~3 -- so any systemd 247 and above plus kernel v5.7 or older will need this. As noted in systemd#16896 it is possible changes could be backported to different kernel versions so the test isn't 100% foolproof but does give better results than a continual stream of bug reports. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
I have opened a pull request for it as mentioned above since it would seem to make sense to at least share the patch. |
As of commit 4e39995 ("core: introduce ProtectProc= and ProcSubset= to expose hidepid= and subset= procfs mount options") kernels older than v5.8 generate multple warnings at boot, as seen in this Yocto build from today: qemux86-64 login: root [ 65.829009] proc: Bad value for 'hidepid' root@qemux86-64:~# dmesg|grep proc: [ 16.990706] proc: Bad value for 'hidepid' [ 28.060178] proc: Bad value for 'hidepid' [ 28.874229] proc: Bad value for 'hidepid' [ 32.685107] proc: Bad value for 'hidepid' [ 65.829009] proc: Bad value for 'hidepid' root@qemux86-64:~# We see reports of the issue as in general its hard to someone to tell the difference between an error in dmesg which they should worry about and one that is harmless. This adds support burden to developers so Yocto Project has added this patch. The commit that triggers this is systemd v247-rc1~378^2~3 -- so any systemd 247 and above plus kernel v5.7 or older will need this. As noted in systemd#16896 it is possible changes could be backported to different kernel versions so the test isn't 100% foolproof but does give better results than a continual stream of bug reports. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
With latest git master, I see
I think those error messages showed up once
ProtectProc
was merged.The text was updated successfully, but these errors were encountered: