A number of fixes
parent
f46a7a8e89
commit
22ac1403b6
45
README.md
@ -14,7 +14,7 @@ Wrapping up the core of this long-experimental feature is the first step.
|
|||||||
(One of the reasons for that is that using dynamic derivations requires content-addressing derivations, because derivations themselves are always content-addressed.)
|
(One of the reasons for that is that using dynamic derivations requires content-addressing derivations, because derivations themselves are always content-addressed.)
|
||||||
|
|
||||||
Completely moving the whole ecosystem over to content-addressing derivations is the ultimate goal, but this doesn't need to coincide with wrapping up the core of the experiment.
|
Completely moving the whole ecosystem over to content-addressing derivations is the ultimate goal, but this doesn't need to coincide with wrapping up the core of the experiment.
|
||||||
For example, as others have written out, "sedding" binaries to rewrite self-references is unlikely to work in general.
|
For example, as others have written out, "`sed`-ing" binaries to rewrite self-references is unlikely to work in general.
|
||||||
That's fine for me
|
That's fine for me
|
||||||
--- we'll simply keep input-addressing in the cases where it doesn't work.
|
--- we'll simply keep input-addressing in the cases where it doesn't work.
|
||||||
(Not only is this expedient, this also incentivizes trying to modify packages to stop needing self-references, which I think is a good thing to do regardless.)
|
(Not only is this expedient, this also incentivizes trying to modify packages to stop needing self-references, which I think is a good thing to do regardless.)
|
||||||
@ -22,19 +22,19 @@ That's fine for me
|
|||||||
So what does "wrapping up the core of the experiment" entail?
|
So what does "wrapping up the core of the experiment" entail?
|
||||||
For me, the big test is "don't put junk in the cache".
|
For me, the big test is "don't put junk in the cache".
|
||||||
I am OK with the "client side" missing various conveniences, like tooling to understand trust map conflicts, or fancier garbage collection.
|
I am OK with the "client side" missing various conveniences, like tooling to understand trust map conflicts, or fancier garbage collection.
|
||||||
So long as there are still input-addressed Nixpkgs, no one will be "forced" to us them (by network effects) and so client UX issues can just be dodged by "just opting out".
|
So long as there is still an input-addressed Nixpkgs, no one will be "forced" to use them (by network effects), and so client UX issues can be dodged by "just opting out".
|
||||||
On the "server side", however, I don't anything sketchy to be going on, because I don't want people to accidentally opt in to issues, especially highly nuanced "cache semantic" issues, that they didn't sign up for.
|
On the "server side", however, I don't want anything sketchy to be going on, because I don't want people to accidentally opt into issues, especially highly nuanced "cache semantics" issues, that they didn't sign up for.
|
||||||
Cached build artifacts, even local ones but especially shared internet-accessible ones, are potentially very long-lived.
|
Cached build artifacts, even local ones but especially shared internet-accessible ones, are potentially very long-lived.
|
||||||
If we get this wrong, we open ourselves up to "cache poisoning" issues, which because of the distributed nature of Nix stores and copying, may be hard to completely eradicate.
|
If we get the roll-out wrong, we open ourselves up to "cache poisoning" issues, which because of the distributed nature of Nix stores and copying, may be hard to completely eradicate.
|
||||||
I wouldn't want to be responsible for any of those.
|
I don't want content-addressing derivations to be responsible for any of those.
|
||||||
|
|
||||||
#### Medium level
|
#### Medium level
|
||||||
|
|
||||||
Drilling deeper, so what does "ensuring the binary cache is sound" entail?
|
Drilling deeper, what does "ensuring the binary cache is sound" entail?
|
||||||
I think the essential issue is [Nix#11896].
|
I think the essential issue is [Nix#11896].
|
||||||
"deep realisations" --- build trace key-value pairs where the key includes derivations that depend on other derivations' outputs --- are fundamentally ambiguous.
|
"deep realisations" --- build trace key-value pairs where the key includes derivations that depend on other derivations' outputs --- are fundamentally ambiguous.
|
||||||
This ambiguity makes them hard to verify/challenge, and hard to know when they conflict --- two deep realisations may implicitly make incompatible assumptions about the outputs of those dependency derivations.
|
This ambiguity makes them hard to verify/challenge, and hard to know when they conflict --- two deep realisations may implicitly make incompatible assumptions about the outputs of those dependency derivations.
|
||||||
We currently have a notion of "dependent realisations" that seeks to address this issue, but I do not think it is sound, and it is certainly not consistently implemented.
|
We currently have a notion of "dependent realisations" that seeks to address this issue, but I do not think this mechanism is sound, and it is certainly not consistently implemented.
|
||||||
|
|
||||||
The simplest thing to do is....just rip out deep realisations.
|
The simplest thing to do is....just rip out deep realisations.
|
||||||
Build trace keys should always be derivations that just depend on "opaque" store objects.
|
Build trace keys should always be derivations that just depend on "opaque" store objects.
|
||||||
@ -51,38 +51,39 @@ There are two downsides to "just do shallow addressing only" which are
|
|||||||
2. [Nix#11928] We regress with the current scheduling logic, causing build-time inputs to be built/downloaded unnecessarily when the downstream thing we actually need should just be substituted: it exists but was built slightly differently.
|
2. [Nix#11928] We regress with the current scheduling logic, causing build-time inputs to be built/downloaded unnecessarily when the downstream thing we actually need should just be substituted: it exists but was built slightly differently.
|
||||||
|
|
||||||
Re (1): once again, I am quite willing to defer polishing something that is client-side, and thus has problems that the user is free to side-step entirely by opting out.
|
Re (1): once again, I am quite willing to defer polishing something that is client-side, and thus has problems that the user is free to side-step entirely by opting out.
|
||||||
We can always delete *all* realisations
|
We can always delete *all* realisations locally
|
||||||
(there are no hard references between shallow realisations -- no "closure property"),
|
(there are no hard references between shallow realisations -- no "closure property"),
|
||||||
so that sledgehammer can always be exposed as a fail-safe way to unbreak anyone's machine running out of disk space.
|
so that sledgehammer can always be presented as a fail-safe last resort to unbreak anyone's machine that ran out of disk space.
|
||||||
Again, the current way we GC realisations (leveraging those "dependent realisations") is not necessarily a good or the only way to do things
|
Again, the current way we GC realisations (leveraging those "dependent realisations") is not necessarily a good or the only way to do things
|
||||||
--- in fact, because the relationships between realisations are "soft" and not "hard", I view this as a situation where there are many possible "policies", and choosing between them is a matter of opinion.
|
--- in fact, because the relationships between realisations are "soft" and not "hard", I view this as a situation where there are many possible "policies", and choosing between them is a matter of opinion.
|
||||||
Multiple policy/opinion territory is a clear place to cut scope for the first version.
|
Multiple policy/opinion territory is a clear place to cut scope for the first version.
|
||||||
|
|
||||||
Two however I consider more series
|
Downside two, however, I consider more serious
|
||||||
--- it would be really annoying to always download GCC whenever you just want some cached binary built with Clang/some cached binary built with Clang.
|
--- it would be really annoying to always download GCC whenever you just want some cached binary built with GCC.
|
||||||
Yes, you can GC that Clang right away, but that just makes the problem seem sillier.
|
Yes, you can GC that GCC right away, so there is no wasted disk space, but there is still the wasted time waiting for the download, and wasted network usage.
|
||||||
|
Downloading to then delete is not a solution, but just exposes how artificial and silly the status quo is.
|
||||||
|
|
||||||
[Nix#11928] is this something I consider required to fix if we're going to get rid of deep realisations (as I propose).
|
[Nix#11928] is thus something I consider required to fix if we're going to get rid of deep realisations (as I propose).
|
||||||
The good thing is that we can simply change the scheduling logic so it's no longer a problem.
|
The good thing is that we can simply change the scheduling logic so it's no longer a problem.
|
||||||
The fix is conceptually simple enough: we can resolve derivations (normalize their inputs) without actually downloading those inputs.
|
The fix is conceptually simple enough: we can resolve derivations (normalize their inputs) without actually downloading those inputs.
|
||||||
We just look up build trace key-value pairs and substitute within the derivation accordingly.
|
We just look up build trace key-value pairs and substitute within the derivation accordingly.
|
||||||
The less good news is that it is a bit harder than it sounds to implement, because the scheduling code is currently such a confusing mess.
|
The less good news is that it is a bit harder than it sounds to implement, because the scheduling code was such a confusing mess.
|
||||||
|
|
||||||
#### Low level
|
#### Low level
|
||||||
|
|
||||||
This in turn leads me to [Nix#12663].
|
This in turn leads me to [Nix#12663].
|
||||||
To make progress on the scheduling code (and actually a bunch of other issues, which I'll hopefully get to), we need to untangle scheduling and building.
|
To make progress on the scheduling code (and actually a bunch of other issues, which I'll hopefully get to), we need to untangle scheduling and building.
|
||||||
Only then we'll we have a "clean workbench" upon which we can address reworking the scheduling logic for [Nix#11928] (and hte other issues too).
|
Only then will we have a "clean workbench" upon which we can address reworking the scheduling logic for [Nix#11928] (and the other issues too).
|
||||||
This might sound hard, but it actually isn't so bad --- it's just long overdue.
|
This might sound hard, but it actually isn't so bad --- it's just long overdue.
|
||||||
(*Not* doing this and attempting to fix the issues anyways is much harder.)
|
(*Not* doing this and attempting to fix the issues anyways is much harder.)
|
||||||
|
|
||||||
After Planet Nix, @L-as and I started on a "bottom up" approach to this, which is the one outlined in [Nix#12663].
|
After Planet Nix, @L-as and I started on a "bottom up" approach to this, which is the one outlined in [Nix#12663].
|
||||||
\[You should now just read that issue, it attempts to lay out a roadmap also --- if I said more here I would be just inlining the ticket.\]
|
\[You should now just read that issue, it attempts to lay out a roadmap also --- if I said more here I would be just inlining the ticket.\]
|
||||||
So far, we got [Nix#12630] and [Nix#12662] done, and have [Nix#12658] and [Nix#12658] "on deck".
|
So far, we got [Nix#12630], [Nix#12662], and [Nix#12658] done, and [Nix#12668] "on deck".
|
||||||
This will get local building pretty well "off to the side".
|
This will get local building pretty well "off to the side".
|
||||||
Then we do something similar for remote building (maybe just moving the hook code, or maybe indulging a little scope creep and getting rid of it altogether per [Nix#5025]).
|
Then we do something similar for remote building (maybe just moving the hook code, or maybe indulging a little scope creep and getting rid of it altogether per [Nix#5025]).
|
||||||
At that point, the building logic (local and remote cases) will be completely "out of the way", and we should be able to solve [Nix#11928].
|
At that point, the building logic (local and remote cases) will be completely "out of the way", and we should be able to solve [Nix#11928].
|
||||||
And at *that* point, we can (with some stop-gap for local GC) fix #11896, just ripping out shallow derivations.
|
And at *that* point, we can (with some stop-gap for local GC) fix [Nix#11896], just ripping out deep realisations.
|
||||||
|
|
||||||
Along with / right after doing [Nix#11896], we can also do [Nix#11897].
|
Along with / right after doing [Nix#11896], we can also do [Nix#11897].
|
||||||
This is a good simple cleanup --- the scheduling changes and lack of deep realisations mean that there is absolutely no need to hash derivations "modulo fixed-output derivations", because resolved derivations never depend on fixed-output derivations (because they never depend on any derivation's output at all).
|
This is a good simple cleanup --- the scheduling changes and lack of deep realisations mean that there is absolutely no need to hash derivations "modulo fixed-output derivations", because resolved derivations never depend on fixed-output derivations (because they never depend on any derivation's output at all).
|
||||||
@ -91,7 +92,7 @@ We can go back to just using derivation paths.
|
|||||||
#### Hydra
|
#### Hydra
|
||||||
|
|
||||||
With the Nix changes done, the next task is getting Hydra to work with the revamped system.
|
With the Nix changes done, the next task is getting Hydra to work with the revamped system.
|
||||||
This is especially important given my "server first" approach --- I want to see us building at scale to find and eradicate problems before I worry about anyone actually building this stuff.
|
This is especially important given my "server first" approach --- I want to see us building at scale to find and eradicate problems before I worry about regular users actually using this stuff.
|
||||||
This should be a very simple fix --- Hydra already computes deep and shallow realisations and uploads both. It just needs to stop doing the former.
|
This should be a very simple fix --- Hydra already computes deep and shallow realisations and uploads both. It just needs to stop doing the former.
|
||||||
|
|
||||||
One interesting thing to note is we should also upload the resolved derivations that the shallow realisation refers to
|
One interesting thing to note is we should also upload the resolved derivations that the shallow realisation refers to
|
||||||
@ -113,8 +114,10 @@ The linked issue contains a discussion of alternatives, I lean towards something
|
|||||||
|
|
||||||
#### Rollout, Nixpkgs, RFC
|
#### Rollout, Nixpkgs, RFC
|
||||||
|
|
||||||
This is probably the most contentious part, and the least "technical stuff I can just do myself", so I don't want to speculate too much.
|
Whereas the above is mostly "technical stuff I can just *do* without having to ask anyone for permission", this part rests squarely on community buy-in.
|
||||||
But basically I see a path like this:
|
I think what follows is a good process to follow, but, of course, no one knows for sure how the community will react until they do.
|
||||||
|
|
||||||
|
This is the roadmap I have in mind; the "...." indicates perhaps more intermediate steps to gain confidence in the new way things work before a major "flip the switch" milestone.
|
||||||
|
|
||||||
1. Implement and document, per the above
|
1. Implement and document, per the above
|
||||||
2. Do a lot of builds of Nixpkgs, publicly, with a public cache.
|
2. Do a lot of builds of Nixpkgs, publicly, with a public cache.
|
||||||
@ -180,7 +183,7 @@ Dynamic derivations is a relatively "cheap" extension to content-addressing deri
|
|||||||
|
|
||||||
[Nix#12630]: https://github.com/NixOS/nix/pull/12630
|
[Nix#12630]: https://github.com/NixOS/nix/pull/12630
|
||||||
[Nix#12658]: https://github.com/NixOS/nix/pull/12658
|
[Nix#12658]: https://github.com/NixOS/nix/pull/12658
|
||||||
[Nix#12658]: https://github.com/NixOS/nix/pull/12668
|
[Nix#12668]: https://github.com/NixOS/nix/pull/12668
|
||||||
[Nix#12662]: https://github.com/NixOS/nix/pull/12662
|
[Nix#12662]: https://github.com/NixOS/nix/pull/12662
|
||||||
[Nix#12662]: https://github.com/NixOS/nix/pull/12662
|
[Nix#12662]: https://github.com/NixOS/nix/pull/12662
|
||||||
[Nix#12591]: https://github.com/NixOS/nix/pull/12591
|
[Nix#12591]: https://github.com/NixOS/nix/pull/12591
|
||||||
|
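As a rough illustration of the resolution step the README text in this diff describes (looking up build trace key-value pairs and substituting them into a derivation, without fetching any inputs), here is a minimal Python sketch. It is not Nix's actual implementation; the `Drv` type, the `resolve` function, the shape of the build trace, and the store paths are all hypothetical stand-ins chosen for the example.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Drv:
    """Hypothetical, simplified derivation (not Nix's real data model)."""
    name: str
    # Inputs that are outputs of other derivations: (derivation, output name).
    input_drvs: Tuple[Tuple["Drv", str], ...] = ()
    # Inputs that are already "opaque" store paths.
    input_srcs: FrozenSet[str] = frozenset()


# Assumed shape of a shallow build trace: the key is a resolved derivation
# plus an output name, the value is the store path that output was built to.
BuildTrace = Dict[Tuple[Drv, str], str]


def resolve(drv: Drv, trace: BuildTrace) -> Optional[Drv]:
    """Rewrite drv so it depends only on opaque store paths.

    Only the trace is consulted; nothing is built or downloaded.
    Returns None if some input has no build trace entry.
    """
    srcs = set(drv.input_srcs)
    for dep, output in drv.input_drvs:
        resolved_dep = resolve(dep, trace)
        if resolved_dep is None:
            return None
        path = trace.get((resolved_dep, output))
        if path is None:
            return None
        srcs.add(path)
    return Drv(drv.name, (), frozenset(srcs))


# Made-up example: "hello" depends on the "out" output of "gcc".
gcc = Drv("gcc")
hello = Drv("hello", input_drvs=((gcc, "out"),))

trace: BuildTrace = {
    (gcc, "out"): "/nix/store/aaaa-gcc",
    (Drv("hello", input_srcs=frozenset({"/nix/store/aaaa-gcc"})), "out"):
        "/nix/store/bbbb-hello",
}

resolved_hello = resolve(hello, trace)
# The resolved derivation is itself a build trace key, so the final output
# path can be found without ever fetching GCC.
hello_out = trace.get((resolved_hello, "out"))
```

In this sketch only the path stored in `hello_out` would need to be substituted; the GCC store path is used purely as part of a lookup key.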