-
Notifications
You must be signed in to change notification settings - Fork 94
Description
Description
It should be possible for various sub-types of RevCommit and RevWalk to access and use the information stored in the commit graph. This is mostly relevant for EGit's SWTWalk and SWTCommit. These types cannot currently use the information (see additional context for why) from the commit graph, meaning that for example:
- An incremental implementation of the topological sorting akin to git cannot use commit graph generation numbers to produce commits incrementally. The entire history must be processed first
- The Changed Path Bloom filter cannot be used and a TreeWalk is performed for every commit in
TreeRevFilter.include()
both leading to a bad performance of EGit's History View
Motivation
Dicussion #243 describes the performance problems we are currently facing in regards to EGits History View when viewing the changes of the currently selected resource. I am currently working on improving the performance of JGit's topological sorting by allowing an incremental implementation similar to git itself (here).
During Development I noticed, that even though I generated a commit graph (git commit-graph write --changed-paths) and enabled commit graphs in the git config (core.commitGraph), my implementation of git's topological sorting algorithm (which uses generation numbers obtained from the commit graph) only ever saw Constants.COMMIT_GENERATION_UNKNOWN for getGeneration() (since the data from the commit graph is never parsed for SWTCommits).
Additionally the Changed Path Bloom filter was never used and a Tree Walk was performed for every commit.
Alternatives considered
One possibility is replacing the default implementation in RevCommit with concrete lookups in the commit graph, this has the following downsides:
- A new
CommitDataobject would be instantiated for every invocation ofgetGeneration()andgetChangedPathFilter() - Even if commit graph data is available, any
RevCommitthat is not aRevCommitCGwould still be parsed from the commit object instead of the commit graph (SeeRevCommit.parseCanonical()andRevCommitCG.parseInGraph())
My workaround for the incremental topological sorting algorithm is to query the RevWalk.commitGraph() if RevCommit.getGeneration() does not return a usable value: https://github.com/vi-eclipse/jgit/blob/06b19880b694b20a7e76f388345a80fc64243359/org.eclipse.jgit/src/org/eclipse/jgit/revwalk/TopoSortPendingGenerator.java#L287-L305
This approach is not possible for ChangedPathTreeFilter.shouldTreeWalk(), since it cannot access the package-private RevWalk.commitGraph()
Additional context
RevCommit currently offers two APIs that provide information stored in the Commit Graph:
getGeneration()for determining the Generation Number of a commitgetChangedPathFilter()for retrieving a bloom filter for the modified paths
RevCommit itself always returns Constants.COMMIT_GENERATION_UNKNOWN for getGeneration() and null for getChangedPathFilter(), these Methods are only overridden in RevCommitCG. RevCommitCG instances are created by a RevWalk when a position in the commit graph can be found for the desired commit.
jgit/org.eclipse.jgit/src/org/eclipse/jgit/revwalk/RevWalk.java
Lines 1772 to 1777 in 42f6237
| private RevCommit createCommit(AnyObjectId id, int graphPos) { | |
| if (graphPos >= 0) { | |
| return new RevCommitCG(id, graphPos); | |
| } | |
| return new RevCommit(id); | |
| } |
However, this only applies when strictly using
RevWalk, when using a subtype of RevWalk (i.e. GitHistoryWalk), none of the information present in the commit graph is used, even though it could be used to optimize both topological sorting and filtering by Changed paths.Subtypes of
RevCommit, which are located in a different package (such as PlotCommit), cannot currently override and implement getGeneration() and getChangedPathFilter() because:
RevCommit.getGeneration()has package-private visibility- Generation Numbers (and other information from the CommitData) for
RevCommitCGare parsed by overriding theRevCommit.parseCanonical()method, which also has package-private visibility - Even if subtypes of
RevCommitoverride and implementgetChangedPathFilter(), they cannot retrieve the Commit Graph via theRevWalk.commitGraph()method, since it has package-private visibility