git config -f .gitmodules submodule.<name>.shallow true
See "Git submodule without extra weight" for more.
Git 2.13 (Q2 2017) does add this in commit 8d3047c (19 Apr 2017) by Sebastian Schuberth (sschuberth).
(Merged by Sebastian Schuberth -- sschuberth -- in commit 8d3047c, 20 Apr 2017)
a clone of this submodule will be performed as a shallow clone (with a history depth of 1)
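As a minimal sketch of what that setting looks like on disk, the following runs in a throwaway repository. The submodule name mylib, its path and its URL are placeholders chosen for illustration; they do not come from the answer.

```shell
#!/bin/sh
# Sketch: record "shallow = true" for a hypothetical submodule named "mylib".
set -e
demo=$(mktemp -d)            # throwaway directory
cd "$demo"
git init -q .

# These keys are written to .gitmodules, not to .git/config.
git config -f .gitmodules submodule.mylib.path libs/mylib
git config -f .gitmodules submodule.mylib.url https://example.com/mylib.git  # placeholder URL
git config -f .gitmodules submodule.mylib.shallow true

# Read the value back as a boolean to confirm it was stored.
shallow=$(git config -f .gitmodules --bool --get submodule.mylib.shallow)
echo "submodule.mylib.shallow = $shallow"
```

Because the key lives in .gitmodules, it only reaches other users once that file is committed.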
However, Ciro Santilli adds in the comments (and details in his answer):
shallow = true on .gitmodules only affects the reference tracked by the HEAD of the remote when using --recurse-submodules, even if the target commit is pointed to by a branch, and even if you put branch = mybranch on the .gitmodules as well.
Git 2.20 (Q4 2018) improves on the submodule support, which has been updated to read from the blob at HEAD:.gitmodules when the .gitmodules file is missing from the working tree.
See commit 2b1257e, commit 76e9bdc (25 Oct 2018), and commit b5c259f, commit 23dd8f5, commit b2faad4, commit 2502ffc, commit 996df4d, commit d1b13df, commit 45f5ef3, commit bcbc780 (05 Oct 2018) by Antonio Ospite (ao2).
(Merged by Junio C Hamano -- gitster -- in commit abb4824, 13 Nov 2018)
submodule: support reading .gitmodules when it's not in the working tree
When the .gitmodules file is not available in the working tree, try using the content from the index and from the current branch.
This covers the case when the file is part of the repository but for some reason it is not checked out, for example because of a sparse checkout.
This makes it possible to use at least the 'git submodule' commands which read the gitmodules configuration file without fully populating the working tree.
Writing to .gitmodules will still require that the file is checked out, so check for that before calling config_set_in_gitmodules_file_gently.
Add a similar check also in git-submodule.sh::cmd_add() to anticipate the eventual failure of the "git submodule add" command when .gitmodules is not safely writeable; this prevents the command from leaving the repository in a spurious state (e.g. the submodule repository was cloned but .gitmodules was not updated because config_set_in_gitmodules_file_gently failed).
Moreover, since config_from_gitmodules() now accesses the global object store, it is necessary to protect all code paths which call the function against concurrent access to the global object store.
Currently this only happens in builtin/grep.c::grep_submodules(), so call grep_read_lock() before invoking code involving config_from_gitmodules().
NOTE: there is one rare case where this new feature does not work properly yet: nested submodules without .gitmodules in their working tree.
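The idea of reading the configuration from a blob rather than from a checked-out file can be tried by hand. This sketch uses the separate git config --blob option to illustrate it; it is not the internal code path changed by the commit above, and the submodule name and URL are placeholders.

```shell
#!/bin/sh
# Sketch: read submodule configuration from HEAD:.gitmodules when the file
# is absent from the working tree. "mylib" and its URL are placeholders.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .

git config -f .gitmodules submodule.mylib.path libs/mylib
git config -f .gitmodules submodule.mylib.url https://example.com/mylib.git
git add .gitmodules
git -c user.name=demo -c user.email=demo@example.com commit -q -m "Add .gitmodules"

# Simulate a checkout that does not populate the file (e.g. sparse checkout).
rm .gitmodules

# --blob makes "git config" read from a blob instead of a file on disk.
url=$(git config --blob HEAD:.gitmodules --get submodule.mylib.url)
echo "url from HEAD:.gitmodules: $url"
```

As the commit message notes, this only covers reading; writing to .gitmodules still needs the file in the working tree.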
Note: Git 2.24 (Q4 2019) fixes a possible segfault when cloning a submodule shallow.
See commit ddb3c85 (30 Sep 2019) by Ali Utku Selen (auselen).
(Merged by Junio C Hamano -- gitster -- in commit 678a9ca, 09 Oct 2019)
Git 2.25 (Q1 2020) clarifies the git submodule update documentation.
See commit f0e58b3 (24 Nov 2019) by Philippe Blain (phil-blain).
(Merged by Junio C Hamano -- gitster -- in commit ef61045, 05 Dec 2019)
doc: mention that 'git submodule update' fetches missing commits
Helped-by: Junio C Hamano
Helped-by: Johannes Schindelin
Signed-off-by: Philippe Blain
'git submodule update' will fetch new commits from the submodule remote if the SHA-1 recorded in the superproject is not found. This was not mentioned in the documentation.
Warning: With Git 2.25 (Q1 2020), the interaction between "git clone --recurse-submodules" and alternate object store was ill-designed.
The documentation and code have been taught to make more clear recommendations when the users see failures.
See commit 4f3e57e, commit 10c64a0 (02 Dec 2019) by Jonathan Tan (jhowtan).
(Merged by Junio C Hamano -- gitster -- in commit 5dd1d59, 10 Dec 2019)
Signed-off-by: Jonathan Tan
Acked-by: Jeff King
When recursively cloning a superproject with some shallow modules defined in its .gitmodules, then recloning with "--reference=<path>", an error occurs. For example:
git clone --recurse-submodules --branch=master -j8 \
https://android.googlesource.com/platform/superproject \
master
git clone --recurse-submodules --branch=master -j8 \
https://android.googlesource.com/platform/superproject \
--reference master master2
fails with:
fatal: submodule '<snip>' cannot add alternate: reference repository
'<snip>' is shallow
When an alternate computed from the superproject's alternate cannot be added, whether in this case or another, advise about configuring the "submodule.alternateErrorStrategy" configuration option and using "--reference-if-able" instead of "--reference" when cloning.
That is detailed in:
Doc: explain submodule.alternateErrorStrategy
Signed-off-by: Jonathan Tan
Acked-by: Jeff King
Commit 31224cbdc7 ("clone: recursive and reference option triggers submodule alternates", 2016-08-17, Git v2.11.0-rc0 -- merge listed in batch #1) taught Git to support the configuration options "submodule.alternateLocation" and "submodule.alternateErrorStrategy" on a superproject.
If "submodule.alternateLocation" is configured to "superproject" on a superproject, whenever a submodule of that superproject is cloned, it instead computes the analogous alternate path for that submodule from $GIT_DIR/objects/info/alternates of the superproject, and references it.
The "submodule.alternateErrorStrategy" option determines what happens if that alternate cannot be referenced.
However, it is not clear that the clone proceeds as if no alternate was specified when that option is not set to "die" (as can be seen in the tests in 31224cbdc7).
Therefore, document it accordingly.
The config submodule documentation now includes:
submodule.alternateErrorStrategy::
Specifies how to treat errors with the alternates for a submodule as computed via submodule.alternateLocation.
Possible values are ignore, info, die.
Default is die.
Note that if set to ignore or info, and if there is an error with the computed alternate, the clone proceeds as if no alternate was specified.
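A minimal sketch of setting the two options described above, assuming you want a failed alternate to be reported but not fatal. The repository name super is a stand-in for an existing superproject clone.

```shell
#!/bin/sh
# Sketch: configure how submodule alternates are computed and what happens
# when one cannot be used. "super" is a placeholder superproject.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q super

# Derive each submodule's alternate from the superproject's own alternates.
git -C super config submodule.alternateLocation superproject
# Report a problem with the computed alternate, then continue without it.
git -C super config submodule.alternateErrorStrategy info

strategy=$(git -C super config --get submodule.alternateErrorStrategy)
echo "submodule.alternateErrorStrategy = $strategy"
```

With the default value die, the reclone shown earlier stops at the "cannot add alternate" error; the commit message suggests --reference-if-able as the clone-time counterpart.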
Note: "git submodule update --quiet" did not propagate the quiet option down to underlying git fetch, which has been corrected with Git 2.32 (Q2 2021).
See commit 62af4bd (30 Apr 2021) by Nicholas Clark (nwc10).
(Merged by Junio C Hamano -- gitster -- in commit 74339f8, 11 May 2021)
submodule update: silence underlying fetch with "--quiet"
Signed-off-by: Nicholas Clark
Commands such as
$ git submodule update --quiet --init --depth=1
involving shallow clones, call the shell function fetch_in_submodule, which in turn invokes git fetch.
Pass the --quiet option onward there.
Following Ryan's answer I was able to come up with this simple script which iterates through all submodules and shallow clones them:
#!/bin/bash
git submodule init
# Note: this assumes each submodule's name in .gitmodules equals its path.
for i in $(git submodule | sed -e 's/.* //'); do
    spath=$(git config -f .gitmodules --get "submodule.$i.path")
    surl=$(git config -f .gitmodules --get "submodule.$i.url")
    git clone --depth 1 "$surl" "$spath"
done
git submodule update
Reading through the git-submodule "source", it looks like git submodule add can handle submodules that already have their repositories present. In that case...
$ git clone $remote1 $repo
$ cd $repo
$ git clone --depth 5 $remotesub1 $sub1
$ git submodule add $remotesub1 $sub1
#repeat as necessary...
You'll want to make sure the required commit is in the submodule repo, so make sure you set an appropriate --depth.
Edit: You may be able to get away with multiple manual submodule clones followed by a single update:
$ git clone $remote1 $repo
$ cd $repo
$ git clone --depth 5 $remotesub1 $sub1
#repeat as necessary...
$ git submodule update
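The depth caveat above can be checked before running git submodule update. The following is a sketch under stated assumptions: every repository here is created locally for the demo, and the names upstream, super and sub1 are placeholders. It looks up the commit the superproject records for the submodule path and deepens the manual clone only if that commit is missing.

```shell
#!/bin/sh
# Sketch: verify that a manual shallow clone contains the commit recorded by
# the superproject, and deepen it if not. All names are placeholders.
set -e
demo=$(mktemp -d)
cd "$demo"
ident="-c user.name=demo -c user.email=demo@example.com"

# Submodule upstream with three commits; the superproject pins the oldest.
git init -q upstream
for n in 1 2 3; do
    git -C upstream $ident commit -q --allow-empty -m "commit $n"
done
pinned=$(git -C upstream rev-parse HEAD~2)

# Superproject recording $pinned as the gitlink for path "sub1".
git init -q super
git -C super update-index --add --cacheinfo 160000,"$pinned",sub1
git -C super $ident commit -q -m "Pin sub1"

# Manual shallow clone; file:// is used because --depth is ignored for
# plain local-path clones.
git clone -q --depth 1 "file://$demo/upstream" super/sub1

recorded=$(git -C super rev-parse HEAD:sub1)
if git -C super/sub1 cat-file -e "$recorded^{commit}" 2>/dev/null; then
    before=present
else
    before=missing
    # Deepen until the recorded commit is included (3 is enough here).
    git -C super/sub1 fetch -q --depth 3 origin
fi
git -C super/sub1 cat-file -e "$recorded^{commit}"
echo "at depth 1 the recorded commit was: $before"
```

In a real project the right depth is not known in advance, so the fetch may need to be repeated with a larger value.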
shallow = true in .gitmodules only affects git clone --recurse-submodules if the HEAD of the remote submodule points to the required commit, even if the target commit is pointed to by a branch, and even if you put branch = mybranch on the .gitmodules as well.
Local test script. Same behaviour on GitHub 2017-11, where HEAD is controlled by the default branch repo setting:
git clone --recurse-submodules https://github.com/cirosantilli/test-shallow-submodule-top-branch-shallow
cd test-shallow-submodule-top-branch-shallow/mod
git log
# Multiple commits, not shallow.
git clone --recurse-submodules --shallow-submodules fails if the commit is referenced by neither a branch nor a tag, with the message: error: Server does not allow request for unadvertised object.
Local test script. Same behaviour on GitHub:
git clone --recurse-submodules --shallow-submodules https://github.com/cirosantilli/test-shallow-submodule-top-sha
# error
I also asked on the mailing list: https://marc.info/?l=git&m=151863590026582&w=2 and the reply was:
In theory this should be easy. :)
In practice not so much, unfortunately. This is because cloning will just obtain
the latest tip of a branch (usually master). There is no mechanism in clone
to specify the exact sha1 that is wanted.
The wire protocol supports for asking exact sha1s, so that should be covered.
(Caveat: it only works if the server operator enables
uploadpack.allowReachableSHA1InWant which github has not AFAICT)
git-fetch allows to fetch arbitrary sha1, so as a workaround you can run a fetch
after the recursive clone by using "git submodule update" as that will use
fetches after the initial clone.
TODO test: allowReachableSHA1InWant.
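One way to experiment with the option from that reply is a purely local setup, sketched below. The repository named server stands in for a remote whose configuration you control, and the names are placeholders. The sketch does not show what happens with the option off, and it says nothing about how any hosting service is configured.

```shell
#!/bin/sh
# Sketch: fetch one exact, non-tip commit with --depth 1 from a repository
# that has uploadpack.allowReachableSHA1InWant enabled. Names are placeholders.
set -e
demo=$(mktemp -d)
cd "$demo"

# "server" stands in for a remote whose configuration you control.
git init -q server
for n in 1 2 3; do
    git -C server -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "commit $n"
done
git -C server config uploadpack.allowReachableSHA1InWant true

# Pick a commit that is reachable but is not a branch tip.
wanted=$(git -C server rev-parse HEAD~1)

# Fetch exactly that commit, with no history behind it.
git init -q client
git -C client fetch -q --depth 1 "file://$demo/server" "$wanted"
fetched=$(git -C client rev-parse FETCH_HEAD)
depth=$(git -C client rev-list --count FETCH_HEAD)
echo "fetched $fetched with $depth commit(s) of history"
```

This is the same kind of fetch that the reply describes git submodule update performing after the initial clone.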
Are the canonical locations for your submodules remote? If so, are you OK with cloning them once? In other words, do you want the shallow clones just because you are suffering the wasted bandwidth of frequent submodule (re)clones?
If you want shallow clones to save local diskspace, then Ryan Graham's answer seems like a good way to go. Manually clone the repositories so that they are shallow. If you think it would be useful, adapt git submodule to support it. Send an email to the list asking about it (advice for implementing it, suggestions on the interface, etc.). In my opinion, the folks there are quite supportive of potential contributors that earnestly want to enhance Git in constructive ways.
If you are OK with doing one full clone of each submodule (plus later fetches to keep them up to date), you might try using the --reference option of git submodule update (it is in Git 1.6.4 and later) to refer to local object stores (e.g. make --mirror clones of the canonical submodule repositories, then use --reference in your submodules to point to these local clones). Just be sure to read about git clone --reference/git clone --shared before using --reference. The only likely problem with referencing mirrors would be if they ever end up fetching non-fast-forward updates (though you could enable reflogs and expand their expiration windows to help retain any abandoned commits that might cause a problem). You should not have any problems as long as
- you do not make any local submodule commits, or
- any commits that are left dangling by non-fast-forwards that the canonical repositories might publish are not ancestors to your local submodule commits, or
- you are diligent about keeping your local submodule commits rebased on top of whatever non-fast-forwards might be published in the canonical submodule repositories.
If you go with something like this and there is any chance that you might carry local submodule commits in your working trees, it would probably be a good idea to create an automated system that makes sure critical objects referenced by the checked-out submodules are not left dangling in the mirror repositories (and if any are found, copies them to the repositories that need them).
And, like the git clone manpage says, do not use --reference if you do not understand these implications.
# Full clone (mirror), done once.
git clone --mirror $sub1_url $path_to_mirrors/$sub1_name.git
git clone --mirror $sub2_url $path_to_mirrors/$sub2_name.git
# Reference the full clones any time you initialize a submodule
git clone $super_url super
cd super
git submodule update --init --reference $path_to_mirrors/$sub1_name.git $sub1_path_in_super
git submodule update --init --reference $path_to_mirrors/$sub2_name.git $sub2_path_in_super
# To avoid extra packs in each of the superprojects' submodules,
# update the mirror clones before any pull/merge in super-projects.
for p in $path_to_mirrors/*.git; do GIT_DIR="$p" git fetch; done
cd super
git pull # merges in new versions of submodules
git submodule update # update sub refs, checkout new versions,
# but no download since they reference the updated mirrors
Alternatively, instead of --reference, you could use the mirror clones in combination with the default hardlinking functionality of git clone by using local mirrors as the source for your submodules. In new super-project clones, do git submodule init, edit the submodule URLs in .git/config to point to the local mirrors, then do git submodule update. You would need to reclone any existing checked-out submodules to get the hardlinks. You would save bandwidth by only downloading once into the mirrors, then fetching locally from those into your checked-out submodules. The hard linking would save disk space (although fetches would tend to accumulate and be duplicated across multiple instances of the checked-out submodules' object stores; you could periodically reclone the checked-out submodules from the mirrors to regain the disk space saving provided by hardlinking).
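The init, edit URL, update sequence can be sketched end to end with local stand-ins. Every repository below (upstream-sub1, mirrors/sub1.git, super, work) is a placeholder created for the demo, and the protocol.file.allow override is an assumption needed only because the demo clones submodules from local paths, which recent Git versions refuse by default.

```shell
#!/bin/sh
# Sketch: make a superproject clone fetch its submodule from a local mirror.
# All repository names and paths below are placeholders created for the demo.
set -e
demo=$(mktemp -d)
cd "$demo"
ident="-c user.name=demo -c user.email=demo@example.com"
# Recent Git versions refuse local-path submodule clones unless
# protocol.file.allow permits them.
allow="-c protocol.file.allow=always"

# Canonical submodule repository (stands in for the remote) and its mirror.
git init -q upstream-sub1
git -C upstream-sub1 $ident commit -q --allow-empty -m "initial"
git clone -q --mirror upstream-sub1 mirrors/sub1.git

# Superproject that records the canonical location in .gitmodules.
git init -q super
git -C super $allow submodule --quiet add "$demo/upstream-sub1" sub1
git -C super $ident commit -q -m "Add sub1"

# New working clone: init, redirect the URL in .git/config, then update.
git clone -q super work
git -C work submodule --quiet init
git -C work config submodule.sub1.url "$demo/mirrors/sub1.git"
git -C work $allow submodule --quiet update

origin=$(git -C work/sub1 config --get remote.origin.url)
echo "work/sub1 was cloned from: $origin"
```

The override lives only in the clone's .git/config, so .gitmodules keeps the canonical URL for everyone else.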
Reference to How to clone git repository with specific revision/changeset?
I have written a simple script which works even when your submodule reference is not at the tip of master:
git submodule foreach --recursive 'git rev-parse HEAD | xargs -I {} git fetch origin {} && git reset --hard FETCH_HEAD'
This statement will fetch the referenced version of each submodule.
It is fast, but you cannot commit your edits in the submodule until you unshallow it with git fetch --unshallow (see https://stackoverflow.com/a/17937889/3156509).
In full:
#!/bin/bash
git submodule init
git submodule foreach --recursive 'git rev-parse HEAD | xargs -I {} git fetch origin {} && git reset --hard FETCH_HEAD'
git submodule update --recursive
I created a slightly different version, for when the superproject does not track the bleeding edge of its submodules, which not all projects do. The standard submodule additions didn't work, nor did the script above. So I added a hash lookup for the tag ref, and if the commit doesn't have one, it falls back to a full clone.
#!/bin/bash
git submodule init
git submodule | while read hash name junk; do
    spath=$(git config -f .gitmodules --get "submodule.$name.path")
    surl=$(git config -f .gitmodules --get "submodule.$name.url")
    # Find a tag whose ref points at the recorded commit (status prefix stripped).
    sbr=$(git ls-remote --tags "$surl" | sed -r "/${hash:1}/ s|^.*tags/([^^]+).*\$|\1|p;d")
    if [ -z "$sbr" ]; then
        # No matching tag: fall back to a full clone.
        git clone "$surl" "$spath"
    else
        git clone -b "$sbr" --depth 1 --single-branch "$surl" "$spath"
    fi
done
git submodule update
A shallow clone of a submodule is ideal because a submodule is a snapshot at a particular revision/changeset. It's easy to download a zip from the website, so I tried writing a script.
#!/bin/bash
git submodule deinit --all -f
for value in $(git submodule | perl -pe 's/.*(\w{40})\s([^\s]+).*/\1:\2/'); do
    mysha=${value%:*}
    mysub=${value#*:}
    myurl=$(grep -A2 -Pi "path = $mysub" .gitmodules | grep -Pio '(?<=url =).*/[^.]+')
    mydir=$(dirname $mysub)
    wget $myurl/archive/$mysha.zip
    unzip $mysha.zip -d $mydir
    test -d $mysub && rm -rf $mysub
    mv $mydir/*-$mysha $mysub
    rm $mysha.zip
done
git submodule init
git submodule deinit --all -f clears the submodule tree which allows the script to be reusable.
git submodule retrieves the 40 char sha1 followed by a path that corresponds to the same in .gitmodules. I use perl to concatenate this information, delimited by a colon, then employ variable transformation to separate the values into mysha and mysub.
These are the critical keys because we need the sha1 to download and the path to correlate the url in .gitmodules.
Given a typical submodule entry:
[submodule "label"]
path = localpath
url = https://github.com/repository.git
myurl keys on path = then looks 2 lines after to get the value. This method may not work consistently and may require refinement. The url grep strips any remaining .git type references by matching to the last / and anything up to a . (dot).
mydir is mysub minus a final /name, which would be the directory leading up to the submodule name.
Next is a wget with the format of the downloadable zip archive url. This may change in future.
Unzip the file to mydir, which would be the subdirectory specified in the submodule path. The resultant folder will be the last element of the url, followed by -sha1.
Check to see if the subdirectory specified in the submodule path exists and remove it to allow renaming of the extracted folder.
mv renames the extracted folder containing our sha1 to its correct submodule path.
Delete the downloaded zip file.
Submodule init
This is more a WIP proof of concept rather than a solution. When it works, the result is a shallow clone of a submodule at a specified changeset.
Should the repository re-home a submodule to a different commit, re-run the script to update.
The only time a script like this would be useful is for non-collaborative local building of a source project.
I needed a way to shallow clone submodules when I cannot affect how the main repo is cloned.
Based on one solution above:
#!/bin/bash
git submodule init
for i in $(git submodule | sed -e 's/.* //'); do
    git submodule update --init --depth 1 -- "$i"
done