Git Submodules vs. Google’s Repo Tool

Last updated on Mar 16, 2023

Anton Weiss
Anton Weiss is a DevOps Enablement Expert with hands-on experience throughout the software development life cycle. At Otomato, he consults for DevOps, architecture and...

I was recently asked by a customer to outline the pros and cons of using Git submodules vs. Google’s Repo tool to manage multi-repository integrations in Git.

There are a lot of articles on the internet bashing each of the tools, but in our opinion most of the criticism comes from misunderstanding a tool’s design or trying to apply it in an inappropriate context.

This post summarizes the general rules of thumb we at Otomato follow when choosing a solution for this admittedly nontrivial situation.

First of all, whenever possible, we recommend integrating your components at the binary package level rather than compiling everything from source each time. That is: package components as jars, npm packages, eggs, rpms or Docker images, upload them to a binary repository and pull them in as versioned dependencies during the build.

Still, sometimes this is not an optimal solution, especially if you do a lot of feature branch development (which in itself is an antipattern in the classical Continuous Delivery approach; see here for example).

For these cases we stick to the following guidelines.

Git Submodules:

Pros:

1. An integrated solution, part of git since v1.5

2. Deterministic relationship definition (parent project always points to a specific commit in submodule)

3. Integration points are recorded in parent repo.

4. Easy to recreate historical configurations.

5. Total separation of lifecycles between the parent and the submodules.

6. Supported by the Jenkins Git plugin.

Cons:

1. Management overhead. (Need separate clones to introduce changes in submodules)

2. Developers get confused if they don’t understand the inner mechanics.

3. Need for additional commands (‘clone --recursive’ and ‘submodule update’)

4. External tools support is not perfect (Bitbucket, SourceTree, IDE plugins)
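To make the "specific commit" point from the pros list concrete: Git records a submodule in the parent repository's tree as an entry with mode 160000 that holds one exact commit hash. The sketch below is only an illustration, not part of any Git tooling. It parses text in the format printed by `git ls-tree` and lists those pins. The function name, the sample listing and the hashes in it are all made up for the example.

```python
import re

# Mode 160000 is how Git marks a "gitlink" tree entry, i.e. a submodule
# pinned to one exact commit of another repository.
GITLINK_MODE = "160000"

# Each `git ls-tree` line looks like: "<mode> <type> <object>\t<path>"
_ENTRY = re.compile(r"^(\d{6}) (\w+) ([0-9a-f]+)\t(.+)$")


def pinned_submodules(ls_tree_output):
    """Return {path: commit_sha} for every submodule entry in the listing."""
    pins = {}
    for line in ls_tree_output.splitlines():
        match = _ENTRY.match(line)
        if match and match.group(1) == GITLINK_MODE:
            pins[match.group(4)] = match.group(3)
    return pins


# Hypothetical listing, shaped like `git ls-tree HEAD` output for a parent repo.
SAMPLE = (
    "100644 blob 8f94139338f9404f26296befa88755fc2598c289\tREADME.md\n"
    "160000 commit 3b18e512dba79e4c8300dd08aeb37f8e728b8dad\tlibs/common\n"
    "040000 tree 99f1a6d12cb4b6f19c8655fca46c3ecf317074e0\tsrc\n"
)

if __name__ == "__main__":
    for path, sha in pinned_submodules(SAMPLE).items():
        print(path, "->", sha)
```

Running the same function against the listing of an old parent commit gives the submodule commits that were integrated at that point, which is why historical configurations are easy to recreate.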

Google Repo:

Pros:

1. Tracking synchronized development effort is easier.

2. Gerrit integration (?)

3. A separate jenkins plugin.

Cons:

1. An external obscure mechanism

2. Requires an extra repository for management.

3. Non-deterministic relationship definition (each repo version can be defined as a floating head)

4. Hard to reconstruct old versions.

5. No support in bitbucket or gui tools.
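The "floating head" concern can be checked mechanically. A Repo manifest is an XML file in which each project takes its revision either from its own `revision` attribute or from the `<default>` element. The following is a minimal sketch, assuming a simplified manifest: it treats a full 40-character commit hash as pinned and everything else (branch names, tags, missing values) as floating. The function name, project names and URL are invented for illustration, and real manifests have more features (includes, per-remote revisions) that this ignores.

```python
import re
import xml.etree.ElementTree as ET

# A full 40-character hex string is treated as a pinned commit;
# anything else (branch name, tag, missing value) is treated as floating.
_SHA1 = re.compile(r"^[0-9a-f]{40}$")


def classify_revisions(manifest_xml):
    """Map each project name to 'pinned' or 'floating'."""
    root = ET.fromstring(manifest_xml)
    default = root.find("default")
    default_rev = default.get("revision", "") if default is not None else ""
    result = {}
    for project in root.findall("project"):
        # A project without its own revision inherits the manifest default.
        revision = project.get("revision", default_rev)
        result[project.get("name")] = "pinned" if _SHA1.match(revision) else "floating"
    return result


# Hypothetical manifest: one project follows a branch, one is pinned.
SAMPLE_MANIFEST = """<manifest>
  <remote name="origin" fetch="https://example.com/git/" />
  <default remote="origin" revision="master" />
  <project name="platform/app" path="app" />
  <project name="platform/lib" path="lib"
           revision="3b18e512dba79e4c8300dd08aeb37f8e728b8dad" />
</manifest>"""

if __name__ == "__main__":
    for name, state in classify_revisions(SAMPLE_MANIFEST).items():
        print(name, "is", state)
```

A manifest in which every project comes out as pinned describes a reproducible configuration. One with floating entries resolves to whatever the branches point at when it is synced.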

In general: whenever we want to integrate separate, decoupled components with distinct lifecycles, I recommend submodules over repo, but their introduction must come with proper education regarding the special workflow they require. In the long run it pays off, as integration points can be managed in a deterministic manner, and with knowledge comes confidence in the tool.

If you find your components are too tightly coupled, or you need continuous intensive development occurring concurrently in multiple repos, you should probably use git subtrees or just spare yourself the headache and drop everything into one big monorepo. (This depends, of course, on how big your codebase is.)

To read more about git subtree, see here.

The important thing to understand is that software integration is never totally painless and there is no perfect cure for the pain. Choose the solution that makes your life easier and assume the responsibility of learning the accompanying workflow. As they say: “it’s not the tool, it’s how you use it.”

I’ll be happy to hear what you think, as this is a controversial issue with many different opinions flying around on the internet.

Keep delivering!

This blog first appeared on http://otomato.link/git-submodules-vs-googles-repo-tool/

Comments
3 Comments
  • SacTiw says:

    Nice post!!!
    One question: You haven’t stated when to prefer using repo tool?

  • Philip Stefanov says:

    4. Hard to reconstruct old versions.
    repo tool can create snapshot manifest

    • EdurekaSupport says:

      Hey Philip, thanks for checking out our blog. In order to keep a backup of the project, you can create a repo manifest snapshot for the project. To explain this further:
      The purpose of Git is to manage a project, or a set of files, as they change over time. Git stores this information in a data structure called a repository.
      Repo is a repository management tool built on top of Git. Its first purpose is to download files from multiple git repositories into your local working directory.
      A repo manifest describes the structure of a repo client; that is the directories that are visible and where they should be obtained from the git
      Manifests are inherently version controlled, since they are kept within a Git repository. Updates to manifests are automatically obtained by clients during `repo sync`.
      Hope this helps. Cheers!
