GitHub Repositories

Overview:

  1. Explore GitHub Repositories

Prerequisites

Concepts | Importance | Notes
What is GitHub? | Necessary |
  • Time to learn: 15 minutes

What is a GitHub repository?

GitHub gives the following explanation of a repository:

A repository is usually used to organize a single project. Repositories can contain folders and files, images, videos, spreadsheets, and data sets -- anything your project needs. Often, repositories include a README file, a file with information about your project. GitHub makes it easy to add one at the same time you create your new repository. It also offers other common options such as a license file.

In short, a repository is a collection of files. Each GitHub repository has an owner, which could be an individual or an organization. Repositories can also be set to public or private, which determines who can see and interact with them. While a repository can simply store files, GitHub is designed with collaboration in mind. Three key collaborative tools in GitHub are:

  1. Issues: report a bug, plan improvements, or provide feedback to others working on the repository.
  2. Discussions: post ideas or other conversations that are not as specific or actionable as an Issue.
  3. Pull requests: propose a change to any of the files within a repository. We will go into the specifics later.
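To make the ideas of owner and visibility concrete, here is a minimal Python sketch that summarizes repository metadata of the kind returned by GitHub's REST API at `https://api.github.com/repos/{owner}/{name}`. The helper names `repo_api_url` and `summarize_repo` are hypothetical, and the sample payload is hand-written for illustration (it is not real API output); it assumes the response includes the `full_name`, `owner`, `private`, and `has_issues` fields.

```python
def repo_api_url(owner: str, name: str) -> str:
    """Build the REST API URL for a repository's metadata."""
    return f"https://api.github.com/repos/{owner}/{name}"


def summarize_repo(payload: dict) -> dict:
    """Pick out the owner and visibility details from a repository payload."""
    owner = payload.get("owner", {})
    return {
        "full_name": payload.get("full_name"),
        "owner": owner.get("login"),
        # Assumed values: "User" for an individual, "Organization" otherwise
        "owner_type": owner.get("type"),
        "visibility": "private" if payload.get("private") else "public",
        "issues_enabled": payload.get("has_issues"),
    }


# Hand-written sample payload (illustrative only, not fetched from GitHub)
sample_payload = {
    "full_name": "example-org/example-repo",
    "owner": {"login": "example-org", "type": "Organization"},
    "private": False,
    "has_issues": True,
}

summary = summarize_repo(sample_payload)
print(repo_api_url("example-org", "example-repo"))
print(summary)
```

To try this against a real repository, you could fetch the URL with any HTTP client and pass the decoded JSON to `summarize_repo`; the network call is left out here so the sketch runs offline.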

What are some examples of repositories?

All of the Python packages covered in this Foundations book (e.g., NumPy and Xarray) have associated GitHub repositories, as does Python itself:

NumPy GitHub

Xarray GitHub

Python GitHub

As you can see by the recent timestamps, these repositories are actively changing; this reflects the adaptability of the open-source software ecosystem surrounding Python.

Another example is this project’s Pythia Foundations repository, in which this tutorial is stored. It is owned by the Project Pythia organization. This organization also owns several other repositories that store the files needed to generate https://projectpythia.org/, among other things.

GitHub’s distributed repositories

Finally, we introduce a concept that is vital to understanding how to work with GitHub. It is the source of GitHub’s power, as well as much of its complexity. GitHub repositories are distributed; in the general case, there is more than one repository for any project. In fact, repositories can come and go at any time, created and deleted as need dictates. Creating new repositories from existing ones, synchronizing them, and managing them are the topics of later sections.

For now, it is only important to understand that a GitHub-managed project typically has one “official” repository, often called the “upstream” repository, which lives on GitHub.com. There may be any number of copies of the “official” repository, known as forks (or origins, if the fork is owned by you), that also reside on GitHub.com. Repos that are hosted on GitHub.com are referred to as remotes. In addition to the remotes, there may be one or more copies of the remotes on your desktop or laptop computer; these are referred to as locals. A conceptual diagram of the various repos is shown in the image below.

[Image: GitHub repositories]
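As a rough illustration of how the upstream and origin remotes appear from a local copy, the Python sketch below parses text in the format printed by `git remote -v`. The function name `parse_remotes` is hypothetical, the sample text is hand-written, and `your-username` is a placeholder for your own GitHub account. The names `origin` and `upstream` are conventions, not requirements, so a real local repository may use different names.

```python
def parse_remotes(remote_output: str) -> dict:
    """Map each remote name to its URL, given `git remote -v` style text."""
    remotes = {}
    for line in remote_output.splitlines():
        parts = line.split()
        # Each line looks like: <name> <url> (fetch) or <name> <url> (push)
        if len(parts) >= 2:
            name, url = parts[0], parts[1]
            remotes[name] = url
    return remotes


# Hand-written sample (illustrative): a fork named "origin" and the
# official repository named "upstream"
sample_output = """\
origin\thttps://github.com/your-username/pythia-foundations.git (fetch)
origin\thttps://github.com/your-username/pythia-foundations.git (push)
upstream\thttps://github.com/ProjectPythia/pythia-foundations.git (fetch)
upstream\thttps://github.com/ProjectPythia/pythia-foundations.git (push)
"""

remotes = parse_remotes(sample_output)
for name, url in remotes.items():
    print(f"{name}: {url}")
```

In this sample, the local repository knows about two remotes: the fork it was presumably cloned from and the official repository it tracks.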

Things to try:

  1. Browse the NumPy, Xarray, Python, and Pythia Foundations repos.
  2. Browse the organizations (e.g., PyData) that house these repos.
  3. Check out GitHub’s “Create a repo” tutorial to learn how to create your own repository!

Summary

  • GitHub repositories are collections of files.
  • Issues, Discussions, and Pull requests can be used to collaborate within a repository.
  • A GitHub Organization contains Repositories.

What’s Next?

We will further explore Issues and Discussions.

References

  1. GitHub’s quickstart guide