welcome to the regolith docs

Regolith¶

Regolith is a content management system for software & research groups. Regolith creates and manages a database of people, publications, projects, proposals & grants, courses, and more! From this database, regolith is then able to:

Generate a group website,
Generate CVs and publication lists for the group members,
Act as a grade book for your courses, and more!

Databases may be file-based (JSON and YAML) or MongoDB-based.

Regolith is developed as a regro project

Example Sites¶

The following are some sample websites that are powered by regolith, even though building websites is just one of the many facets of this tool:

Installation¶

1. Make your first database¶

The quickest way to get started is to set up your first minimal database using a handy cookie cutter. These instructions use the command line and assume you know how to use the terminal/cmd prompt, and that you know how to install software from either Pypi using pip or Anaconda/Miniconda using conda. The instructions use the linux shell commands which should work on Mac and linux computers, and on windows if you are running from at Git Bash terminal (recommended) but will be slightly different (but still work) on a windows cmd terminal.

First install the cookiecutter package if you don’t already have it

$ conda install cookiecutter

or

$ pip install cookiecutter

Next, clone the GitHub repository with the handy beginning database template

$ git clone git@github.com/sbillinge/regolithdb-cookiecutter.git

to get it using SSH or

$ git clone https://github.com/sbillinge/regolithdb-cookiecutter.git

to get it using the HTTPS protocol (just use whichever works for you)

Make a note of the path to the resulting regolithdb-cookiecutter directory, (e.g., /c/Users/me/scratch/regolithdb-cookiecutter but yours will be different). This is not your database, this is just the template and will be removed shortly.

Next, in a new terminal, or in the same terminal, move to the directory where you want to install your own permanent database. For example, we like to create a directory off our home directory called dbs where we will keep all of our databases (believe me, once you start using Regolith you will want to make more and more)

$ cd ~        # takes you to your home directory
$ mkdir dbs   # creates the dbs directory if it is not already there
$ cd dbs      # change dir to the new dbs directory

Now by running cookiecutter your starting db will be built from the template

$ cookiecutter <path>/<to>/regolithdb-cookiecutter

The program will ask a series of questions and you can type responses. Take your time and answer the questions as accurately as possible, because you are already entering data into your database!

Here is an example, and the questions look like

$ cookiecutter ~/scratch/regolithdb-cookiecutter/
database_name [my-cv-db]:
my_first_name [Albert]: Simon
my_last_name [Einstein]: Billinge
id_for_me [aeinstein]: sbillinge
my_group_name [Einstein Group]: Billinge Group

and so on. If you just hit enter the cookie-cutter will use the default values and you will build a database for Einstein, but type the values you want in answer to each question to make your own.

If you make a mistake just type CTL^C and do it again. You may have to remove the directory if it has already been created, for example, $ rm -r my-cv-db. Watch what you type here and be careful not to remove something you care about by mistake!

When you are happy with your database setup, type

$ ls

which lists all the files in your current directory, and you should see a directory called my-cv-db or whatever you chose to call you database. OK, let’s go and look at our database. change directory into it and do a directory listing,

$ cd my-cv-db
$ ls

or open a file browser such as windows explorer and check out what is in there.

You will see a direcotry called db and a file called regolithrc.json. All of the collections in your database are in the db directory. The regolithrc.json contains a bunch of information that Regolith needs to run and do its business.

You can use the Regolith program to do many things with, and to, your database. But you must always run Regolith from a directory that contains a regolithrc.json file. Since you are in a directory that contains one, you can run Regolith from here, but first you have to install it….

2. install Regolith¶

Regolith packages are available from conda-forge and PyPI:

conda:

$ conda install -c conda-forge regolith

pip:

$ pip install regolith

The Regolith code is migrating quickly these days. If you prefer you can install from the GitHub repository mode and get the latest changes. In that case, clone the GitHub repository, change directory to the top level directory in that cloned repository where the setup.py file is. From inside your virtual environment, type

$ pip install regolith -e

which installs regolith in this environment in develop mode. In this mode, the version of Regolith you run will change each time you update from the repo leading to instability so be careful.

To check that your installation is working, let’s have Regolith make us a todo list from our database.

Make sure you are in a directory that contains a regolithrc.json file (which you should be, i.e., the top level directory of ~/dbs/my-cv-db, if you have been following these instructions) and type

$ regolith helper l_todos

and you should see something like

loading .\./db\todos.yml...
dumping todos...
usage: regolith helper [-h] [-s STATI [STATI ...]] [--short [SHORT]]
                       [-t TAGS [TAGS ...]] [-a ASSIGNED_TO]
                       [-b [ASSIGNED_BY]] [--date DATE]
                       [-f FILTER [FILTER ...]]
                       helper_target

positional arguments:
  helper_target         helper target to run. Currently valid targets are:
                        ['a_expense', 'a_grppub_readlist', 'a_manurev',
                        'a_presentation', 'a_projectum', 'a_proposal',
                        'a_proprev', 'a_todo', 'f_prum', 'f_todo',
                        'l_abstract', 'l_contacts', 'l_grants', 'l_members',
                        'l_milestones', 'l_progress', 'l_projecta', 'l_todo',
                        'u_contact', 'u_institution', 'u_logurl',
                        'u_milestone', 'u_todo', 'v_meetings', 'lister',
                        'makeappointments']

optional arguments:
  -h, --help            show this help message and exit
  -s STATI [STATI ...], --stati STATI [STATI ...]
                        Filter tasks with specific status from ['started',
                        'finished', 'cancelled', 'paused']. Default is
                        started.
  --short [SHORT]       Filter tasks with estimated duration <= 30 mins, but
                        if a number is specified, the duration of the filtered
                        tasks will be less than that number of minutes.
  -t TAGS [TAGS ...], --tags TAGS [TAGS ...]
                        Filter tasks by tags. Items are returned if they
                        contain any of the tags listed
  -a ASSIGNED_TO, --assigned-to ASSIGNED_TO
                        Filter tasks that are assigned to this user id.
                        Default id is saved in user.json.
  -b [ASSIGNED_BY], --assigned-by [ASSIGNED_BY]
                        Filter tasks that are assigned to other members by
                        this user id. Default id is saved in user.json.
  --date DATE           Enter a date such that the helper can calculate how
                        many days are left from that date to the due-date.
                        Default is today.
  -f FILTER [FILTER ...], --filter FILTER [FILTER ...]
                        Search this collection by giving key element pairs.
                        '-f description paper' will return tasks with
                        description containing 'paper'
If the indices are far from being in numerical order, please renumber them by running regolith helper u_todo -r
(index) action (days to due date|importance|expected duration (mins)|tags|assigned by)
--------------------------------------------------------------------------------
started:
(1) Do all the things to set up todos in regolith (59|3|60.0||None)
------------------------------
Tasks (decreasing priority going up)
------------------------------
2021-07-29(59 days): (1) Do all the things to set up todos in regolith (59|3|60.0||None)
------------------------------
Deadlines:
------------------------------

After all the help messages is your list of Todo items. There is just one item, Do all the things to set up todos in regolith.

OK, your Regolith is working. If it isn’t working, consider joining, browsing and posting questions to the regolith-users Google group.

Quick(ish) Start¶

OK, let’s use Regolith to build our cv. Why not. again, in a terminal navigate to the top level directory of your database (where the regolithrc.json file is). and type:

$ regolith build cv

Regolith will take information from the various collections in your database and build them into your academic cv according to a pre-determined template. The current template builds the cv using latex. If your computer has latex installed and Regolith can find it, your cv should appear as a pdf document in the directory my-cv-db/_build (or more generally <path>/<to>/<database_name>/_build). All your built documents will appear in the _build directory.

If you don’t have latex installed we can have Regolith build the latex source file for the cv but without trying to render it to PDF,

$ regolith build cv --no-pdf

The latex source is a text file that you will find in the _build directory and you can open it in a text editor. Even without latex installed you can render it by opening a free account at http://overleaf.com starting a new blank project, uploading the <filename>.tex and <filename>.bib files to that project and hitting the recompile button.

Whether it builds on your computer or on overleaf, it should look something like

If, for some reason, the publication list doesn’t render correctly, try running the latex command again. If you are going to do much building with regolith it is definitely recommended to install latex on your computer, such as MikTeX for windows (latex comes installed with many linux systems and is easily installed on IOS).

What Next?¶

You have not spent too much time entering data into your database yet, but you can already build a number of different things. Try building your resume ($ regolith build resume), your publication list ($ regolith build publist) and your presentation list ($ regolith build preslist). You can even build a web-page for your group ($ regolith build html). It will look pretty ugly until we set it up properly with a nice template, but all the content will be dynamically built from the latest info in your databases.

To see everything you can build, type $ regolith build --help. To build some of those things you will need more collections that are not in the cookie cutter template, for example, proposals and grants collections, but you get the idea.

So next we might want to work on those collections and start adding more data. This can be done in a couple of ways. Probably the simplest to begin with is just use a text editor or IDE like PyCharm. The yml files are yaml files, which is a human readable way of storing information that can be read and understood by python. Please read about it here if you are not familiar with it. However, briefly to get you started, it encodes whether information is part of a list or a dictionary by indentation and semantics. For example,

key:
  - list item
  - another list item

would be read by python as {"key": ["list item", "another list item"]}, and a collection consisting of a list of dictionaries would look like this in yaml:

id:
  - name: Arthur
    quest: To find the Holy Grail
    favorite_color: Blue
  - name: Sir Lancelot
    quest: To find the Holy Grail
    favorite_color: Green, no pink

Long story short, you can update your database by directly editing the file, and this is quick and easy when you get comfortable with the YAML syntax, but can be frustrating as you are learning it.

If you want to check what fields are allowed or required in a collection look at the Collections part of the docs, Collections, which are built from the Regolith schema (or directly look at the schema in schema.py). You can automatically check if your database edits are valid by running $ regolith validate.

Getting Help from Helpers¶

Regolith builders build documents, but there are a small but growing number of tools that either will run popular queries on the database and print the results to the terminal (“lister helpers” with l_ prefixes – you already used one, it was the lister helper that builds your todo list).

There are also helpers that help you to add documents to your database collections. These are “adder helpers” with a_ prefixes. An important adder helper is a_todo helper that will add a todo item to your list.

“Updater helpers” will update existing entries in your databases and have prefix u_.

An important special kind of updater helper is a “finish helper” that will mark something as finished (and give it a finish date). So when you do that pesky 15th todo item on your todo list, run regolith helper f_todo -i 15 to finish it.

That is a lot of typing to finish a todo, so consider setting up an alias in the config file for your terminal program (my terminals run bash so I put the alias in the .bashrc file in my home directory ($ cd ~ to get there). With this alias I just type rhlt 15 to finish that 15th todo item.

To explore what helpers are there so you can play with them, type

$ regolith helper

and hit return. It will return a list of available helpers, e.g.,

$ regolith helper
    usage: regolith helper [-h] helper_target
    regolith helper: error: the following arguments are required: helper_target
    usage: regolith helper [-h] helper_target

    positional arguments:
      helper_target  helper target to run. Currently valid targets are:
                     ['a_expense', 'a_grppub_readlist', 'a_manurev',
                     'a_presentation', 'a_projectum', 'a_proposal', 'a_proprev',
                     'a_todo', 'f_prum', 'f_todo', 'l_abstract', 'l_contacts',
                     'l_grants', 'l_members', 'l_milestones', 'l_progress',
                     'l_projecta', 'l_todo', 'u_contact', 'u_institution',
                     'u_logurl', 'u_milestone', 'u_todo', 'v_meetings', 'lister',
                     'makeappointments']

then if you want to know how to use any of the helpers type

$ regolith helper <helper target>

and hit return, e.g.,

$ regolith helper l_contacts
usage: regolith helper [-h] [-v] [-n NAME] [-i INST] [-d DATE] [-r RANGE]
                       [-o NOTES] [-f FILTER [FILTER ...]]
                       [-k KEYS [KEYS ...]]
                       helper_target run
regolith helper: error: the following arguments are required: run
usage: regolith helper [-h] [-v] [-n NAME] [-i INST] [-d DATE] [-r RANGE]
                       [-o NOTES] [-f FILTER [FILTER ...]]
                       [-k KEYS [KEYS ...]]
                       helper_target run

positional arguments:
  helper_target         helper target to run. Currently valid targets are:
                        ['a_expense', 'a_grppub_readlist', 'a_manurev',
                        'a_presentation', 'a_projectum', 'a_proposal',
                        'a_proprev', 'a_todo', 'f_prum', 'f_todo',
                        'l_abstract', 'l_contacts', 'l_grants', 'l_members',
                        'l_milestones', 'l_progress', 'l_projecta', 'l_todo',
                        'u_contact', 'u_institution', 'u_logurl',
                        'u_milestone', 'u_todo', 'v_meetings', 'lister',
                        'makeappointments']
  run                   run the lister. To see allowed optional arguments,
                        type "regolith helper l_contacts".

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         Increases the verbosity of the output.
  -n NAME, --name NAME  name or name fragment (single argument only) to use to
                        find contacts.
  -i INST, --inst INST  institution or an institution fragment (single
                        argument only) to use to find contacts.
  -d DATE, --date DATE  approximate date in ISO format (YYYY-MM-DD)
                        corresponding to when the contact was entered in the
                        database. Comes with a default range of 4 months
                        centered around the date; change range using --range
                        argument.
  -r RANGE, --range RANGE
                        range (in months) centered around date d specified by
                        --date, i.e. (d +/- r/2).
  -o NOTES, --notes NOTES
                        fragment (single argument only) to be found in the
                        notes section of a contact.
  -f FILTER [FILTER ...], --filter FILTER [FILTER ...]
                        Search this collection by giving key element pairs.
  -k KEYS [KEYS ...], --keys KEYS [KEYS ...]
                        Specify what keys to return values from when running
                        --filter. If no argument is given the default is just
                        the id.

you then would rerun the command giving all required, and any optional, command line arguments. e.g.,

$ regolith helper l_contacts run --name frank -v

will return all contacts in the contacts collection where frank appears anywhere in the name, such as Frankie Valli, Baron von Frankenstein and Anne Frank (if they are in your contacts). The -v command stands for verbose which means more information is returned than if you don’t type -v. You can try it now:

$ regolith helper l_contacts run -n auth -v

Setting up Gitlab repository information for API requests¶

Some helpers have features that make API requests to GitLab (or GitHub). For example, the a_presentation helper has a functionality that creates a repository in a designated GitLab group. In order to use these features, the target repository information needs to be defined in your configuration files (regolithrc.json, user.json).

Setting up Destination Repo Information¶

The designated repository information should be defined in regolithrc.json in the directory in which you are running the helper. Create a collection of repository targets designated as repos (see below for an example). according to the following pattern. We will use as an example an entry that will allow a_presentation to successfully create a repository in a group called talks on a GitLab instance.

a_presentation looks for a rep with the entry _id with value "talk_repo".

“repos”:[

{“_id”: “talk_repo”, # a_presentation looks for the entry with this ID

“params”: {“namespace_id”: “35”, # These params are handed to the API post request.
“initialize_with_readme”: “True” # “name” is also needed but a_presentation generates that automatially },

“url”: “https://gitlab.example.com”, # The URL of the main GitLab/GitHub instance “api_route”: “/api/v4/projects/”, # This is the route to the REST-API. The value

# shown here is correct for GitLab at the time of writing

“namespace_name”: “talks” # the name of group/org which corresponds to the namespace_id above.

}, {

“_id”: “another_example_repo”, […]

}

]

The namespace ID is the repository’s group ID which can be found on the target repository’s main page. The url and api_route should be in the format above, including the dashes.

For more information on the required request info, or to see a list of additional attributes that can also be defined in the request (e.g. initialize_with_readme, description, etc.), see GitHub or GitLab API documentation, e.g., for GitLab the GitLab docs. (Note that additional attributes can be defined under params, where needed.)

Setting up your Private Access Token¶

Your personal/private API request token should be defined in user.json, which can be found in your ~/.config directory. Similarly, define a distinct ID for each private token. For example, to create a repo in GitLab, you should define your authentication token with the ID, "gitlab_private_token":

[
    {
        "_id": "gitlab_private_token",
        "token": "<private-token>"
    },
    {
        "_id": "example_token",
        [...]
    }
]

To learn more about creating a personal access token, refer to the Gitlab docs. Note that your personal access token should have the api scope enabled in order to make a successful request.

To change the target directory, you can change the parameters (or IDs) in the function create_repo(destination_id, token_info_id, rc) in a_presentationhelper.py to the IDs of your desired repo info and corresponding token.

Setting up GitHub repository information for API requests¶

Using the filter capabilities in the helpers¶

Most helpers have a filter field. This allows you to filter the relevant collection before running the helper functionality.

The logic of filter is the following. A document will be valid if the value of key contains value for all keys and values using AND logic.

As an example, if we consider filtering in the l_milestones helper we will get the following behavior. l_milestones operates on the projecta collection, so the filter will be applied to this collection. If you specify --filter lead voe it will return all documents where voe appears in the value for the lead field (e.g., if there is someone with an id of carvoe and another person with and id of voedemort the filter will return all the documents where either of these people are lead).

If you then select current and verbose the helper will do the normal thing of returning in verbose form the current milestones, but it will do it on the filtered collection.

A slight gotcha is that since filter uses “in” in its logic, if the type of the key value is a string it will find all strings that contain that fragment, as above, but if the type of the key value is a list it will return documents where the specified value is in the list, so --filter group_member voe will return all the documents where voe is listed as a group member, but it won’t return any documents where carvoe or voedemort are listed as a group member.

The filter uses AND logic and operates such that --filter lead voe grants mygrant21 status finished will return all prums that are led by carvoe or voedemort that acknowledge the mygrant21 grant and are finished. Actually, similar behavior can be obtained also by selecting --lead voe --stati finished --filter grants mygrant21

unfortunately the filter function does not currently recurse, so it will only operate on top level key-value pairs where the type of the value is a string or a list or a tuple.

Backing up and protecting your work¶

Now you have started saving your precious life’s work in your regolith database you better start protecting it and backing it up. One low overhead approach for this is simply to set up your database directory to be backed up remotely as a Google drive or Dropbox synced directory, for example.

However, Regolith is set up to work with git and GitHub and this is a powerful option if you are comfortable with it. This gets more useful when you want to start sharing databases with group members, for example, using GitHub access rights. It is also possible to make sure people’s edits to the database won’t break things by setting up continuous integration (CI) that runs some validation and builders and makes sure they don’t crash before the edits are accepted. This is much more advanced usage which you should save for later.

To get started with the GitHub option, the next thing to do is to turn your database directory on your filesystem into a git repository and link it to a repository on your personal space on GitHub (you will need a GitHub account). You can make that repo private so the world cannot see your todo list, or public so that the world can see the web-page you build from it. We will get back to this later, but Regolith will build collections from across databases, so you can have parts of your people collection private and other parts public. Depending which regolithrc.json file you use to build with, you can pull from the public, or private, or both parts. Again, this is a peep to the future. For now, let’s assume you just want to back up and keep versions of you private database. You will make a repository on your personal GitHub account and synchronize your local database with this repo. Instructions for doing this are here

Once you get everything set up you will want to periodically (meaning frequently) type

$ git commit -a -m "my commit message"
$ git push

This will add, commit and push all files that git is tracking that have been updated locally. If you add a new file to the repository and want it in the GitHub backup, you will have to explicitly add it before committing,

$ git add my_new_file.py
$ git commit -m "an even more informative commit message"

$ git commit commits (i.e., checks in) to the git database (yes, git, like you now, is using a database backend) everything that has been added, or staged, for commit. $ git commit -a automatically adds all files that git is tracking (have been previously committed in the past) that have been edited and then commits them.

They are now safely captured in the git database and you can retrieve them later if you accidentally delete your personal database or mess it up some other way. But this version of the git database is still stored on your local computer, so if you spill coffee on your computer, you may lose everything. $ git push pushes all these updates to a remote computer on the internet at the GitHub headquarters. Git and GitHub form a wonderful but complicated infrastructure, it is well worth getting to know how to use them well. For now, we have used it to secure your precious database. Remember to make frequent pushes.

OK, you are started with your Regolith database. Go play. Regolith can do many more complicated things to help with administering your research group, or whatever you are working on. We will continue to add tutorials below explaining some of these things, so check back from time to time. And remember join and to ask questions at the regolith-users Google group. They will get answered.

Tutorials¶

Tutorials

Run Control¶

Database Collections¶

Collections are the regolith (and mongo) abstraction for tables. Entries (or rows) in a collection must follow the schema defined below. In general, the following notions hold:

An entry is a dictionary with string keys.
Each entry must contain a unique identifier. This is called "_id" in JSON and Mongo, and is simply the top-level key in YAML.
A collection is a list of entries that follow the same schema.

Not all regolith actions will use every collection type. It is common for regolith projects to just use some of the collections below. For example, building a group website will use different collections than managing students and grades in a course! With these points in mind, feel free to dive into the databases below!

Regolith API¶

For those who want to dive deeper into the library itself.

Regolith API

Regolith Commands¶

Shell commmands for regolith

Regolith Commands
- add
- app
- build
- classlist
- deploy
- email
- fs-to-mongo
- grade
- helper
- ingest
- json-to-yaml
- mongo-to-fs
- rc
- store
- validate
- yaml-to-json

Database Maintenance¶

Database Maintenance