Regolith API

For those of you who want to know the gritty details.

Subpackages

Submodules

regolith.app module

Flask app for looking at information in regolith.

regolith.app.collection_page(dbname, collname)[source]
regolith.app.root()[source]
regolith.app.shutdown()[source]
regolith.app.shutdown_server()[source]

regolith.broker module

API for accessing the metadata and file storage

class regolith.broker.Broker(rc=RunControl(builddir='_build', database=None, force=False, mongodbpath=<property object>, user_config='/home/runner/.config/regolith/user.json'))[source]

Bases: object

Interface to the database and file storage systems

Examples

>>> # Load the db
>>> db = Broker.from_rc()
>>> # Get a document from the broker
>>> ergs = db['group']['ergs']
>>> # Store a file
>>> db.add_file(ergs, 'myfile', '/path/to/file/hello.txt')
>>> # Get a file from the store
>>> path = db.get_file_path(ergs, 'myfile')
add_file(document, name, filepath)[source]

Add a file to a document in a collection.

Parameters:
  • document (dict) – The document to add the file to

  • name (str) – Name of the reference to the file

  • filepath (str) – Location of the file on local disk

classmethod from_rc(rc_file='regolithrc.json')[source]

Return a Broker instance

get_file_path(document, name)[source]

Get a file from the file storage associated with the document and name

Parameters:
  • document (dict) – The document which stores the reference to the file

  • name (str) – The name of the file stored (note that this can be different from the filename itself)

Returns:

path – The file path, or None if the file is not in the storage

Return type:

str or None

regolith.broker.load_db(rc_file='regolithrc.json')[source]

Create a Broker instance from an rc file

regolith.builder module

Generic builder.

regolith.builder.builder(btype, rc)[source]

Returns builder of the appropriate type.

regolith.chained_db module

Base class for chaining DBs

ChainDBSingleton Copyright 2015-2016, the xonsh developers

class regolith.chained_db.ChainDB(*maps)[source]

Bases: ChainMap

A ChainMap whose __getitem__ returns either a ChainDB or the result
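
A minimal sketch of the idea, not regolith's implementation: the hypothetical class NestedChain below subclasses collections.ChainMap so that a key whose values are all mappings comes back as another chain, while any other key returns the ordinary ChainMap result (the value from the first map that has it). The names NestedChain, public and private are assumptions made for illustration.

```python
from collections import ChainMap
from collections.abc import Mapping


class NestedChain(ChainMap):
    """Hypothetical stand-in for the idea behind ChainDB."""

    def __getitem__(self, key):
        # Collect the value for `key` from every underlying map that has it.
        hits = [m[key] for m in self.maps if key in m]
        if not hits:
            raise KeyError(key)
        # If every hit is itself a mapping, chain them so lookups can continue.
        if all(isinstance(h, Mapping) for h in hits):
            return NestedChain(*hits)
        # Otherwise behave like a plain ChainMap: the first map wins.
        return hits[0]


public = {"people": {"ada": {"name": "Ada"}}}
private = {"people": {"bob": {"name": "Bob"}}}
db = NestedChain(public, private)

# Both databases contribute to the chained "people" collection.
people = db["people"]
print(sorted(people))
print(people["bob"]["name"])
```

Returning a chain for nested mappings is what lets several databases look like one document tree.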

class regolith.chained_db.ChainDBSingleton[source]

Bases: object

Singleton for representing when no default value is given.

regolith.classlist module

Classlist implementation

class regolith.classlist.UscHtmlParser[source]

Bases: HTMLParser

Class for parsing data from USC-formatted HTML.

handle_data(data)[source]
handle_endtag(tag)[source]
handle_starttag(tag, attrs)[source]
should_handle()[source]
regolith.classlist.add_students_to_course(students, rc)[source]

Add students to the course listed

regolith.classlist.add_students_to_db(students, rc)[source]

Add new students to the student directory.

regolith.classlist.load_csv(filename, format='columbia')[source]

Returns students as a list of dicts from a csv from Columbia Courseworks

regolith.classlist.load_json(filename)[source]

Returns students as a list of dicts from JSON file.

regolith.classlist.load_usc(filename)[source]

Returns students as a list of dicts from an HTML file obtained from the University of South Carolina.

regolith.classlist.register(rc)[source]

Entry point for registering classes.

regolith.client_manager module

class regolith.client_manager.ClientManager(databases, rc)[source]

Bases: object

Client wrapper that allows for multiple backend clients to be used in parallel with one chained DB

all_documents(collname, copy=True)[source]

Returns an iterable over all documents in a collection.

close()[source]

Closes the database connections.

collection_names(dbname, include_system_collections=True)[source]

Returns the collection names for a database.

delete_one(dbname, collname, doc)[source]

Removes a single document from a collection

dump_database(db)[source]
export_database(db: dict)[source]
find_one(dbname, collname, filter)[source]

Finds the first document matching filter.

import_database(db: dict)[source]
insert_many(dbname, collname, docs)[source]

Inserts many documents into a database/collection.

insert_one(dbname, collname, doc)[source]

Inserts one document to a database/collection.

keys()[source]
load_database(db)[source]
open()[source]

Opens the database connections

update_one(dbname, collname, filter, update, **kwargs)[source]

Updates one document.

regolith.commands module

Implementation of commands for command line.

regolith.commands.add_cmd(rc)[source]

Adds documents to a collection in a database.

regolith.commands.app(rc)[source]

Runs flask app

regolith.commands.build(rc)[source]

Builds all of the build targets

regolith.commands.build_db_check(rc)[source]

Checks which DBs a builder needs

regolith.commands.classlist(rc)[source]

Sets values for the class list.

regolith.commands.deploy(rc)[source]

Deploys all of the deployment targets.

regolith.commands.fs_to_mongo(rc: RunControl) None[source]

Convert database collection from filesystem to mongo db.

Parameters:

rc (RunControl) – The RunControl. The mongo client will be created according to ‘mongodbpath’ in it. The databases will be loaded according to the ‘databases’ in it.

regolith.commands.grade(rc)[source]

Runs flask grading app

regolith.commands.helper(rc)[source]

Runs the helper targets

regolith.commands.helper_db_check(rc)[source]

Checks which DBs a helper needs

regolith.commands.ingest(rc)[source]

Ingests a foreign resource into a database.

regolith.commands.json_to_yaml(rc)[source]

Converts JSON to YAML

regolith.commands.mongo_to_fs(rc: RunControl) None[source]

Convert database collection from mongo db to filesystem.

Parameters:

rc (RunControl) – The RunControl. The mongo client will be created according to ‘mongodbpath’ in it. The databases will be loaded according to the ‘databases’ in it.

regolith.commands.validate(rc)[source]

Validate the combined database against the schemas

regolith.commands.yaml_to_json(rc)[source]

Converts YAML to JSON

regolith.dates module

Date based tools

regolith.dates.convert_doc_iso_to_date(doc)[source]
regolith.dates.date_to_float(y, m, d=0)[source]

Converts years / months / days to a float, e.g. 2015.0818 is August 18th 2015.
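
The docstring's example (2015.0818 for August 18th 2015) suggests an encoding of year + month/100 + day/10000. The sketch below works under that assumption and accepts integers only; the name date_to_float_sketch is hypothetical and the real function may also accept month names.

```python
def date_to_float_sketch(y, m, d=0):
    """Encode a date as year + month/100 + day/10000 (integer inputs assumed)."""
    return int(y) + int(m) / 100.0 + int(d) / 10000.0


# Because later dates map to larger floats, the value works as a sort key.
dates = [(2015, 8, 18), (2014, 12, 1), (2015, 1, 31)]
ordered = sorted(dates, key=lambda ymd: date_to_float_sketch(*ymd))
print(ordered)
```

This kind of encoding keeps dates sortable without constructing datetime objects, at the cost of not validating the month or day.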

regolith.dates.day_to_str_int(d)[source]

Converts a day to an int form, str type, with a leading zero

regolith.dates.find_gaps_overlaps(dateslist, overlaps_ok=False)[source]

Find whether there is a gap or an overlap in a list of date-ranges

Parameters:
  • dateslist (list of tuples of datetime.date objects) – The list of date-ranges.

  • overlaps_ok (bool) – If True, overlaps are allowed: the function returns False if there are gaps, but True if there are overlaps and no gaps

Return type:

True if there are no gaps or overlaps else False
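
A minimal sketch of such a check, assuming that two ranges are contiguous when the later one begins exactly one day after the earlier one ends, and that only neighbouring ranges (after sorting by begin date) are compared. The name find_gaps_overlaps_sketch is hypothetical; this is not regolith's code.

```python
import datetime as dt


def find_gaps_overlaps_sketch(dateslist, overlaps_ok=False):
    """Return True if neighbouring ranges leave no gap (and no overlap unless allowed)."""
    ordered = sorted(dateslist, key=lambda rng: rng[0])
    for (_, prev_end), (next_begin, _) in zip(ordered, ordered[1:]):
        days_between = (next_begin - prev_end).days
        if days_between > 1:
            return False  # at least one uncovered day: a gap
        if days_between < 1 and not overlaps_ok:
            return False  # the next range starts on or before the previous end
    return True


jan = (dt.date(2020, 1, 1), dt.date(2020, 1, 31))
feb = (dt.date(2020, 2, 1), dt.date(2020, 2, 29))
late_feb = (dt.date(2020, 2, 10), dt.date(2020, 2, 29))
mid_jan_on = (dt.date(2020, 1, 15), dt.date(2020, 2, 29))

print(find_gaps_overlaps_sketch([jan, feb]))
print(find_gaps_overlaps_sketch([jan, late_feb]))
print(find_gaps_overlaps_sketch([jan, mid_jan_on], overlaps_ok=True))
```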

regolith.dates.get_dates(thing, date_field_prefix=None)[source]

given a dict like thing, return the date items

Parameters:
  • thing (dict) – the dict that contains the dates

  • date_field_prefix (string (optional)) – the prefix to look for before the date parameter. For example given “submission” the function will search for submission_day, submission_year, etc.

Returns:

A dict containing datetime.date objects for valid begin_date, end_date and date, and prefix_date if a prefix string was passed. Missing and empty dates and date items that contain the string ‘tbd’ are not returned. If no valid date items are found, an empty dict is returned.

Description

If “begin_date”, “end_date” or “date” values are found, and these are in an ISO format string, they will be converted to datetime.date objects and returned in the dictionary under keys of the same name. A specified date will override any date built from year/month/day data.

If they are not found the function will look for begin_year, end_year and year.

If “year”, “month” and “day” are found the function will return these in the “date” field and begin_date and end_date will match the “date” field. If only a “year” is found, then the date attribute will be none but the begin and end dates will be the first and last day of that respective year.

If year is found but no month or day are found the function will return begin_date and end_date with the beginning and the end of the given year/month. The returned date will be None.

If end_year is found but the end month and end day are missing, they are set to 12 and 31, respectively.

If begin_year is found but the begin month and begin day are missing, they are set to 1 and 1, respectively.

If a date field prefix is passed in this function will search for prefix_year as well as prefix_month, prefix_day, and prefix_date. For example, if the prefix string passed in is “submitted” then this function will look for submitted_date instead of just date.

Examples

>>> get_dates({'submission_day': 10, 'submission_year': 2020, 'submission_month': 'Feb'}, "submission")

This would return a dictionary consisting of the begin date, end date, and date for the given input. Instead of searching for “day” in the thing, it would search for “submission_day” since a prefix was given. The following dictionary is returned (note that a submission_date and a date key are in the dictionary):

{'begin_date': datetime.date(2020, 2, 10),
 'end_date': datetime.date(2020, 2, 10),
 'submission_date': datetime.date(2020, 2, 10),
 'date': datetime.date(2020, 2, 10)}

>>> get_dates({'begin_year': 2019, 'end_year': 2020, 'end_month': 'Feb'})

This will return a dictionary consisting of the begin date and end date for the given input. Because no prefix string was passed in, the function will search for “date” in the input instead of prefix_date. The following dictionary is returned:

{'begin_date': datetime.date(2019, 1, 1),
 'end_date': datetime.date(2020, 2, 29)}
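
The default-filling rules for begin and end dates described above can be sketched as follows. This is a simplified illustration, not the library's code: the helper names month_number and begin_end_dates are hypothetical, only three-letter English month abbreviations or integers are understood, and prefixes, ISO strings and ‘tbd’ handling are left out.

```python
import calendar
import datetime as dt

MONTHS = "jan feb mar apr may jun jul aug sep oct nov dec".split()


def month_number(m):
    """Accept an int or an English month name/abbreviation (assumption)."""
    if isinstance(m, int):
        return m
    return MONTHS.index(str(m).strip().lower()[:3]) + 1


def begin_end_dates(thing):
    """Fill missing begin/end month and day with the defaults described above."""
    out = {}
    if "begin_year" in thing:
        month = month_number(thing.get("begin_month", 1))
        out["begin_date"] = dt.date(thing["begin_year"], month, thing.get("begin_day", 1))
    if "end_year" in thing:
        year = thing["end_year"]
        month = month_number(thing.get("end_month", 12))
        # A missing end day becomes the last day of the end month.
        day = thing.get("end_day", calendar.monthrange(year, month)[1])
        out["end_date"] = dt.date(year, month, day)
    return out


result = begin_end_dates({"begin_year": 2019, "end_year": 2020, "end_month": "Feb"})
print(result)
```

With the second example's input, the sketch reproduces the dictionary shown above, including the leap-day end date.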

regolith.dates.get_due_date(thing)[source]
Parameters:

thing (dict) – gets the field named ‘due_date’ from the thing and ensures it is a datetime.date object

Return type:

The due date as a datetime.date object

regolith.dates.has_finished(thing, now=None)[source]

given a thing with dates, returns true if the thing has finished

Parameters:
  • thing (dict) – the thing that we want to know whether or not it has finished

  • now (datetime.date object) – a date for now. If it is None it uses the current date. Default is None

Return type:

True if the thing has finished and false otherwise

regolith.dates.has_started(thing, now=None)[source]

given a thing with dates, returns true if the thing has started

Parameters:
  • thing (dict) – the thing that we want to know whether or not it has started

  • now (datetime.date object) – a date for now. If it is None it uses the current date. Default is None

Return type:

True if the thing has started and false otherwise

regolith.dates.is_after(thing, now=None)[source]

given a thing with a date, returns true if the thing is after the input date

Parameters:
  • thing (dict) – the thing that we want to know whether or not is after a date

  • now (datetime.date object) – a date for now. If it is None it uses the current date. Default is None

Return type:

True if the thing is after the date

regolith.dates.is_before(thing, now=None)[source]

given a thing with a date, returns true if the thing is before the input date

Parameters:
  • thing (dict) – the thing that we want to know whether or not is before a date

  • now (datetime.date object) – a date for now. If it is None it uses the current date. Default is None

Return type:

True if the thing is before the date

regolith.dates.is_between(thing, start=None, end=None)[source]

given a thing with a date, returns true if the thing is between the start and end date

Parameters:
  • thing (dict) – the thing that we want to know whether or not is between two dates

  • start (datetime.date object) – a date for the start. If it is None it uses the current date. Default is None

  • end (datetime.date object) – a date for the end. If it is None it uses the current date. Default is None

Return type:

True if the thing is between the start and end

regolith.dates.is_current(thing, now=None)[source]

given a thing with dates, returns true if the thing is current. Looks for begin_ and end_ daty things (date, year, month, day), or just the daty things themselves, e.g., begin_date, end_month, month, and so on.

Parameters:
  • thing (dict) – the thing that we want to know whether or not it is current

  • now (datetime.date object) – a date for now. If it is None it uses the current date. Default is None

Return type:

True if the thing is current and false otherwise
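
A minimal sketch of the comparison behind such a check, assuming the thing already carries begin_date and end_date as datetime.date objects and that both endpoints count as current. The real function also resolves year/month/day fields; how it treats a missing end date is not stated here, so the sketch simply returns False in that case. The name is_current_sketch is hypothetical.

```python
import datetime as dt


def is_current_sketch(thing, now=None):
    """True if begin_date <= now <= end_date (both keys assumed to hold dates)."""
    if now is None:
        now = dt.date.today()
    begin = thing.get("begin_date")
    end = thing.get("end_date")
    if begin is None or end is None:
        return False  # assumption: without both dates we cannot say it is current
    return begin <= now <= end


grant = {"begin_date": dt.date(2020, 1, 1), "end_date": dt.date(2020, 12, 31)}
print(is_current_sketch(grant, now=dt.date(2020, 6, 15)))
print(is_current_sketch(grant, now=dt.date(2021, 1, 1)))
```

Passing now explicitly, as the signature allows, keeps the check reproducible in tests.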

regolith.dates.last_day(year, month)[source]

Returns the last day of the month for the month given

Parameters:
  • year (integer) – the year that the month is in

  • month (integer or string) – the month. If a string, it should be resolvable using regolith month_to_int

Return type:

The last day of that month

regolith.dates.month_to_int(m)[source]

Converts a month to an integer.

regolith.dates.month_to_str_int(m)[source]

Converts a month to an int form, str type, with a leading zero

regolith.deploy module

Helps deploy what we have built.

regolith.deploy.deploy(rc, name, url, src='html', dst=None)[source]

Deploys a target

regolith.deploy.deploy_git(rc, name, url, src='html', dst=None)[source]

Loads a git database

regolith.deploy.deploy_hg(rc, name, url, src='html', dst=None)[source]

Loads an hg database

regolith.deploy.ensure_deploy_dir(rc)[source]

Ensure deployment dir is on rc and physically exists.

regolith.emailer module

Emails people via SMTP

regolith.emailer.attach_pdf(filename)[source]
regolith.emailer.attach_txt(filename)[source]
regolith.emailer.class_email(rc)[source]

Sends an email to all students in the active classes.

regolith.emailer.emailer(rc)[source]

Constructs and sends out emails

regolith.emailer.grade_email(rc)[source]

Sends grade report emails to students.

regolith.emailer.list_email(rc)[source]

List class emails

regolith.emailer.make_message(rc, to, subject='', body='', attachments=())[source]

Creates an email following the appropriate format. The body kwarg may be a string of restructured text. Attachments is a list of filenames to attach.

regolith.emailer.test_email(rc)[source]

Sends a test email from regolith.

regolith.fsclient module

Contains a client database backed by the file system.

class regolith.fsclient.DelayedKeyboardInterrupt[source]

Bases: object

handler(sig, frame)[source]
class regolith.fsclient.FileSystemClient(rc)[source]

Bases: object

A client database backed by the file system.

all_documents(collname, copy=True)[source]

Returns an iterable over all documents in a collection.

close()[source]
collection_names(dbname, include_system_collections=True)[source]

Returns the collection names for a database.

delete_one(dbname, collname, doc)[source]

Removes a single document from a collection

dump_database(db)[source]

Dumps a database back to the filesystem.

dump_json(docs, collname, dbpath)[source]

Dumps json docs and returns filename

dump_yaml(docs, collname, dbpath)[source]

Dumps yaml docs and returns filename

find_one(dbname, collname, filter)[source]

Finds the first document matching filter.

insert_many(dbname, collname, docs)[source]

Inserts many documents into a database/collection.

insert_one(dbname, collname, doc)[source]

Inserts one document to a database/collection.

is_alive()[source]
keys()[source]
load_database(db)[source]

Loads a database.

load_json(db, dbpath)[source]

Loads the JSON part of a database.

load_yaml(db, dbpath)[source]

Loads the YAML part of a database.

open()[source]
update_one(dbname, collname, filter, update, **kwargs)[source]

Updates one document.

regolith.fsclient.date_encoder(obj)[source]
regolith.fsclient.dump_json(filename, docs, date_handler=None)[source]

Dumps a dict of documents into a file.

regolith.fsclient.dump_yaml(filename, docs, inst=None)[source]

Dumps a dict of documents into a file.

regolith.fsclient.json_to_yaml(inp, out)[source]

Converts a JSON file to a YAML one.

regolith.fsclient.load_json(filename)[source]

Loads a JSON file and returns a dict of its documents.

regolith.fsclient.load_yaml(filename, return_inst=False, loader=None)[source]

Loads a YAML file and returns a dict of its documents.

regolith.fsclient.yaml_to_json(inp, out, loader=None)[source]

Converts a YAML file to a JSON one.

regolith.grader module

Flask app for grading regolith.

regolith.grader.form_to_grade_assignment(form)[source]

Creates a grade dict from an assignment form.

regolith.grader.form_to_grade_row(form)[source]

Creates a grade dict from a row form.

regolith.grader.insert_grade(grade, form, rc)[source]

Inserts a grade into the database.

regolith.grader.root()[source]
regolith.grader.shutdown()[source]
regolith.grader.shutdown_server()[source]

regolith.helper module

Generic helper.

regolith.helper.helpr(btype, rc)[source]

Returns helper of the appropriate type.

regolith.helper_connect_main module

The main CLI for regolith

regolith.helper_connect_main.create_parser(inputs)[source]
regolith.helper_connect_main.create_top_level_parser()[source]
regolith.helper_connect_main.main(args=None)[source]

regolith.helper_gui_main module

The main CLI for regolith

regolith.helper_gui_main.main(args=None)[source]

regolith.interact module

regolith.main module

The main CLI for regolith

regolith.main.create_parser()[source]
regolith.main.main(args=None)[source]

regolith.mongoclient module

Client interface for MongoDB. Maintained such that only pymongo is required when using helpers/builders; additional command-line tools must be installed for maintenance tasks such as fs-to-mongo.

class regolith.mongoclient.MongoClient(rc)[source]

Bases: object

A client backed by MongoDB.

The mongodb server will be automatically opened when the client is initiated.

Variables:
  • rc (RunControl) – The RunControl. It may include the ‘mongohost’ attribute to initiate the client.

  • client (MongoClient) – The mongo client. It is initiated from the ‘mongohost’ attribute if it exists in rc. Otherwise, it will be initiated from ‘localhost’.

  • proc (Popen) – The Popen of ‘mongod --dbpath <mongodbpath>’. The ‘mongodbpath’ is from rc.

all_documents(collname, copy=True)[source]

Returns an iterable over all documents in a collection.

close()[source]

Closes the database connection.

collection_names(dbname, include_system_collections=True)[source]

Returns the collection names for the database name.

delete_one(dbname, collname, doc)[source]

Removes a single document from a collection

dump_database(db)[source]

Dumps a database dict via mongoexport.

export_database(db: dict)[source]

Exports the database from mongo backend to the filesystem.

Parameters:

db (dict) – The dictionary of database information, such as ‘name’.

find_one(dbname, collname, filter)[source]

Finds the first document matching filter.

import_database(db: dict)[source]

Import the database from filesystem to the mongo backend.

Parameters:

db (dict) – The dictionary of database information, such as ‘name’.

insert_many(dbname, collname, docs)[source]

Inserts many documents into a database/collection.

insert_one(dbname, collname, doc)[source]

Inserts one document to a database/collection.

is_alive()[source]

Returns whether or not the client is alive and available to send/receive data.

keys()[source]
load_database(db: dict)[source]

Load the database information from mongo database.

It populates the ‘dbs’ attribute with a dictionary like {database: {collection: docs_dict}}.

Parameters:

db (dict) – The dictionary of database information, such as ‘name’.

open()[source]

Opens the database client

update_one(dbname, collname, filter, update, **kwargs)[source]

Updates one document.

regolith.mongoclient.bson_cleanup(doc: dict)[source]

This method should be used prior to updating or adding a document to a collection in mongo. Specifically, this replaces all periods in keys and _id value with a blank, and changes datetime.date to an iso string. It does so recursively for nested dictionaries.

Parameters:

doc

Return type:

doc
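
A minimal sketch of the cleanup described above, under two assumptions that the docstring leaves open: “a blank” is read as the empty string, and a new dict is returned rather than the input being modified in place. The name bson_cleanup_sketch is hypothetical.

```python
import datetime as dt


def bson_cleanup_sketch(doc):
    """Recursively strip periods from keys and the _id value, and stringify dates."""
    cleaned = {}
    for key, value in doc.items():
        new_key = key.replace(".", "")  # assumption: periods are simply dropped
        if isinstance(value, dict):
            value = bson_cleanup_sketch(value)
        elif isinstance(value, dt.date):
            value = value.isoformat()
        elif key == "_id" and isinstance(value, str):
            value = value.replace(".", "")
        cleaned[new_key] = value
    return cleaned


raw = {"_id": "j.doe", "e.mail": "j@example.com", "visit": {"day": dt.date(2020, 1, 2)}}
print(bson_cleanup_sketch(raw))
```

Periods matter because MongoDB treats them as path separators in field names, and datetime.date is not a type the BSON encoder accepts.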

regolith.mongoclient.doc_cleanup(doc: dict)[source]
regolith.mongoclient.export_json(collection: str, dbpath: str, dbname: str, host: str | None = None, uri: str | None = None) None[source]
regolith.mongoclient.import_jsons(dbpath: str, dbname: str, host: str | None = None, uri: str | None = None) None[source]

Import the json files to mongo db.

Each json file will be a collection in the database. The _id will be the same as it is in the json file.

Parameters:
  • dbpath (str) – The path to the db folder.

  • dbname (str) – The name of the database in mongo.

  • host (str) – The hostname or IP address or Unix domain socket path of a single mongod or mongos instance to connect to, or a mongodb URI, or a list of hostnames / mongodb URIs.

  • uri (str) – Specify a resolvable URI connection string (enclose in quotes) to connect to the MongoDB deployment.

regolith.mongoclient.import_yamls(dbpath: str, dbname: str, host: str | None = None, uri: str | None = None) None[source]

Import the yaml files to mongo db.

Each yaml file will be a collection in the database. The _id will be the id_key for each doc in the yaml file.

Parameters:
  • dbpath (str) – The path to the db folder.

  • dbname (str) – The name of the database in mongo.

  • host (str) – The hostname or IP address or Unix domain socket path of a single mongod or mongos instance to connect to, or a mongodb URI, or a list of hostnames / mongodb URIs.

  • uri (str) – Specify a resolvable URI connection string (enclose in quotes) to connect to the MongoDB deployment.

regolith.mongoclient.load_mongo_col(col: Collection) dict[source]

Load the pymongo collection to a dictionary.

In the dictionary, each key will be the ‘_id’, and each value, which is itself a dictionary, will also contain a key ‘_id’, so that the structure will be the same as the filesystem collection.

Parameters:

col (Collection) – The mongodb collection.

Returns:

dct – A dictionary with all the info in the collection.

Return type:

dict
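
The resulting structure can be sketched without a running MongoDB by using a plain list of documents in place of the collection. The helper name docs_by_id is hypothetical; with pymongo, the iterable would presumably come from the collection's find() cursor.

```python
def docs_by_id(docs):
    """Key each document by its '_id' while keeping '_id' inside the document."""
    return {doc["_id"]: doc for doc in docs}


# Stand-in for the documents a collection would yield.
fake_collection = [
    {"_id": "ada", "name": "Ada Lovelace"},
    {"_id": "bob", "name": "Bob Smith"},
]
by_id = docs_by_id(fake_collection)
print(by_id["ada"])
```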

regolith.runcontrol module

Run Control object for regolith

regolith.runcontrol.NotSpecified = NotSpecified

A helper class singleton for run control meaning that a ‘real’ value has not been given.

class regolith.runcontrol.NotSpecifiedType[source]

Bases: object

A helper class singleton for run control meaning that a ‘real’ value has not been given.
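
The point of such a singleton is to tell “no value was given” apart from a real value such as None. A minimal sketch of the pattern follows; NotSpecifiedSketch, NOT_SPECIFIED and get_option are hypothetical names, not regolith's implementation.

```python
class NotSpecifiedSketch:
    """Sentinel type with exactly one instance."""

    _instance = None

    def __new__(cls):
        # Always hand back the same object so identity checks are reliable.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __repr__(self):
        return "NotSpecified"


NOT_SPECIFIED = NotSpecifiedSketch()


def get_option(options, name, default=NOT_SPECIFIED):
    """Look up an option; None is a legitimate default, the sentinel is not."""
    if name in options:
        return options[name]
    if default is NOT_SPECIFIED:
        raise KeyError(name)
    return default


print(get_option({"database": None}, "database"))
print(get_option({}, "builddir", default=None))
```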

class regolith.runcontrol.RunControl(_updaters=None, _validators=None, **kwargs)[source]

Bases: object

A composable configuration class. Unlike argparse.Namespace, this keeps the object dictionary (__dict__) separate from the run control attributes dictionary (_dict).
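
A minimal sketch of how a configuration object can keep its settings in a separate _dict instead of the instance __dict__. MiniRunControl is a hypothetical illustration; it omits the updaters and validators that the real signature mentions.

```python
class MiniRunControl:
    """Stores settings in self._dict; the instance __dict__ only holds _dict itself."""

    def __init__(self, **kwargs):
        # Bypass our own __setattr__ so `_dict` lands in the real __dict__.
        object.__setattr__(self, "_dict", dict(kwargs))

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for run control attributes.
        if name == "_dict":
            raise AttributeError(name)  # guard against recursion before __init__
        try:
            return self._dict[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self._dict[name] = value

    def __contains__(self, name):
        return name in self._dict


rc = MiniRunControl(builddir="_build")
rc.force = True
print(rc.builddir, rc.force, "database" in rc)
print(list(vars(rc)))
```

Keeping the settings in their own dictionary makes it easy to copy, merge or serialize the configuration without picking up internal attributes.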

regolith.runcontrol.connect_db(rc, colls=None)[source]

Load up the db’s

Parameters:
  • rc – The runcontrol instance

  • colls – The list of collections that should be loaded

Returns:

  • chained_db – The chained databases in the form of a document

  • dbs – The databases in the form of a runcontrol client

regolith.runcontrol.ensuredirs(f)[source]

For a file path, ensure that its directory path exists.
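
A minimal sketch of the same idea using only the standard library; ensuredirs_sketch is a hypothetical name and the real function may differ in detail.

```python
import os
import tempfile


def ensuredirs_sketch(f):
    """Create the parent directory of file path `f` if it does not exist yet."""
    d = os.path.dirname(f)
    if d:  # a bare filename has no directory part to create
        os.makedirs(d, exist_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "_build", "html", "index.html")
    ensuredirs_sketch(target)
    created = os.path.isdir(os.path.dirname(target))
    print(created)
```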

regolith.runcontrol.exec_file(filename, glb=None, loc=None)[source]

A function equivalent to the Python 2.x execfile statement.
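
In Python 3 the usual replacement for execfile is to read, compile and exec the file. The sketch below follows that recipe; exec_file_sketch is a hypothetical name, and returning the globals dict is an addition made here for convenience, not something the docstring promises.

```python
import os
import tempfile


def exec_file_sketch(filename, glb=None, loc=None):
    """Run the Python source in `filename` inside the given namespaces."""
    if glb is None:
        glb = {}
    with open(filename) as f:
        source = f.read()
    # Compiling with the filename gives tracebacks that point at the file.
    exec(compile(source, filename, "exec"), glb, loc)
    return glb


with tempfile.TemporaryDirectory() as tmp:
    rc_path = os.path.join(tmp, "settings.py")
    with open(rc_path, "w") as f:
        f.write("builddir = '_build'\n")
    namespace = exec_file_sketch(rc_path)
    print(namespace["builddir"])
```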

regolith.runcontrol.filter_databases(rc)[source]

Filters the databases list down to only the ones we need, in place.

regolith.runcontrol.flatten(iterable)[source]

Generator which returns flattened version of nested sequences.
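
A minimal sketch of such a generator, assuming that only lists and tuples count as nested sequences and that strings are treated as single items; flatten_sketch is a hypothetical name.

```python
def flatten_sketch(iterable):
    """Yield the leaves of arbitrarily nested lists and tuples."""
    for item in iterable:
        if isinstance(item, (list, tuple)):
            yield from flatten_sketch(item)
        else:
            yield item


flat = list(flatten_sketch([1, [2, (3, [4])], "ab"]))
print(flat)
```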

regolith.runcontrol.ishashable(x)[source]

Tests if a value is hashable.
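
One common way to write such a test is to call hash() and catch the TypeError, sketched here under the hypothetical name ishashable_sketch.

```python
def ishashable_sketch(x):
    """Return True if hash(x) succeeds."""
    try:
        hash(x)
    except TypeError:
        return False
    return True


print(ishashable_sketch(("a", 1)), ishashable_sketch(["a", 1]))
```

Trying the operation is more reliable than checking the type, since a tuple that contains a list is still unhashable.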

regolith.runcontrol.load_json_rcfile(fname)[source]

Loads a JSON run control file.

regolith.runcontrol.load_rcfile(fname)[source]

Loads a run control file.

regolith.runcontrol.touch(filename)[source]

Opens a file and updates the mtime, like the posix command of the same name.
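
A minimal sketch of the behaviour described: open the file in append mode, which creates it if needed and leaves existing content alone, then reset its timestamps. The name touch_sketch is hypothetical.

```python
import os
import tempfile


def touch_sketch(filename):
    """Create `filename` if missing and set its mtime to the current time."""
    with open(filename, "a"):
        os.utime(filename, None)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "stamp.txt")
    touch_sketch(path)            # creates the file
    existed = os.path.exists(path)
    os.utime(path, (0, 0))        # pretend the file is very old
    touch_sketch(path)            # refreshes the mtime
    refreshed = os.path.getmtime(path) > 0
    print(existed, refreshed)
```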

regolith.runcontrol.warn_forbidden_name(forname, inname=None, rename=None)[source]

Warns the user that a forbidden name has been found.

regolith.schemas module

Database schemas, examples, and tools

class regolith.schemas.NoDescriptionValidator(*args, **kwargs)[source]

Bases: Validator

Validator class. Normalizes and/or validates any mapping against a validation-schema which is provided as an argument at class instantiation or upon calling the validate(), validated() or normalized() method. An instance itself is callable and executes a validation.

All instantiation parameters are optional.

There are the introspective properties types, validators, coercers, default_setters, rules, normalization_rules and validation_rules.

The attributes reflecting the available rules are assembled considering constraints that are defined in the docstrings of rules’ methods and is effectively used as validation schema for schema.

Parameters:
  • schema (any mapping) – See schema. Defaults to None.

  • ignore_none_values (bool) – See ignore_none_values. Defaults to False.

  • allow_unknown (bool or any mapping) – See allow_unknown. Defaults to False.

  • require_all (bool) – See require_all. Defaults to False.

  • purge_unknown (bool) – See purge_unknown. Defaults to False.

  • purge_readonly (bool) – Removes all fields that are defined as readonly in the normalization phase.

  • error_handler (class or instance based on BaseErrorHandler or tuple) – The error handler that formats the result of errors. When given as two-value tuple with an error-handler class and a dictionary, the latter is passed to the initialization of the error handler. Default: BasicErrorHandler.

checkers = ()
coercers = ()
default_setters = ()
normalization_rules = {'coerce': {'oneof': [{'type': 'callable'}, {'schema': {'oneof': [{'type': 'callable'}, {'allowed': (), 'type': 'string'}]}, 'type': 'list'}, {'allowed': (), 'type': 'string'}]}, 'default': {'nullable': True}, 'default_setter': {'oneof': [{'type': 'callable'}, {'allowed': (), 'type': 'string'}]}, 'purge_unknown': {'type': 'boolean'}, 'rename': {'type': 'hashable'}, 'rename_handler': {'oneof': [{'type': 'callable'}, {'schema': {'oneof': [{'type': 'callable'}, {'allowed': (), 'type': 'string'}]}, 'type': 'list'}, {'allowed': (), 'type': 'string'}]}}
rules = {'allof': {'logical': 'allof', 'type': 'list'}, 'allow_unknown': {'oneof': [{'type': 'boolean'}, {'check_with': 'bulk_schema', 'type': ['dict', 'string']}]}, 'allowed': {'type': 'container'}, 'anyof': {'logical': 'anyof', 'type': 'list'}, 'check_with': {'oneof': [{'type': 'callable'}, {'schema': {'oneof': [{'type': 'callable'}, {'allowed': (), 'type': 'string'}]}, 'type': 'list'}, {'allowed': (), 'type': 'string'}]}, 'coerce': {'oneof': [{'type': 'callable'}, {'schema': {'oneof': [{'type': 'callable'}, {'allowed': (), 'type': 'string'}]}, 'type': 'list'}, {'allowed': (), 'type': 'string'}]}, 'contains': {'empty': False}, 'default': {'nullable': True}, 'default_setter': {'oneof': [{'type': 'callable'}, {'allowed': (), 'type': 'string'}]}, 'dependencies': {'check_with': 'dependencies', 'type': ('dict', 'hashable', 'list')}, 'description': {'type': 'string'}, 'eallowed': {'type': 'list'}, 'empty': {'type': 'boolean'}, 'excludes': {'schema': {'type': 'hashable'}, 'type': ('hashable', 'list')}, 'forbidden': {'type': 'list'}, 'items': {'check_with': 'items', 'type': 'list'}, 'keysrules': {'check_with': 'bulk_schema', 'forbidden': ['rename', 'rename_handler'], 'type': ['dict', 'string']}, 'max': {'nullable': False}, 'maxlength': {'type': 'integer'}, 'meta': {}, 'min': {'nullable': False}, 'minlength': {'type': 'integer'}, 'noneof': {'logical': 'noneof', 'type': 'list'}, 'nullable': {'type': 'boolean'}, 'oneof': {'logical': 'oneof', 'type': 'list'}, 'purge_unknown': {'type': 'boolean'}, 'readonly': {'type': 'boolean'}, 'regex': {'type': 'string'}, 'rename': {'type': 'hashable'}, 'rename_handler': {'oneof': [{'type': 'callable'}, {'schema': {'oneof': [{'type': 'callable'}, {'allowed': (), 'type': 'string'}]}, 'type': 'list'}, {'allowed': (), 'type': 'string'}]}, 'require_all': {'type': 'boolean'}, 'required': {'type': 'boolean'}, 'schema': {'anyof': [{'check_with': 'schema'}, {'check_with': 'bulk_schema'}], 'type': ['dict', 'string']}, 'type': {'check_with': 'type', 'type': ['string', 'list']}, 'valuesrules': {'check_with': 'bulk_schema', 'forbidden': ['rename', 'rename_handler'], 'type': ['dict', 'string']}}
validation_rules = {'allof': {'logical': 'allof', 'type': 'list'}, 'allow_unknown': {'oneof': [{'type': 'boolean'}, {'check_with': 'bulk_schema', 'type': ['dict', 'string']}]}, 'allowed': {'type': 'container'}, 'anyof': {'logical': 'anyof', 'type': 'list'}, 'check_with': {'oneof': [{'type': 'callable'}, {'schema': {'oneof': [{'type': 'callable'}, {'allowed': (), 'type': 'string'}]}, 'type': 'list'}, {'allowed': (), 'type': 'string'}]}, 'contains': {'empty': False}, 'dependencies': {'check_with': 'dependencies', 'type': ('dict', 'hashable', 'list')}, 'description': {'type': 'string'}, 'eallowed': {'type': 'list'}, 'empty': {'type': 'boolean'}, 'excludes': {'schema': {'type': 'hashable'}, 'type': ('hashable', 'list')}, 'forbidden': {'type': 'list'}, 'items': {'check_with': 'items', 'type': 'list'}, 'keysrules': {'check_with': 'bulk_schema', 'forbidden': ['rename', 'rename_handler'], 'type': ['dict', 'string']}, 'max': {'nullable': False}, 'maxlength': {'type': 'integer'}, 'meta': {}, 'min': {'nullable': False}, 'minlength': {'type': 'integer'}, 'noneof': {'logical': 'noneof', 'type': 'list'}, 'nullable': {'type': 'boolean'}, 'oneof': {'logical': 'oneof', 'type': 'list'}, 'readonly': {'type': 'boolean'}, 'regex': {'type': 'string'}, 'require_all': {'type': 'boolean'}, 'required': {'type': 'boolean'}, 'schema': {'anyof': [{'check_with': 'schema'}, {'check_with': 'bulk_schema'}], 'type': ['dict', 'string']}, 'type': {'check_with': 'type', 'type': ['string', 'list']}, 'valuesrules': {'check_with': 'bulk_schema', 'forbidden': ['rename', 'rename_handler'], 'type': ['dict', 'string']}}
regolith.schemas.insert_alloweds(doc, alloweds, key)[source]
regolith.schemas.load_exemplars()[source]
regolith.schemas.load_schemas()[source]
regolith.schemas.validate(coll, record, schemas)[source]

Validate a record for a given db

Parameters:
  • coll (str) – The name of the db in question

  • record (dict) – The record to be validated

  • schemas (dict) – The schema to validate against

Returns:

  • rtn (bool) – True if valid

  • errors (dict) – The errors encountered (if any)

regolith.sorters module

Builder for websites.

regolith.sorters.category_val(document)[source]

Convert the category of a document into string of category info.

Parameters:

document (dict) – The dict of all corresponding categories for objects

Return type:

The string of the category item.

regolith.sorters.date_key(x)[source]
regolith.sorters.doc_date_key(document)[source]

Convert a dict of Datetime object to float serialization of date info.

Parameters:

document (dict) – the document that is expected to contain date-like information in “year” and/or “month”

Return type:

the float serialization of the date information in the document

regolith.sorters.doc_date_key_high(document)[source]

Convert a dict of highest Datetime object to float serialization of date info.

Parameters:

document (dict) – the document that is expected to contain date-like information in “end_year” and/or “end_month”

Return type:

the float serialization of the end date information in the document

regolith.sorters.ene_date_key(document)[source]

Convert a dict of ene Datetime object to float serialization of date info.

Parameters:

document (dict) – the document that is expected to contain date-like information in “end_year” and/or “end_month”

Return type:

the float serialization of the date information in the document

regolith.sorters.id_key(document)[source]

Convert the id-key of a document into a string

Parameters:

document (dict) – The document

Return type:

The string of the _id

regolith.sorters.level_val(document)[source]

Convert the level of a document into string of category info.

Parameters:

document (dict) – The document

Return type:

The string representing the level.

regolith.sorters.position_key(x)[source]

Sorts people based on their position in the research group.

regolith.storage module

Tools for document storage.

class regolith.storage.StorageClient(rc, store, path)[source]

Bases: object

Interface to the storage system

copydoc(doc)[source]

Copies file to the staging area.

retrieve(file_name)[source]

Get file from the store

Parameters:

file_name (name of the file)

Returns:

path – The path, or None if the file is not in the store

Return type:

str or None

regolith.storage.copydocs(store, path, rc)[source]

Copies files to the staging area.

regolith.storage.find_store(rc)[source]
regolith.storage.main(rc)[source]

Copies files into the local storage location and uploads them.

regolith.storage.push(store, path)[source]

Pushes the local documents.

regolith.storage.push_git(store, path)[source]

Pushes the local documents via git.

regolith.storage.push_hg(store, path)[source]

Pushes the local documents via hg.

regolith.storage.storage_path(store, rc)[source]

Computes the storage directory.

regolith.storage.store_client(rc)[source]

Context manager for file storage

Parameters:

rc (RunControl)

Yields:

client (StorageClient) – The StorageClient instance

regolith.storage.sync(store, path)[source]

Syncs the local documents.

regolith.storage.sync_git(store, path)[source]

Syncs the local documents via git.

regolith.storage.sync_hg(store, path)[source]

Syncs the local documents via hg.

regolith.stylers module

A collection of python stylers

regolith.stylers.sentencecase(sentence)[source]

returns a sentence in sentencecase but with text in braces preserved

Parameters:

sentence (str) – The sentence

Return type:

The sentence in sentence-case (but preserving any text wrapped in braces)

Notes

tbd or n/a are returned lower case, not sentence case.
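
The behaviour described above can be sketched as follows. This is a minimal illustration written for this page, not the library's code: the name sentencecase_sketch and the choice to leave nested braces unhandled are assumptions.

```python
import re


def sentencecase_sketch(sentence):
    """Hypothetical stand-in: sentence-case text, leaving {braced} spans as-is."""
    if sentence.strip().lower() in ("tbd", "n/a"):
        return sentence.lower()
    # The capture group keeps the braced spans in the split result.
    parts = re.split(r"(\{[^{}]*\})", sentence)
    lowered = [
        p if (p.startswith("{") and p.endswith("}")) else p.lower()
        for p in parts
    ]
    # Capitalize the first character only when the sentence opens with plain text.
    if lowered and lowered[0]:
        lowered[0] = lowered[0][0].upper() + lowered[0][1:]
    return "".join(lowered)


title = sentencecase_sketch("the {XRD} Study of {NaCl}")
```

Wrapping text in braces is the usual BibTeX convention for protecting capitalization, which is why braced spans are passed through untouched.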

regolith.tools module

Misc. regolith tools.

regolith.tools.add_to_google_calendar(event)[source]

Takes a newly created event, and adds it to the user’s google calendar

Parameters:
event - a dictionary containing the event details to be added to google calendar

https://developers.google.com/calendar/api/v3/reference/events

Returns:

None

regolith.tools.all_docs_from_collection(client, collname, copy=True)[source]

Yield all entries in all collections of a given name in a given database.

regolith.tools.awards(p, since=None, before=None)[source]

Make sorted awards and honors

Parameters:
  • p (dict) – The person entry

  • since (date. Optional, default is None) – The begin date to filter from

  • before (date. Optional, default is None) – The end date to filter for. None does not apply this filter

regolith.tools.awards_grants_honors(person, target_name, funding=True, service_types=None)[source]

Make sorted awards grants and honors list.

Parameters:

person (dict) – The person entry

regolith.tools.collect_appts(ppl_coll, filter_key=None, filter_value=None, begin_date=None, end_date=None)[source]

Retrieves a list of all the appointments on the given grant(s) in the given interval of time for each person in the given people collection.

Parameters:
  • ppl_coll (collection (list of dicts)) – The people collection containing persons with appointments

  • filter_key (string, list, optional) – The key we want to filter appointments by

  • filter_value (string, int, float, list, optional) – The values for each key that we want to filter appointments by

  • begin_date (string, datetime, optional) – The start date for the interval in which we want to collect appointments

  • end_date (string, datetime, optional) – The end date for the interval in which we want to collect appointments

Returns:

a list of all appointments in the people collection that satisfy the provided conditions (if any)

Return type:

list

Examples

>>> collect_appts(people, filter_key=['grant', 'status'], filter_value=['mrsec14', 'finalized'],
...               begin_date='2020-09-01', end_date='2020-12-31')

This would return all appointments on the grant 'mrsec14' with status 'finalized' that are valid on/during any dates from 2020-09-01 to 2020-12-31.

>>> collect_appts(people, filter_key=['grant', 'grant'], filter_value=['mrsec14', 'dmref19'])

This would return all appointments on the grants 'mrsec14' and 'dmref19' irrespective of their dates.

regolith.tools.collection_str(collection, keys=None)[source]

Builds a string listing the requested fields of every document in the collection

Parameters:
  • collection (generator) – The collection containing the documents

  • keys (list, optional) – The names of the fields to return. Defaults to None, in which case only the id is returned

Returns:

A str of all the values

Return type:

str

regolith.tools.compound_dict(doc, li)[source]

Recursive function that collects all the strings from a document that is a dictionary

Parameters:
  • doc (dict) – The specific document we are traversing

  • li – The recursive list that holds all the strings

Returns:

The strings that make up the nested attributes of this object

Return type:

list of strings

regolith.tools.compound_list(doc, li)[source]

Recursive function that collects all the strings from a document that is a list

Parameters:
  • doc (list) – The specific document we are traversing

  • li – The recursive list that holds all the strings

Returns:

The strings that make up the nested attributes of this list

Return type:

list of strings
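
The recursive collection performed by compound_dict and compound_list can be illustrated with a single combined helper. The name collect_strings and the decision to skip non-string leaves are assumptions for this sketch; the library splits the work across two functions.

```python
def collect_strings(doc, li):
    """Hypothetical helper: append every string nested inside doc to li."""
    if isinstance(doc, str):
        li.append(doc)
    elif isinstance(doc, dict):
        for value in doc.values():
            collect_strings(value, li)
    elif isinstance(doc, (list, tuple)):
        for item in doc:
            collect_strings(item, li)
    return li  # non-string leaves (numbers, None) are skipped


person = {"name": "A. Person", "year": 2020,
          "education": [{"degree": "PhD", "institution": "Somewhere U"}]}
strings = collect_strings(person, [])
```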

regolith.tools.create_repo(destination_id, token_info_id, rc)[source]

Creates a repo at the target destination

tries to fail gracefully if repo information and token is not defined

Parameters:
  • destination_id (str) – The id of the target repo information document

  • token_info_id (str) – The id for the token info document (e.g. ‘priv_token’)

  • rc (run control object) – The run control object that should contain rc.repos and rc.tokens docs

Returns:

Success message (repo target_repo has been created in talks) if the repo is successfully created in target_repo. Warning/setup messages if unsuccessful (or if the repo info or token are not valid).

regolith.tools.date_to_rfc822(y, m, d=1)[source]

Converts a date to an RFC 822 formatted string.
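
One way to produce such a string with the standard library is shown below. The name date_to_rfc822_sketch and the restriction to integer year, month and day are assumptions; the library function may accept other input forms.

```python
from datetime import datetime
from email.utils import format_datetime


def date_to_rfc822_sketch(y, m, d=1):
    """Hypothetical stand-in: format an integer date as an RFC 822 style string."""
    # A naive datetime is rendered with the conventional "-0000" zone.
    return format_datetime(datetime(int(y), int(m), int(d)))


stamp = date_to_rfc822_sketch(2020, 9)
```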

regolith.tools.dbdirname(db, rc)[source]

Gets the database dir name.

regolith.tools.dbpathname(db, rc)[source]

Gets the database path name.

regolith.tools.dereference_institution(input_record, institutions, verbose=False)[source]

Tool for replacing placeholders for institutions with the actual institution data. Note that the replacement is done in place.

Parameters:
  • input_record (dict) – The record to dereference

  • institutions (iterable of dicts) – The institutions

Return type:

nothing

regolith.tools.document_by_value(documents, address, value)[source]

Get a specific document by one of its values

Parameters:
  • documents (generator) – Generator which yields the documents

  • address (str or tuple) – The address of the data in the document

  • value (any) – The expected value for the document

Returns:

The first document which matches the request

Return type:

dict
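
A lookup of this kind can be sketched as below, treating a tuple address as a path into nested data. The name document_by_value_sketch and the choice to return None when nothing matches are assumptions for illustration only.

```python
def document_by_value_sketch(documents, address, value):
    """Hypothetical stand-in: return the first document whose nested value matches."""
    if isinstance(address, str):
        address = (address,)  # a single key is treated as a one-step path
    for doc in documents:
        node = doc
        try:
            for key in address:
                node = node[key]
        except (KeyError, IndexError, TypeError):
            continue  # this document has no data at the address
        if node == value:
            return doc
    return None  # assumed behaviour when nothing matches


people = [{"_id": "a", "contact": {"email": "a@example.com"}},
          {"_id": "b", "contact": {"email": "b@example.com"}}]
hit = document_by_value_sketch(iter(people), ("contact", "email"), "b@example.com")
```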

regolith.tools.fallback(cond, backup)[source]

Decorator for returning the object if cond is true and a backup if cond is false.
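
A decorator matching this description can be written in a few lines. The sketch below uses the hypothetical name fallback_sketch and is meant only to show the pattern, for example swapping in a simpler implementation when an optional dependency is missing.

```python
def fallback_sketch(cond, backup):
    """Hypothetical stand-in: decorator that keeps the object only if cond is true."""
    def decorator(obj):
        return obj if cond else backup
    return decorator


@fallback_sketch(False, backup=len)
def count(items):
    # Never used: cond is False, so the name "count" is bound to len instead.
    return sum(1 for _ in items)


@fallback_sketch(True, backup=len)
def count_kept(items):
    # Kept: cond is True, so the decorated function is returned unchanged.
    return sum(1 for _ in items)
```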

regolith.tools.filter_activities(people, begin_period, type, verbose=False)[source]
regolith.tools.filter_committees(person, begin_period, type)[source]
regolith.tools.filter_employment_for_advisees(peoplecoll, begin_period, status, advisor, now=None)[source]

Filter people to get advisees since begin_period

Parameters:
  • peoplecoll (list of dicts) – The people collection

  • begin_period (date) – Only select advisees who were active after this date (i.e., their end date is after begin_period)

  • status (str) – the status of the person in the group to filter for, e.g., ms, phd, postdoc

regolith.tools.filter_facilities(people, begin_period, type, verbose=False)[source]
regolith.tools.filter_grants(input_grants, names, pi=True, reverse=True, multi_pi=False)[source]

Filter grants by those involved

Parameters:
  • input_grants (list of dict) – The grants to filter

  • names (set of str) – The authors to be filtered against

  • pi (bool, optional) – If True add the grant amount to that person’s total amount

  • reverse (bool, optional) – If True reverse the order, defaults to True

  • multi_pi (bool, optional) – If True compute sub-awards for multi PI grants, defaults to False

regolith.tools.filter_licenses(patentscoll, people, target, since=None, before=None)[source]
regolith.tools.filter_patents(patentscoll, people, target, since=None, before=None)[source]
regolith.tools.filter_presentations(people, presentations, institutions, target, types=None, since=None, before=None, statuses=None)[source]
regolith.tools.filter_projects(projects, people, reverse=False, active_only=False, group=None, ptype=None)[source]

Filter projects by the author(s)

Parameters:
  • projects (list of dict) – The projects

  • people (set of list of str) – The people to be filtered against

  • reverse (bool, optional) – If True reverse the order, defaults to False

  • since (date, optional) – The date after which a highlight must be for a project to be returned, defaults to None

  • before (date, optional) – The date before which a highlight must be for a project to be returned, defaults to None

  • active_only (bool, optional) – Only active projects will be returned if True, defaults to False

  • group (str, optional) – Only projects from this group will be returned if specified, otherwise projects from all groups will be returned, defaults to None

  • ptype (str, optional) – The type of the project to filter for, such as ossoftware for open source software, defaults to None

regolith.tools.filter_publications(citations, authors, reverse=False, bold=True, since=None, before=None, ackno=False, grants=None, facilities=None)[source]

Filter publications by the author(s)/editor(s)

Parameters:
  • citations (list of dict) – The publication citations

  • authors (set of str) – The authors to be filtered against

  • reverse (bool, optional) – If True reverse the order, defaults to False

  • bold (bool, optional) – If True put latex bold around the author(s) in question

  • since (date, optional) – The date after which papers must have been published

  • before (date, optional) – The date before which papers must have been published

  • ackno (bool) – Move the acknowledgement statement to note so that it is displayed in the publication list

  • grants (string or list of strings, optional) – The grant or grants to filter over

  • facilities (string, optional) – The facilities to filter over

regolith.tools.filter_service(p, begin_period, type)[source]
regolith.tools.fragment_retrieval(coll, fields, fragment, case_sensitive=False)[source]

Retrieves a list of all documents from the collection where the fragment appears in any one of the given fields

Parameters:
  • coll (generator) – The collection containing the documents

  • fields (iterable) – The fields of each document to check for the fragment

  • fragment – The value to compare against to find the documents of interest

  • case_sensitive (bool) – When True, match case (default is False)

Returns:

A list of documents (that are dicts)

Return type:

list

Examples

>>> fragment_retrieval(people, ['aka', 'name'], 'pi_name', case_sensitive = False)

This would get all people for which either the alias or the name included the substring pi_name.
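
The substring matching described here can be sketched without the library as follows. The name fragment_retrieval_sketch and the handling of list-valued fields such as "aka" are assumptions made for the example.

```python
def fragment_retrieval_sketch(coll, fields, fragment, case_sensitive=False):
    """Hypothetical stand-in: keep docs where fragment is a substring of any field."""
    needle = fragment if case_sensitive else fragment.lower()
    hits = []
    for doc in coll:
        for field in fields:
            value = doc.get(field, "")
            # A field such as "aka" may hold a list of aliases.
            values = value if isinstance(value, (list, tuple)) else [value]
            texts = [str(v) if case_sensitive else str(v).lower() for v in values]
            if any(needle in text for text in texts):
                hits.append(doc)
                break  # avoid adding the same document twice
    return hits


people = [{"_id": "jdoe", "name": "Jane Doe", "aka": ["J. Doe"]},
          {"_id": "rroe", "name": "Richard Roe", "aka": ["Dick Roe"]}]
found = fragment_retrieval_sketch(people, ["aka", "name"], "doe")
```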

regolith.tools.fuzzy_retrieval(documents, sources, value, case_sensitive=True)[source]

Retrieve a document from the documents where value is compared against multiple potential sources

Parameters:
  • documents (generator) – The documents

  • sources (iterable) – The potential data sources

  • value – The value to compare against to find the document of interest

  • case_sensitive (bool) – When True, match case (default is True)

Returns:

The document

Return type:

dict

Examples

>>> fuzzy_retrieval(people, ['aka', 'name'], 'pi_name', case_sensitive = False)

This would get the person entry for which either the alias or the name was pi_name.

regolith.tools.get_appointments(person, appointments, target_grant=None)[source]

get appointments from a person from the people collection

Parameters:
  • person (dict) – The person from whom to harvest appointments

  • appointments (list of tuples) – The list of appointments. Each tuple contains (person_id, begin-date, end-date, loading (a number between 0 and 1), and weighted duration (i.e., actual duration in months * loading) in units of months)

  • target_grant (str) – optional. id of grant for which you want to search for appointments. If not specified it will return appointments for that person in that date range for all/any grants

Return type:

updated appointments list

regolith.tools.get_formatted_crossref_reference(doi)[source]

given a doi, return the full reference and the date of the reference from Crossref REST-API

Parameters:

doi (str) – the doi of the digital object to pull from Crossref

Returns:

  • ref (str) – the nicely formatted reference including title

  • ref_date (datetime.date) – the date of the reference

  • Returns (None, None) if the article cannot be found given the doi

regolith.tools.get_id_from_name(coll, name)[source]
regolith.tools.get_person(person_id, rc)[source]

Get the person’s name.

regolith.tools.get_person_contact(name, people_coll, contacts_coll)[source]

Return a person document if found in either people or contacts collections

If the person is found in the people collection this person is returned. If not found in people but found in contacts, the person found in contacts is returned. If the person is not found in either collection, None is returned

Parameters:
  • name (str) – The name or id of the person to look for

  • people_coll (collection (list of dicts)) – The people collection

  • contacts_coll (collection (list of dicts)) – The contacts collection

Returns:

person – The found person document

Return type:

dict

regolith.tools.get_pi_id(rc)[source]

Gets the database id of the group PI

Parameters:

rc (runcontrol object) – The runcontrol object. It must contain the ‘groups’ and ‘people’ collections in the needed databases

Return type:

The database ‘_id’ of the group PI

regolith.tools.get_tags(coll)[source]

Given a collection with a tags field, returns the set of tags as a list

The tags field is expected to be a string with comma or space separated tags. get_tags splits the tags and returns the set of unique tags as a list of strings.

Parameters:

coll (collection) – the collection

Return type:

the set of all tags as a list
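
The splitting step described above can be sketched as follows. The name get_tags_sketch is hypothetical, and sorting the result is an assumption added only to make the output deterministic.

```python
import re


def get_tags_sketch(coll):
    """Hypothetical stand-in: collect unique tags from each document's "tags" string."""
    tags = set()
    for doc in coll:
        # Split on commas and/or whitespace, dropping empty pieces.
        tags.update(t for t in re.split(r"[,\s]+", doc.get("tags", "")) if t)
    return sorted(tags)  # sorted only to make the output deterministic


coll = [{"_id": "a", "tags": "xrd, pdf"},
        {"_id": "b", "tags": "pdf theory"},
        {"_id": "c"}]
tags = get_tags_sketch(coll)
```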

regolith.tools.get_target_repo_info(target_repo_id, repos)[source]

checks if repo information is defined and valid in rc

Parameters:
  • target_repo_id (str) – The id of the doc with the target repo information

  • repos (list) – The list of repos. A repo must have a name, a url and a params kwarg.

Returns:

The target repo document, or False if it is not present or not properly formulated

regolith.tools.get_target_token(target_token_id, tokens)[source]

Checks if API authentication token is defined and valid in rc

Parameters:
target_token_id - string

the name of the personal access token (defined in rc)

rc - run control object

Returns:

The token if the token exists and False if not

regolith.tools.get_team_from_grant(grantcol)[source]
regolith.tools.get_uuid()[source]

returns a uuid.uuid4 string

regolith.tools.gets(seq, key, default=None)[source]

Gets a key from every element of a sequence if possible.
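
A generator with this behaviour can be sketched as below. The name gets_sketch and the set of exceptions treated as "not possible" are assumptions for illustration.

```python
def gets_sketch(seq, key, default=None):
    """Hypothetical stand-in: yield element[key] for each element, else default."""
    for element in seq:
        try:
            yield element[key]
        except (KeyError, IndexError, TypeError):
            yield default


names = list(gets_sketch([{"name": "a"}, {}, {"name": "c"}], "name", default="?"))
```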

regolith.tools.google_cal_auth_flow()[source]

First time authentication, this function opens a window to request user consent to use google calendar API, and then returns a token

regolith.tools.grant_burn(grant, appts, begin_date=None, end_date=None)[source]

Retrieves the total burn of a grant over an interval of time by integrating over all appointments made on the grant.

Parameters:
  • grant (dict) – The grant object whose burn needs to be retrieved

  • appts (collection (list of dicts), dict) – The collection of appointments made on assorted grants

  • begin_date (datetime, string, optional) – The start date of the interval of time to retrieve the grant burn for, either a date object or a string in YYYY-MM-DD format. Defaults to the begin_date of the grant.

  • end_date (datetime, string, optional) – The end date of the interval of time to retrieve the grant burn for, either a date object or a string in YYYY-MM-DD format. Defaults to the end_date of the grant.

Returns:

A dictionary whose keys are the dates and whose values are dicts containing the corresponding grant amounts on that date

Return type:

dict

Examples

>>> grant_burn(mygrant, myappts, begin_date="2020-09-01", end_date="2020-09-03")

returns

{datetime.date(2020, 9, 1): {'student_days': 5.0, 'postdoc_days': 12.0, 'ss_days': 20.0},
 datetime.date(2020, 9, 2): {'student_days': 4.0, 'postdoc_days': 11.5, 'ss_days': 15.0},
 datetime.date(2020, 9, 3): {'student_days': 3.0, 'postdoc_days': 11.0, 'ss_days': 10.0}}

regolith.tools.group(db, by)[source]

Group the document in the database according to the value of the doc[by] in db.

Parameters:
  • db (iterable) – The database of documents.

  • by (str) – The key to group the documents.

Returns:

grouped – A dictionary mapping the feature value of group to the list of docs. All docs in the same list have the same value of doc[by].

Return type:

dict

Examples

Here, we use a tuple of dicts as an example of the database.

>>> db = ({"k": "v0"}, {"k": "v1"}, {"k": "v0"})
>>> group(db, "k")

This will return

{"v0": [{"k": "v0"}, {"k": "v0"}], "v1": [{"k": "v1"}]}
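
The grouping itself can be sketched with a defaultdict. The name group_sketch is hypothetical, and this is an illustration of the idea rather than the library's code.

```python
from collections import defaultdict


def group_sketch(db, by):
    """Hypothetical stand-in: map each value of doc[by] to the docs that carry it."""
    grouped = defaultdict(list)
    for doc in db:
        grouped[doc[by]].append(doc)
    return dict(grouped)


db = ({"k": "v0"}, {"k": "v1"}, {"k": "v0"})
grouped = group_sketch(db, "k")
```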

regolith.tools.group_member_employment_start_end(person, grpname)[source]

Get start and end dates of group member employment

Parameters:
  • person (dict) – The person whose dates we want

  • grpname – The code for the group we want the dates of employment from

Returns:

The employment periods, with person id, begin and end dates

Return type:

list of dicts

regolith.tools.group_member_ids(ppl_coll, grpname)[source]

Get a list of all group member ids

Parameters:
  • ppl_coll (collection (list of dicts)) – The people collection that should contain the group members

  • grpname (string) – The id of the group in groups.yml

Returns:

The set of ids of the people in the group

Return type:

set

Notes

  • Groups that are being tracked are listed in the groups.yml collection with a name and an id.

  • People are in a group during an educational or employment period.

  • To assign a person to a tracked group during one such period, add a “group” key to that education/employment item with a value that is the group id.

  • This function takes the group id that is passed and searches the people collection for all people that have been assigned to that group in some period of time and returns their ids.

regolith.tools.is_fully_appointed(person, begin_date, end_date)[source]

Checks if a collection of appointments for a person is valid and fully loaded for a given interval of time

Parameters:
  • person (dict) – The person whose appointments need to be checked

  • begin_date (datetime, string, optional) – The start date of the interval of time to check appointments for

  • end_date (datetime, string, optional) – The end date of the interval of time to check appointments for

Returns:

True if the person is fully appointed and False if not

Return type:

bool

Examples

>>> appts = [{"begin_year": 2017, "begin_month": 6, "begin_day": 1, "end_year": 2017,
...           "end_month": 6, "end_day": 15, "grant": "grant1", "loading": 1.0, "type": "pd"},
...          {"begin_year": 2017, "begin_month": 6, "begin_day": 20, "end_year": 2017,
...           "end_month": 6, "end_day": 30, "grant": "grant2", "loading": 1.0, "type": "pd"}]
>>> aejaz = {"name": "Adiba Ejaz", "_id": "aejaz", "appointments": appts}
>>> is_fully_appointed(aejaz, "2017-06-01", "2017-06-30")

In this case, we have an invalid loading from 2017-06-16 to 2017-06-19 hence it would return False and print “appointment gap for aejaz from 2017-06-16 to 2017-06-19”.

regolith.tools.key_value_pair_filter(collection, arguments)[source]

Retrieves all documents from the collection in which each of the given fields contains its accompanying substring

Parameters:
  • collection (generator) – The collection containing the documents

  • arguments (list) – The name of the fields to look for and their accompanying substring

Returns:

The collection containing the elements that satisfy the search criteria

Return type:

generator

Examples

>>> key_value_pair_filter(people, ['name', 'ab', 'position', 'professor'])

This would get all people for which their name contains the string ‘ab’ and whose position is professor and return them
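
The alternating field/substring list can be unpacked into pairs and applied as a filter, as sketched below. The name key_value_pair_filter_sketch and the case-insensitive comparison are assumptions; the library's matching rules may differ.

```python
def key_value_pair_filter_sketch(collection, arguments):
    """Hypothetical stand-in: keep docs where every field contains its substring."""
    if len(arguments) % 2:
        raise ValueError("arguments must alternate field names and substrings")
    pairs = list(zip(arguments[0::2], arguments[1::2]))
    for doc in collection:
        # Case-insensitive matching is an assumption of this sketch.
        if all(sub.lower() in str(doc.get(field, "")).lower()
               for field, sub in pairs):
            yield doc


people = [{"name": "Abby Example", "position": "professor"},
          {"name": "Abel Sample", "position": "postdoc"},
          {"name": "Zed Other", "position": "professor"}]
matches = list(key_value_pair_filter_sketch(
    people, ["name", "ab", "position", "professor"]))
```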

regolith.tools.latex_safe(s, url_check=True, wrapper='url')[source]

Make string latex safe

Parameters:
  • s (str)

  • url_check (bool, optional) – If True check for URLs and wrap them, if False check for URL but don’t wrap, defaults to True

  • wrapper (str, optional) – The wrapper for wrapping urls defaults to url

regolith.tools.latex_safe_url(s)[source]

Makes a string that is a URL latex safe.

regolith.tools.make_bibtex_file(pubs, pid, person_dir='.')[source]

Make a bibtex file given the publications

Parameters:
  • pubs (list of dict) – The publications

  • pid (str) – The person id

  • person_dir (str, optional) – The person’s directory

regolith.tools.merge_collections_all(a, b, target_id)[source]

merge two collections into a single merged collection

for keys that are in both collections, the value in b will be kept

Parameters:
  • a – the inferior collection (will lose values of shared keys)

  • b – the superior collection (will keep values of shared keys)

  • target_id (str) – the name of the key used in b to dereference ids in a

Returns:

The combined collection. Note that it returns a collection containing all items from a and b, with the items dereferenced in b merged with the dereferenced items in a. See also merge_collections_intersect, which returns a collection of just the items that are referenced in both.

Examples

>>>  grants = merge_collections_all(self.gtx["proposals"], self.gtx["grants"], "proposal_id")

This would merge all entries in the proposals collection with entries in the grants collection for which “_id” in proposals has the value of “proposal_id” in grants, returning also unchanged any other entries that are not linked.

regolith.tools.merge_collections_intersect(a, b, target_id)[source]

merge two collections such that just the intersection is returned

for shared keys that are in both collections, the value in b will be kept

Parameters:
  • a – the inferior collection (will lose values of shared keys)

  • b – the superior collection (will keep values of shared keys)

  • target_id (str) – the name of the key used in b to dereference ids in a

Returns:

The combined collection. Note that it returns a collection only containing merged items from a and b that are dereferenced in b, i.e., the merged intersection. See also merge_collections_all, which returns all items in a, b and the intersection, and merge_collections_superior, which returns all items in b and the intersection.

Examples

>>> grants = merge_collections_intersect(self.gtx["proposals"], self.gtx["grants"], "proposal_id")

This would merge all entries in the proposals collection with entries in the grants collection for which “_id” in proposals has the value of “proposal_id” in grants, returning just those items that have the dereference
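
The intersection merge can be sketched with plain dicts as follows. The name merge_intersect_sketch is hypothetical, and letting the "_id" of the merged item come from b is an assumption of this sketch rather than documented behaviour.

```python
def merge_intersect_sketch(a, b, target_id):
    """Hypothetical stand-in: merge each doc in b with the doc in a it references."""
    a_by_id = {doc["_id"]: doc for doc in a}
    merged = []
    for b_doc in b:
        a_doc = a_by_id.get(b_doc.get(target_id))
        if a_doc is not None:
            # Later keys win, so values from b override shared keys from a.
            merged.append({**a_doc, **b_doc})
    return merged


proposals = [{"_id": "p1", "title": "Draft title", "amount": 100},
             {"_id": "p2", "title": "Unfunded"}]
grants = [{"_id": "g1", "proposal_id": "p1", "title": "Final title"},
          {"_id": "g2", "title": "No proposal"}]
merged = merge_intersect_sketch(proposals, grants, "proposal_id")
```

Only the grant that points at a proposal survives; the unlinked proposal and the unlinked grant are both dropped, which is what distinguishes the intersection from the other two merge functions.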

regolith.tools.merge_collections_superior(a, b, target_id)[source]

merge two collections into a single merged collection

for keys that are in both collections, the value in b will be kept

Parameters:
  • a – the inferior collection (will lose values of shared keys)

  • b – the superior collection (will keep values of shared keys)

  • target_id (str) – the name of the key used in b to dereference ids in a

Returns:

The combined collection. Note that it returns a collection containing all items from b, with the items dereferenced in b merged with the dereferenced items in a. See also merge_collections_intersect, which returns only the merged items, and merge_collections_all, which returns all items in a, b and the intersection.

Examples

>>> grants = merge_collections_superior(self.gtx["proposals"], self.gtx["grants"], "proposal_id")

This would merge all entries in the proposals collection with entries in the grants collection for which “_id” in proposals has the value of “proposal_id” in grants, returning also unchanged any other entries in grants that are not linked.

regolith.tools.month_and_year(m=None, y=None)[source]

Creates a string from month and year data, if available.

regolith.tools.number_suffix(number)[source]

returns the suffix that adjectivises a number (st, nd, rd, th)

Parameters:

number (int) – The number. If number is not an integer, returns an empty string

Returns:

suffix – The suffix (st, nd, rd, th)

Return type:

str
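
The suffix rule, including the 11th/12th/13th exception, can be sketched as below. The name number_suffix_sketch is hypothetical, and treating booleans as non-integers is an assumption of this example.

```python
def number_suffix_sketch(number):
    """Hypothetical stand-in: ordinal suffix for an integer, "" otherwise."""
    if isinstance(number, bool) or not isinstance(number, int):
        return ""
    if 11 <= abs(number) % 100 <= 13:
        return "th"  # 11th, 12th, 13th, 111th, ...
    return {1: "st", 2: "nd", 3: "rd"}.get(abs(number) % 10, "th")


suffixes = [number_suffix_sketch(n) for n in (1, 2, 3, 4, 11, 12, 13, 21, 22, 103, 112)]
```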

regolith.tools.print_task(task_list, stati, index=True)[source]

Print tasks in a nice format.

Parameters:
  • task_list (list) – A list of tasks that will be printed.

  • stati (list) – Filter status of the task

regolith.tools.remove_duplicate_docs(coll, key)[source]

find all docs where the target key has the same value and remove duplicates

The doc found first will be kept and subsequent docs will be removed

Parameters:
  • coll (iterable of dicts) – the list of documents

  • key (str) – the key that will be used to compare

Return type:

The list of docs with duplicates (as described above) removed
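
The first-one-wins rule can be sketched as follows. The name remove_duplicate_docs_sketch is hypothetical, and the sketch assumes the compared values are hashable; documents that lack the key are all treated as sharing the value None.

```python
def remove_duplicate_docs_sketch(coll, key):
    """Hypothetical stand-in: keep the first doc seen for each value of doc[key]."""
    seen = set()
    unique = []
    for doc in coll:
        value = doc.get(key)
        if value in seen:
            continue  # a doc with this value was already kept
        seen.add(value)
        unique.append(doc)
    return unique


docs = [{"_id": "a", "doi": "10.1/x"},
        {"_id": "b", "doi": "10.1/y"},
        {"_id": "c", "doi": "10.1/x"}]
unique = remove_duplicate_docs_sketch(docs, "doi")
```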

regolith.tools.rfc822now()[source]

Creates a string of the current time according to RFC 822.

regolith.tools.search_collection(collection, arguments, keys=None)[source]

Retrieves the requested fields of all documents from the collection in which each of the given fields contains its accompanying substring

Parameters:
  • collection (generator) – The collection containing the documents

  • arguments (list) – The name of the fields to look for and their accompanying substring

  • keys (list, optional) – The names of the fields to return from the search. Defaults to None, in which case only the id is returned

Returns:

The collection containing the elements that satisfy the search criteria

Return type:

generator

Examples

>>> search_collection(people, ['name', 'ab', 'position', 'professor'], ['_id', 'name'])

This would get all people for which their name contains the string ‘ab’ and whose position is professor. It would return the name and id of the valid entries

regolith.tools.update_schemas(default_schema, user_schema)[source]

Merges the user schema into the default schema recursively and returns the merged schema. The default schema and user schema are not modified during the merge.

Parameters:
  • default_schema (dict) – The default schema.

  • user_schema (dict) – The user defined schema.

Returns:

updated_schema – The merged schema.

Return type:

dict
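
A non-mutating recursive merge of this kind can be sketched as below. The name update_schemas_sketch and the rule that non-dict values from the user schema simply replace the defaults are assumptions for illustration.

```python
import copy


def update_schemas_sketch(default_schema, user_schema):
    """Hypothetical stand-in: recursively merge user_schema into a copy of default_schema."""
    updated = copy.deepcopy(default_schema)  # leave the inputs untouched
    for key, value in user_schema.items():
        if isinstance(value, dict) and isinstance(updated.get(key), dict):
            updated[key] = update_schemas_sketch(updated[key], value)
        else:
            updated[key] = copy.deepcopy(value)
    return updated


default = {"people": {"name": {"type": "string", "required": True}}}
user = {"people": {"name": {"required": False}, "nickname": {"type": "string"}}}
merged_schema = update_schemas_sketch(default, user)
```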

regolith.tools.validate_doc(collection_name, doc, rc)[source]
regolith.tools.validate_meeting(meeting, date)[source]

Validates a meeting by checking if it has a journal club doi, a presentation link, and a presentation title. This function will return nothing if the meeting is valid, otherwise it will raise a ValueError.

Parameters:
  • meeting (dict) – The meeting object that needs to be validated

  • date (datetime object) – The date we want to use to see if a meeting has happened or not

regolith.validators module

Validators and converters for regolith input.

regolith.validators.always_false(x)[source]

Returns False

regolith.validators.always_true(x)[source]

Returns True

regolith.validators.ensure_database(db)[source]
regolith.validators.ensure_databases(dbs)[source]

Ensures each dataset in a list of databases

regolith.validators.ensure_email(email)[source]

Ensures the email top-level key is well formed.

regolith.validators.ensure_store(store)[source]
regolith.validators.ensure_stores(stores)[source]

Ensures each store in a list of stores

regolith.validators.ensure_string(x)[source]

Returns a string if x is not a string, and x if it already is.

regolith.validators.is_bool(x)[source]

Tests if something is a boolean

regolith.validators.is_int(x)[source]

Tests if something is an integer

regolith.validators.is_string(x)[source]

Tests if something is a string

regolith.validators.noop(x)[source]

Does nothing, just returns the input.

regolith.validators.to_bool(x)[source]

Converts to a boolean in a semantically meaningful way.
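
One reading of "semantically meaningful" is that strings such as "false" or "0" should convert to False even though bool("false") is True in Python. The sketch below illustrates that idea; the name to_bool_sketch and the particular list of false-like strings are assumptions, not the library's definition.

```python
FALSE_STRINGS = frozenset({"", "0", "n", "no", "f", "false", "off", "none"})  # assumed list


def to_bool_sketch(x):
    """Hypothetical stand-in: treat common "false-like" strings as False."""
    if isinstance(x, str):
        return x.strip().lower() not in FALSE_STRINGS
    return bool(x)


flags = [to_bool_sketch(v) for v in ("False", "yes", "0", 0, 1)]
```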