Tools (regolith.tools)
Misc. regolith tools.
- regolith.tools.add_to_google_calendar(event)[source]¶
Takes a newly created event and adds it to the user’s Google Calendar
- Parameters:
- event - a dictionary containing the event details to be added to google calendar
https://developers.google.com/calendar/api/v3/reference/events
- Returns:
None
- regolith.tools.all_docs_from_collection(client, collname, copy=True)[source]¶
Yield all entries in all collections of a given name in a given database.
- regolith.tools.awards(p, since=None, before=None)[source]¶
Make sorted awards and honors
- Parameters:
p (dict) – The person entry
since (date. Optional, default is None) – The begin date to filter from
before (date. Optional, default is None) – The end date to filter for. None does not apply this filter
- regolith.tools.awards_grants_honors(p, target_name, funding=True, service_types=None)[source]¶
Make sorted awards grants and honors list.
- Parameters:
p (dict) – The person entry
- regolith.tools.collect_appts(ppl_coll, filter_key=None, filter_value=None, begin_date=None, end_date=None)[source]¶
Retrieves a list of all the appointments on the given grant(s) in the given interval of time for each person in the given people collection.
- Parameters:
ppl_coll (collection (list of dicts)) – The people collection containing persons with appointments
filter_key (string, list, optional) – The key we want to filter appointments by
filter_value (string, int, float, list, optional) – The values for each key that we want to filter appointments by
begin_date (string, datetime, optional) – The start date for the interval in which we want to collect appointments
end_date (string, datetime, optional) – The end date for the interval in which we want to collect appointments
- Returns:
a list of all appointments in the people collection that satisfy the provided conditions (if any)
- Return type:
list
Examples
>>> collect_appts(people, filter_key=['grant', 'status'], filter_value=['mrsec14', 'finalized'], begin_date='2020-09-01', end_date='2020-12-31')
This would return all appointments on the grant 'mrsec14' with status 'finalized' that are valid on/during any dates from 2020-09-01 to 2020-12-31.
>>> collect_appts(people, filter_key=['grant', 'grant'], filter_value=['mrsec14', 'dmref19'])
This would return all appointments on the grants 'mrsec14' and 'dmref19' irrespective of their dates.
- regolith.tools.collection_str(collection, keys=None)[source]¶
Builds a string from the values of the given fields for every document in the collection
- Parameters:
collection (generator) – The collection containing the documents
keys (list, optional) – The name of the fields to return from the search. Defaults to none in which case only the id is returned
- Returns:
A str of all the values
- Return type:
str
- regolith.tools.compound_dict(doc, li)[source]¶
Recursive function that collects all the strings from a document that is a dictionary
- Parameters:
doc dict – The specific document we are traversing
li – The recursive list that holds all the strings
- Returns:
The strings that make up the nested attributes of this object
- Return type:
list of strings
- regolith.tools.compound_list(doc, li)[source]¶
Recursive function that collects all the strings from a document that is a list
- Parameters:
doc list – The specific document we are traversing
li – The recursive list that holds all the strings
- Returns:
The strings that make up the nested attributes of this list
- Return type:
list of strings
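The two recursive collectors above can be illustrated with a standalone sketch. This is not regolith's implementation: the function names are hypothetical, and ignoring non-string scalars is an assumption made here.

```python
def collect_strings_from_dict(doc, li):
    """Append every string found in a (possibly nested) dict to li."""
    for value in doc.values():
        _collect(value, li)
    return li


def collect_strings_from_list(doc, li):
    """Append every string found in a (possibly nested) list to li."""
    for item in doc:
        _collect(item, li)
    return li


def _collect(value, li):
    # Strings are kept; dicts and lists are traversed recursively.
    if isinstance(value, str):
        li.append(value)
    elif isinstance(value, dict):
        collect_strings_from_dict(value, li)
    elif isinstance(value, list):
        collect_strings_from_list(value, li)
    # Assumption: numbers, None and other scalars are ignored.


person = {
    "name": "Jane Doe",
    "education": [{"degree": "PhD", "year": 2010}, "thesis note"],
    "office": {"room": "101"},
}
strings = collect_strings_from_dict(person, [])
```

The accumulator list is passed down through every call, which is why both functions take `li` as an argument.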
- regolith.tools.create_repo(destination_id, token_info_id, rc)[source]¶
Creates a repo at the target destination
Tries to fail gracefully if the repo information and token are not defined
- Parameters:
- destination_id - string
the id of the target repo information document
- token_info_id - string
the id for the token info document (e.g. ‘priv_token’)
- rc - run control object
the run control object that should contain rc.repos and rc.tokens docs
- Returns:
A success message (“repo target_repo has been created in talks”) if the repo is successfully created in target_repo.
Warning/setup messages if unsuccessful (or if the repo info or token are not valid).
- regolith.tools.dereference_institution(input_record, institutions, verbose=False)[source]¶
Tool for replacing placeholders for institutions with the actual institution data. Note that the replacement is done inplace
- Parameters:
input_record (dict) – The record to dereference
institutions (iterable of dicts) – The institutions
- Return type:
nothing
- regolith.tools.document_by_value(documents, address, value)[source]¶
Get a specific document by one of its values
- Parameters:
documents (generator) – Generator which yields the documents
address (str or tuple) – The address of the data in the document
value (any) – The expected value for the document
- Returns:
The first document which matches the request
- Return type:
dict
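A minimal sketch of the lookup described above, assuming a string address names a top-level key and a tuple address names a path through nested dicts. The name `document_by_value_sketch` and the choice to return None when nothing matches are assumptions, not regolith's documented behavior.

```python
_MISSING = object()  # sentinel for "this path does not exist in the doc"


def document_by_value_sketch(documents, address, value):
    """Return the first document whose value at `address` equals `value`."""
    if isinstance(address, str):
        address = (address,)
    for doc in documents:
        current = doc
        for key in address:
            # Stop descending as soon as the path is not present.
            if not isinstance(current, dict) or key not in current:
                current = _MISSING
                break
            current = current[key]
        if current is not _MISSING and current == value:
            return doc
    return None  # assumption: no match yields None


docs = [
    {"_id": "a", "contact": {"email": "a@example.org"}},
    {"_id": "b", "contact": {"email": "b@example.org"}},
]
found = document_by_value_sketch(iter(docs), ("contact", "email"), "b@example.org")
```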
- regolith.tools.fallback(cond, backup)[source]¶
Decorator for returning the object if cond is true and a backup if cond is false.
- regolith.tools.filter_employment_for_advisees(peoplecoll, begin_period, status, advisor, now=None)[source]¶
Filter people to get advisees since begin_period
- Parameters:
peoplecoll (list of dicts) – The people collection
begin_period (date) – Only select advisees who were active after this date (i.e., their end date is after begin_period)
status (str) – the status of the person in the group to filter for, e.g., ms, phd, postdoc
- regolith.tools.filter_grants(input_grants, names, pi=True, reverse=True, multi_pi=False)[source]¶
Filter grants by those involved
- Parameters:
input_grants (list of dict) – The grants to filter
names (set of str) – The authors to be filtered against
pi (bool, optional) – If True add the grant amount to that person’s total amount
reverse (bool, optional) – If True reverse the order, defaults to True
multi_pi (bool, optional) – If True compute sub-awards for multi PI grants, defaults to False
- regolith.tools.filter_presentations(people, presentations, institutions, target, types=None, since=None, before=None, statuses=None)[source]¶
- regolith.tools.filter_projects(projects, people, reverse=False, active_only=False, group=None, ptype=None)[source]¶
Filter projects by the author(s)
- Parameters:
projects (list of dict) – The projects to filter
people (set of list of str) – The people to be filtered against
reverse (bool, optional) – If True reverse the order, defaults to False
since (date, optional) – The date after which a highlight must be for a project to be returned, defaults to None
before (date, optional) – The date before which a highlight must be for a project to be returned, defaults to None
active_only (bool, optional) – Only active projects will be returned if True, defaults to False
group (str, optional) – Only projects from this group will be returned if specified, otherwise projects from all groups will be returned, defaults to None
ptype (str, optional) – The type of the project to filter for, such as ossoftware for open source software, defaults to None
- regolith.tools.filter_publications(citations, authors, reverse=False, bold=True, since=None, before=None, ackno=False, grants=None)[source]¶
Filter publications by the author(s)/editor(s)
- Parameters:
citations (list of dict) – The publication citations
authors (set of str) – The authors to be filtered against
reverse (bool, optional) – If True reverse the order, defaults to False
bold (bool, optional) – If True put latex bold around the author(s) in question
since (date, optional) – The date after which papers must have been published
before (date, optional) – The date before which papers must have been published
ackno (bool) – Move the acknowledgement statement to note so that it is displayed in the publication list
grants (string or list of strings, optional) – The grant or grants to filter over
- regolith.tools.fragment_retrieval(coll, fields, fragment, case_sensitive=False)[source]¶
Retrieves a list of all documents from the collection where the fragment appears in any one of the given fields
- Parameters:
coll (generator) – The collection containing the documents
fields (iterable) – The fields of each document to check for the fragment
fragment – The value to compare against to find the documents of interest
case_sensitive (Bool) – When true will match case (Default = False)
- Returns:
A list of documents (that are dicts)
- Return type:
list
Examples
>>> fragment_retrieval(people, ['aka', 'name'], 'pi_name', case_sensitive = False)
This would get all people for which either the alias or the name included the substring pi_name.
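The substring search can be sketched as follows. This is an illustration only: the name `fragment_retrieval_sketch` is hypothetical, and treating a list-valued field (such as several aliases) as a set of candidates is an assumption.

```python
def fragment_retrieval_sketch(coll, fields, fragment, case_sensitive=False):
    """Return every doc in which `fragment` occurs in any of `fields`."""

    def norm(text):
        text = str(text)
        return text if case_sensitive else text.lower()

    needle = norm(fragment)
    matches = []
    for doc in coll:
        for field in fields:
            value = doc.get(field)
            if value is None:
                continue
            # Assumption: a field holds either one value or a list of values.
            candidates = value if isinstance(value, (list, tuple)) else [value]
            if any(needle in norm(c) for c in candidates):
                matches.append(doc)
                break  # one matching field is enough for this doc
    return matches


people = [
    {"_id": "mdelgado", "name": "Maria Delgado", "aka": ["M. Delgado"]},
    {"_id": "tkim", "name": "Tae Kim", "aka": ["T. Kim"]},
]
hits = fragment_retrieval_sketch(people, ["aka", "name"], "delg")
```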
- regolith.tools.fuzzy_retrieval(documents, sources, value, case_sensitive=True)[source]¶
Retrieve a document from the documents where value is compared against multiple potential sources
- Parameters:
documents (generator) – The documents
sources (iterable) – The potential data sources
value – The value to compare against to find the document of interest
case_sensitive (Bool) – When true will match case (Default = True)
- Returns:
The document
- Return type:
dict
Examples
>>> fuzzy_retrieval(people, ['aka', 'name'], 'pi_name', case_sensitive = False)
This would get the person entry for which either the alias or the name was pi_name.
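In contrast to the substring search of fragment_retrieval, the description here implies a whole-value comparison against several possible sources. A hedged sketch, assuming list-valued sources are flattened and that None is returned when nothing matches (the name `fuzzy_retrieval_sketch` is hypothetical):

```python
def fuzzy_retrieval_sketch(documents, sources, value, case_sensitive=True):
    """Return the first doc in which `value` equals any entry of `sources`."""

    def norm(text):
        text = str(text)
        return text if case_sensitive else text.lower()

    target = norm(value)
    for doc in documents:
        candidates = []
        for source in sources:
            found = doc.get(source)
            if found is None:
                continue
            # Assumption: a source may be a single value or a list of values.
            if isinstance(found, (list, tuple)):
                candidates.extend(found)
            else:
                candidates.append(found)
        if target in {norm(c) for c in candidates}:
            return doc
    return None  # assumption: no match yields None


people = [
    {"_id": "mdelgado", "name": "Maria Delgado", "aka": ["M. Delgado"]},
    {"_id": "tkim", "name": "Tae Kim", "aka": ["T. Kim"]},
]
person = fuzzy_retrieval_sketch(people, ["aka", "name"], "m. delgado", case_sensitive=False)
```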
- regolith.tools.get_formatted_crossref_reference(doi)[source]¶
Given a doi, return the full reference and the date of the reference from the Crossref REST API
- Parameters:
doi str – the doi of the digital object to pull from Crossref
- Returns:
ref str – the nicely formatted reference including title
ref_date datetime.date – the date of the reference
Returns None, None if the article cannot be found given the doi
- regolith.tools.get_person_contact(name, people_coll, contacts_coll)[source]¶
Return a person document if found in either people or contacts collections
If the person is found in the people collection this person is returned. If not found in people but found in contacts, the person found in contacts is returned. If the person is not found in either collection, None is returned
- Parameters:
name (str) – The name or id of the person to look for
people_coll (collection (list of dicts)) – The people collection
contacts_coll (collection (list of dicts)) – The contacts collection
- Returns:
person – The found person document
- Return type:
dict
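The people-first, contacts-second lookup order described above can be sketched like this. The name `find_person_sketch` is hypothetical, and matching only on an exact `_id` or `name` is a simplifying assumption; the real function may match more loosely.

```python
def find_person_sketch(name, people_coll, contacts_coll):
    """Look in people first, then in contacts; return None if absent."""
    for coll in (people_coll, contacts_coll):
        for doc in coll:
            # Assumption: an exact match on either the id or the full name.
            if name in (doc.get("_id"), doc.get("name")):
                return doc
    return None


people = [{"_id": "mdelgado", "name": "Maria Delgado"}]
contacts = [
    {"_id": "tkim", "name": "Tae Kim"},
    {"_id": "mdelgado", "name": "Maria Delgado", "note": "contact copy"},
]
from_people = find_person_sketch("mdelgado", people, contacts)
from_contacts = find_person_sketch("Tae Kim", people, contacts)
```

Because the people collection is searched first, a person present in both collections is returned from people.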
- regolith.tools.get_pi_id(rc)[source]¶
Gets the database id of the group PI
- Parameters:
rc (runcontrol object) – The runcontrol object. It must contain the ‘groups’ and ‘people’ collections in the needed databases
- Return type:
The database ‘_id’ of the group PI
- regolith.tools.get_tags(coll)[source]¶
Given a collection with a tags field, returns the set of tags as a list
The tags field is expected to be a string with comma or space separated tags. get_tags splits the tags and returns the set of unique tags as a list of strings.
- Parameters:
coll collection – the collection
- Return type:
the set of all tags as a list
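A small sketch of the tag splitting described above, assuming each document stores its tags under a `tags` key. The name `get_tags_sketch` is hypothetical, and sorting the result is an assumption made here only to keep the output deterministic.

```python
import re


def get_tags_sketch(coll):
    """Collect the unique tags from comma- or space-separated tag strings."""
    tags = set()
    for doc in coll:
        # Split on any run of commas and/or whitespace; drop empty pieces.
        pieces = re.split(r"[,\s]+", doc.get("tags", ""))
        tags.update(piece for piece in pieces if piece)
    return sorted(tags)


coll = [
    {"_id": "p1", "tags": "pdf, xrd"},
    {"_id": "p2", "tags": "xrd nmr"},
    {"_id": "p3"},  # a document without tags contributes nothing
]
tags = get_tags_sketch(coll)
```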
- regolith.tools.get_target_repo_info(target_repo_id, repos)[source]¶
checks if repo information is defined and valid in rc
- Parameters:
- target_repo_id - string
the id of the doc with the target repo information
- repos - list
the list of repos. A repo must have a name, a url and a params kwarg.
- Returns:
The target repo document, or False if it is not present or its information is not properly formulated
- regolith.tools.get_target_token(target_token_id, tokens)[source]¶
Checks if API authentication token is defined and valid in rc
- Parameters:
- target_token_id - string
the name of the personal access token (defined in rc)
rc - run control object
- Returns:
The token if the token exists and False if not
- regolith.tools.gets(seq, key, default=None)[source]¶
Gets a key from every element of a sequence if possible.
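As a rough illustration, assuming the elements are dicts and that the result is produced lazily (both are assumptions, and `gets_sketch` is a hypothetical name):

```python
def gets_sketch(seq, key, default=None):
    """Yield element[key] for each element, or `default` when it is absent."""
    for element in seq:
        yield element.get(key, default)


docs = [{"_id": "a", "year": 2020}, {"_id": "b"}]
years = list(gets_sketch(docs, "year"))
years_with_default = list(gets_sketch(docs, "year", default=0))
```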
- regolith.tools.google_cal_auth_flow()[source]¶
First time authentication, this function opens a window to request user consent to use google calendar API, and then returns a token
- regolith.tools.grant_burn(grant, appts, begin_date=None, end_date=None)[source]¶
Retrieves the total burn of a grant over an interval of time by integrating over all appointments made on the grant.
- Parameters:
grant (dict) – The grant object whose burn needs to be retrieved
appts (collection (list of dicts), dict) – The collection of appointments made on assorted grants
begin_date (datetime, string, optional) – The start date of the interval of time to retrieve the grant burn for, either a date object or a string in YYYY-MM-DD format. Defaults to the begin_date of the grant.
end_date (datetime, string, optional) – The end date of the interval of time to retrieve the grant burn for, either a date object or a string in YYYY-MM-DD format. Defaults to the end_date of the grant.
- Returns:
A dictionary whose keys are the dates and whose values are dicts containing the corresponding grant amounts on that date
- Return type:
dict
Examples
>>> grant_burn(mygrant, myappts, begin_date="2020-09-01", end_date="2020-09-03")
returns
{datetime.date(2020, 9, 1): {'student_days': 5.0, 'postdoc_days': 12.0, 'ss_days': 20.0},
 datetime.date(2020, 9, 2): {'student_days': 4.0, 'postdoc_days': 11.5, 'ss_days': 15.0},
 datetime.date(2020, 9, 3): {'student_days': 3.0, 'postdoc_days': 11.0, 'ss_days': 10.0}}
- regolith.tools.group(db, by)[source]¶
Group the documents in the database according to the value of doc[by].
- Parameters:
db (iterable) – The database of documents.
by (basestring) – The key to group the documents.
- Returns:
grouped – A dictionary mapping the feature value of group to the list of docs. All docs in the same generator have the same value of doc[by].
- Return type:
dict
Examples
Here, we use a tuple of dicts as an example of the database.
>>> db = ({"k": "v0"}, {"k": "v1"}, {"k": "v0"})
>>> group(db, "k")
This will return
{"v0": [{"k": "v0"}, {"k": "v0"}], "v1": [{"k": "v1"}]}
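The grouping itself can be sketched in a few lines. This is a standalone illustration of the documented behavior rather than regolith's code; `group_sketch` is a hypothetical name and skipping documents that lack the key is an assumption.

```python
from collections import defaultdict


def group_sketch(db, by):
    """Map each value of doc[by] to the list of docs that share it."""
    grouped = defaultdict(list)
    for doc in db:
        # Assumption: docs without the key are skipped rather than raising.
        if by in doc:
            grouped[doc[by]].append(doc)
    return dict(grouped)


db = ({"k": "v0"}, {"k": "v1"}, {"k": "v0"})
grouped = group_sketch(db, "k")
```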
- regolith.tools.group_member_employment_start_end(person, grpname)[source]¶
Get start and end dates of group member employment
- Parameters:
person dict – The person whose dates we want
grpname – The code for the group we want the dates of employment from
- Returns:
The employment periods, with person id, begin and end dates
- Return type:
list of dicts
- regolith.tools.group_member_ids(ppl_coll, grpname)[source]¶
Get a list of all group member ids
- Parameters:
ppl_coll (collection (list of dicts)) – The people collection that should contain the group members
grpname (string) – The id of the group in groups.yml
- Returns:
The set of ids of the people in the group
- Return type:
set
Notes
- Groups that are being tracked are listed in the groups.yml collection with a name and an id.
- People are in a group during an educational or employment period.
- To assign a person to a tracked group during one such period, add a “group” key to that education/employment item with a value that is the group id.
- This function takes the group id that is passed and searches the people collection for all people that have been assigned to that group in some period of time and returns a list of
- regolith.tools.is_fully_appointed(person, begin_date, end_date)[source]¶
Checks if a collection of appointments for a person is valid and fully loaded for a given interval of time
- Parameters:
person (dict) – The person whose appointments need to be checked
begin_date (datetime, string, optional) – The start date of the interval of time to check appointments for
end_date (datetime, string, optional) – The end date of the interval of time to check appointments for
- Returns:
True if the person is fully appointed and False if not
- Return type:
bool
Examples
>>> appts = [{"begin_year": 2017, "begin_month": 6, "begin_day": 1,
...           "end_year": 2017, "end_month": 6, "end_day": 15,
...           "grant": "grant1", "loading": 1.0, "type": "pd"},
...          {"begin_year": 2017, "begin_month": 6, "begin_day": 20,
...           "end_year": 2017, "end_month": 6, "end_day": 30,
...           "grant": "grant2", "loading": 1.0, "type": "pd"}]
>>> aejaz = {"name": "Adiba Ejaz", "_id": "aejaz", "appointments": appts}
>>> is_fully_appointed(aejaz, "2017-06-01", "2017-06-30")
In this case, we have an invalid loading from 2017-06-16 to 2017-06-19 hence it would return False and print “appointment gap for aejaz from 2017-06-16 to 2017-06-19”.
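The gap in this example can be reproduced with a day-by-day scan. The sketch below is an assumption about how such a check might work, not regolith's implementation: `loading_gaps_sketch` is a hypothetical helper, and it only flags days whose summed loading falls below 1.0.

```python
from datetime import date, timedelta


def appt_span(appt):
    """Return the (begin, end) dates of one appointment dict."""
    begin = date(appt["begin_year"], appt["begin_month"], appt["begin_day"])
    end = date(appt["end_year"], appt["end_month"], appt["end_day"])
    return begin, end


def loading_gaps_sketch(appts, begin, end):
    """Return (first_day, last_day) runs where total loading is below 1.0."""
    spans = [(appt_span(a), a["loading"]) for a in appts]
    gaps, run_start = [], None
    day = begin
    while day <= end:
        total = sum(load for (b, e), load in spans if b <= day <= e)
        if total < 1.0:
            if run_start is None:
                run_start = day  # a new under-loaded run begins
        elif run_start is not None:
            gaps.append((run_start, day - timedelta(days=1)))
            run_start = None
        day += timedelta(days=1)
    if run_start is not None:
        gaps.append((run_start, end))  # the run reaches the interval end
    return gaps


appts = [
    {"begin_year": 2017, "begin_month": 6, "begin_day": 1,
     "end_year": 2017, "end_month": 6, "end_day": 15,
     "grant": "grant1", "loading": 1.0, "type": "pd"},
    {"begin_year": 2017, "begin_month": 6, "begin_day": 20,
     "end_year": 2017, "end_month": 6, "end_day": 30,
     "grant": "grant2", "loading": 1.0, "type": "pd"},
]
gaps = loading_gaps_sketch(appts, date(2017, 6, 1), date(2017, 6, 30))
fully_appointed = not gaps
```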
- regolith.tools.key_value_pair_filter(collection, arguments)[source]¶
Retrieves all documents from the collection in which each given field contains its accompanying substring
- Parameters:
collection (generator) – The collection containing the documents
arguments (list) – The name of the fields to look for and their accompanying substring
- Returns:
The collection containing the elements that satisfy the search criteria
- Return type:
generator
Examples
>>> key_value_pair_filter(people, ['name', 'ab', 'position', 'professor'])
This would get all people for which their name contains the string ‘ab’ and whose position is professor and return them
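The flat `arguments` list alternates field names and substrings. A sketch of how such a filter could pair them up follows; the name `key_value_filter_sketch` and the case-insensitive matching are assumptions.

```python
def key_value_filter_sketch(collection, arguments):
    """Yield docs in which every (field, substring) pair matches."""
    # ['name', 'ab', 'position', 'professor'] becomes
    # [('name', 'ab'), ('position', 'professor')].
    pairs = list(zip(arguments[::2], arguments[1::2]))
    for doc in collection:
        # Assumption: a case-insensitive substring test on str(value).
        if all(sub.lower() in str(doc.get(field, "")).lower()
               for field, sub in pairs):
            yield doc


people = [
    {"_id": "astone", "name": "Abigail Stone", "position": "professor"},
    {"_id": "atran", "name": "Abel Tran", "position": "postdoc"},
    {"_id": "slee", "name": "Sam Lee", "position": "professor"},
]
matched = list(key_value_filter_sketch(people, ["name", "ab", "position", "professor"]))
```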
- regolith.tools.latex_safe(s, url_check=True, wrapper='url')[source]¶
Make string latex safe
- Parameters:
s (str)
url_check (bool, optional) – If True check for URLs and wrap them, if False check for URL but don’t wrap, defaults to True
wrapper (str, optional) – The wrapper for wrapping urls defaults to url
- regolith.tools.make_bibtex_file(pubs, pid, person_dir='.')[source]¶
Make a bibtex file given the publications
- Parameters:
pubs (list of dict) – The publications
pid (str) – The person id
person_dir (str, optional) – The person’s directory
- regolith.tools.merge_collections_all(a, b, target_id)[source]¶
Merge two collections into a single merged collection
For keys that are in both collections, the value in b will be kept
- Parameters:
a the inferior collection (will lose values of shared keys)
b the superior collection (will keep values of shared keys)
target_id str the name of the key used in b to dereference ids in a
- Returns:
the combined collection. Note that it returns a collection containing
all items from a and b with the items dereferenced in b merged with the
dereferenced items in a.
see also merge_collections_intersect that returns a collection of just the items that are referenced
in both
Examples
>>> grants = merge_collections_all(self.gtx["proposals"], self.gtx["grants"], "proposal_id")
This would merge all entries in the proposals collection with entries in the grants collection for which “_id” in proposals has the value of “proposal_id” in grants, returning also unchanged any other entries that are not linked.
- regolith.tools.merge_collections_intersect(a, b, target_id)[source]¶
Merge two collections such that just the intersection is returned
For shared keys that are in both collections, the value in b will be kept
- Parameters:
a the inferior collection (will lose values of shared keys)
b the superior collection (will keep values of shared keys)
target_id str the name of the key used in b to dereference ids in a
- Returns:
the combined collection. Note that it returns a collection only containing
merged items from a and b that are dereferenced in b, i.e., the merged
intersection.
see also merge_collections_all that returns all items in a, b and the intersection
and merge_collections_superior that returns all items in b and the intersection
Examples
>>> grants = merge_collections_intersect(self.gtx["proposals"], self.gtx["grants"], "proposal_id")
This would merge all entries in the proposals collection with entries in the grants collection for which “_id” in proposals has the value of “proposal_id” in grants, returning just those items that have the dereference
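A sketch of the intersection merge, assuming both collections are lists of dicts and that documents in a are identified by “_id”. The name `merge_intersect_sketch` is hypothetical and the code is an illustration, not regolith's implementation.

```python
def merge_intersect_sketch(a, b, target_id):
    """Merge each doc in b with the doc in a that it references."""
    a_by_id = {doc["_id"]: doc for doc in a}
    merged = []
    for b_doc in b:
        a_doc = a_by_id.get(b_doc.get(target_id))
        if a_doc is not None:
            # b is the superior collection, so its values overwrite a's.
            merged.append({**a_doc, **b_doc})
    return merged


proposals = [
    {"_id": "p1", "title": "Draft title", "amount": 100},
    {"_id": "p2", "title": "Unfunded"},
]
grants = [
    {"_id": "g1", "proposal_id": "p1", "title": "Funded title"},
    {"_id": "g2", "title": "No linked proposal"},
]
merged = merge_intersect_sketch(proposals, grants, "proposal_id")
```

Only the linked pair survives: the unlinked proposal and the unlinked grant are both left out.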
- regolith.tools.merge_collections_superior(a, b, target_id)[source]¶
merge two collections into a single merged collection
for keys that are in both collections, the value in b will be kept
- Parameters:
a the inferior collection (will lose values of shared keys)
b the superior collection (will keep values of shared keys)
target_id str the name of the key used in b to dereference ids in a
- Returns:
the combined collection. Note that it returns a collection containing
all items from b, with the items dereferenced in b merged with the
dereferenced items in a. Items in a that are not referenced in b are not returned.
see also merge_collections_intersect that returns a collection of just the items that are referenced
in both
Examples
>>> grants = merge_collections_superior(self.gtx["proposals"], self.gtx["grants"], "proposal_id")
This would merge all entries in the proposals collection with entries in the grants collection for which “_id” in proposals has the value of “proposal_id” in grants, returning also unchanged any other entries in grants that are not linked.
- regolith.tools.month_and_year(m=None, y=None)[source]¶
Creates a string from month and year data, if available.
- regolith.tools.number_suffix(number)[source]¶
Returns the suffix that adjectivises a number (st, nd, rd, th)
- Parameters:
number (integer) – The number. If number is not an integer, returns an empty string
- Returns:
suffix – The suffix (st, nd, rd, th)
- Return type:
string
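The suffix rule has one well-known exception (11, 12 and 13 take “th”), which a short sketch makes explicit. `number_suffix_sketch` is a hypothetical name, and excluding booleans is an assumption made here.

```python
def number_suffix_sketch(number):
    """Return 'st', 'nd', 'rd' or 'th' for an integer, '' otherwise."""
    # bool is a subclass of int, so it is excluded explicitly (assumption).
    if not isinstance(number, int) or isinstance(number, bool):
        return ""
    # 11, 12 and 13 take 'th' even though they end in 1, 2 and 3.
    if 10 <= abs(number) % 100 <= 20:
        return "th"
    return {1: "st", 2: "nd", 3: "rd"}.get(abs(number) % 10, "th")


labels = [f"{n}{number_suffix_sketch(n)}" for n in (1, 2, 3, 4, 11, 21, 112)]
```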
- regolith.tools.print_task(task_list, stati, index=True)[source]¶
Print tasks in a nice format.
- Parameters:
task_list (list) – A list of tasks that will be printed.
stati (list) – Filter status of the task
- regolith.tools.remove_duplicate_docs(coll, key)[source]¶
find all docs where the target key has the same value and remove duplicates
The doc found first will be kept and subsequent docs will be removed
- Parameters:
coll iterable of dicts – the list of documents
key string – the key that will be used to compare
- Return type:
The list of docs with duplicates (as described above) removed
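The keep-the-first rule can be sketched as below. `remove_duplicates_sketch` is a hypothetical name; the sketch assumes the compared values are hashable and keeps documents that lack the key.

```python
def remove_duplicates_sketch(coll, key):
    """Keep the first doc for each value of doc[key]; drop later ones."""
    seen = set()
    unique = []
    for doc in coll:
        if key not in doc:
            # Assumption: docs without the key cannot be duplicates.
            unique.append(doc)
            continue
        if doc[key] not in seen:
            seen.add(doc[key])
            unique.append(doc)
    return unique


docs = [
    {"_id": "a", "n": 1},
    {"_id": "b", "n": 2},
    {"_id": "a", "n": 3},  # same _id as the first doc, so it is removed
]
deduplicated = remove_duplicates_sketch(docs, "_id")
```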
- regolith.tools.search_collection(collection, arguments, keys=None)[source]¶
Retrieves the documents from the collection in which each given field contains its accompanying substring, returning only the requested keys
- Parameters:
collection (generator) – The collection containing the documents
arguments (list) – The name of the fields to look for and their accompanying substring
keys (list, optional) – The name of the fields to return from the search. Defaults to none in which case only the id is returned
- Returns:
The collection containing the elements that satisfy the search criteria
- Return type:
generator
Examples
>>> search_collection(people, ['name', 'ab', 'position', 'professor'], ['_id', 'name'])
This would get all people for which their name contains the string ‘ab’ and whose position is professor. It would return the name and id of the valid entries
- regolith.tools.update_schemas(default_schema, user_schema)[source]¶
Merges the user schema into the default schema recursively and returns the merged schema. The default schema and user schema are not modified during the merge.
- Parameters:
default_schema (dict) – The default schema.
user_schema (dict) – The user defined schema.
- Returns:
updated_schema – The merged schema.
- Return type:
dict
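A recursive merge that leaves both inputs untouched can be sketched as follows. This is an illustration under the assumption that user values win on conflict; `merge_schemas_sketch` is a hypothetical name, not regolith's function.

```python
import copy


def merge_schemas_sketch(default_schema, user_schema):
    """Return a new dict with user_schema merged into default_schema."""
    merged = copy.deepcopy(default_schema)  # never mutate the inputs
    for key, value in user_schema.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge them one level deeper.
            merged[key] = merge_schemas_sketch(merged[key], value)
        else:
            # Assumption: the user value replaces the default value.
            merged[key] = copy.deepcopy(value)
    return merged


default_schema = {"people": {"name": {"type": "string", "required": True}}}
user_schema = {"people": {"name": {"required": False}, "nick": {"type": "string"}}}
updated_schema = merge_schemas_sketch(default_schema, user_schema)
```

Copying before merging is what keeps the default schema reusable across several merges.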
- regolith.tools.validate_meeting(meeting, date)[source]¶
Validates a meeting by checking if it has a journal club doi, a presentation link, and a presentation title. This function returns nothing if the meeting is valid; otherwise it raises a ValueError.
- Parameters:
meeting (dict) – The meeting object that needs to be validated
date (datetime object) – The date we want to use to see if a meeting has happened or not