Read references/citations from files

read_ref() reads references/citations from files. When more than one file is given, the function binds the references from all of the files together.

At the moment, read_ref() works only with PubMed and RIS (Research Information Systems) formats.

read_ref(file = file.choose(), lookup = NULL, sep = " | ")

Arguments

file: (optional) a character object indicating PubMed, RIS or ZIP file names. If not assigned, a dialog window opens so the user can search for and select a file (only for interactive sessions).
lookup: (optional) a string indicating the database provider from which the reference/citation file was exported, or a data.frame object containing instructions on how to rename and rearrange the tags/variables. See the Details section to learn more (default: NULL).
sep: (optional) a string indicating the separator to be used when combining values with the same tag/variable (default: " | ").

Value

A data.frame object with the citations/references found in the file argument.

Details

`sep` argument

The sep argument is only used if the reference/citation files have duplicated tags. In those cases, read_ref() joins all values that share the same tag, using the sep value as the separator.

Example:

read_ref(file, sep = " | ")

AU  - Stoykova B
AU  - Slota C
AU  - Doward L

Joined tags: AU - Stoykova B | Slota C | Doward L
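The joining step can be reproduced in base R. The snippet below is only a sketch of the idea, not the code read_ref() runs internally; the tags and values vectors and the title value are made up for illustration.

```r
# Tags and values as they might appear in a single record (illustrative data).
tags   <- c("AU", "AU", "AU", "TI")
values <- c("Stoykova B", "Slota C", "Doward L", "An example title")

sep <- " | "

# Group the values by tag and collapse each group with the separator.
joined <- vapply(split(values, tags), paste, character(1), collapse = sep)

joined[["AU"]]
```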

`lookup` argument

read_ref() allows you to rename and rearrange the tags/variables from reference files. You can create your own settings for that task or use the settings provided by the refstudio package. Use lookup = NULL (default) to preserve the original tag names.

To use the settings from the refstudio package, assign one of the following values to the lookup argument, according to the database provider from which the reference file was exported. You can see the refstudio package settings in refstudio::ris_tags or at https://bit.ly/3efSgHr.

"general": A general lookup table for RIS files.
"apa": for APA (American Psychological Association).
"ebsco": for EBSCO (Elton Bryson Stephens Company).
"embase": for Embase (Excerpta Medica dataBASE).
"pubmed": for PubMed.
"scopus": for Scopus.
"wos": for Web of Science.

To use your own settings, assign a data.frame object to the lookup argument. This data.frame needs to have the 3 columns below:

tag: a character column indicating the tag/variable.
order: an integer column, with values greater than 0, indicating the column order.
name: a character column indicating the name to replace the tag/variable indicated in the tag column.
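As an illustration, a lookup table with these three columns could be built as follows. The tags are standard RIS tags, but the chosen names, the object name my_lookup and the file name in the final comment are hypothetical.

```r
# A custom lookup table: one row per tag to keep, rename and reorder.
my_lookup <- data.frame(
  tag   = c("TI", "AU", "PY"),          # tags as they appear in the file
  order = c(1L, 2L, 3L),                # position of each column in the output
  name  = c("title", "author", "year"), # replacement names (hypothetical)
  stringsAsFactors = FALSE
)

str(my_lookup)

# With refstudio attached, the table would then be passed as:
# read_ref("my_refs.ris", lookup = my_lookup)  # "my_refs.ris" is a placeholder
```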

Note that read_ref() cleans the lookup data frame before using it. This process involves:

Removing rows with NA values found in the tag, order and/or name columns.
Removing duplicated tag values, keeping only the last occurrence of each tag after the table is arranged by the order column.
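The two cleaning steps can be sketched in base R as shown below. This is one reading of the description above, not the package's actual implementation, and the lookup values are made up for illustration.

```r
# A deliberately messy lookup table (illustrative data).
lookup <- data.frame(
  tag   = c("AU", "TI", "AU", NA, "PY"),
  order = c(3L, 1L, 2L, 4L, NA),
  name  = c("creator", "title", "author", "journal", "year"),
  stringsAsFactors = FALSE
)

# Step 1: drop rows with NA in the tag, order or name columns.
clean <- lookup[stats::complete.cases(lookup[, c("tag", "order", "name")]), ]

# Step 2: arrange by order, then keep the last row of each duplicated tag.
clean <- clean[order(clean$order), ]
clean <- clean[!duplicated(clean$tag, fromLast = TRUE), ]
rownames(clean) <- NULL

clean
```

Here the two "AU" rows collapse into the one with the highest order value, and the rows with a missing tag or order are discarded.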

Also, if one or more tag values have the same name value, read_ref() will combine them with the separator assigned in the sep argument.
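A rough base R sketch of this merging behavior follows. It assumes a single record stored as a named character vector, which is a simplification chosen for illustration; the record contents and the lookup names are hypothetical.

```r
# One record as tag/value pairs (illustrative data).
record <- c(AU = "Stoykova B", A1 = "Slota C", TI = "An example title")

# Two different tags ("AU" and "A1") are mapped to the same name.
lookup <- data.frame(
  tag   = c("TI", "AU", "A1"),
  order = 1:3,
  name  = c("title", "author", "author"),
  stringsAsFactors = FALSE
)

sep <- " | "

# Translate each tag in the record to its new name.
new_names <- lookup$name[match(names(record), lookup$tag)]

# Collapse the values that share a name, keeping the lookup's name order.
groups   <- factor(new_names, levels = unique(lookup$name))
combined <- vapply(split(unname(record), groups), paste, character(1),
                   collapse = sep)

combined
```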

Reading ZIP files

read_ref() can also read ZIP-compressed files. Just assign the ZIP file to the file argument.
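For comparison, extracting an archive by hand could look like the sketch below. The helper list_zip_files() is hypothetical and not part of refstudio, and this is not necessarily how read_ref() handles ZIP files internally; it only shows the manual step that passing the ZIP file directly saves you.

```r
# Hypothetical helper: extract a ZIP archive into a temporary directory
# and return the full paths of the extracted files.
list_zip_files <- function(zip_path) {
  exdir <- tempfile("refs_")
  dir.create(exdir)
  utils::unzip(zip_path, exdir = exdir)
  list.files(exdir, full.names = TRUE, recursive = TRUE)
}

# Usage, with a placeholder file name:
# files <- list_zip_files("my_refs.zip")
```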

Examples

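if (FALSE) {
if (requireNamespace("utils", quietly = TRUE)) {
    # Select the bundled raw data file whose name contains "_pubmed_".
    file <- raw_data()[grep("_pubmed_", raw_data())]
    file <- raw_data(file)

    # Read the file using the "pubmed" lookup settings and view the result.
    View(read_ref(file, "pubmed"))
}
}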