The Research Impact Evaluation System (RIES) allows a set of research outputs to be compared against a set of calculated benchmarks. It provides on-demand reporting for research institutions based on classification systems such as the ANZSRC fields of research (FoR) and ASCED fields of education (FoE), following the methodology planned for the Excellence in Research for Australia (ERA) 2023 exercise.
Excellence in Research for Australia (ERA) was a periodic assessment by the Australian Research Council (ARC), evaluating 42 Australian higher education providers (HEPs) across 236 fields of research. Institutions were ranked based on their research activity in each field, compared against local and world benchmarks to measure relative performance and assign ratings. This citation-focused assessment relied on the participating HEPs to self-report their affiliated research outputs, with each research output assigned and apportioned to up to three FoR codes. For citation-based disciplines, research outputs also had to be published in a journal included in the ARC journal list and indexed by a citation provider.
The Curtin Open Knowledge Initiative (COKI) at Curtin University collects and aggregates publication metadata from publicly available sources such as Crossref, Unpaywall, OpenCitations, and OpenAlex. These datasets are used for further analysis by the COKI team, with a focus on Open Access publication. Guided by published and proposed ERA methods and using journal-level metadata from the ERA 2023 Journal List, COKI developed RIES to deliver on-demand, ERA-like reporting for any set of research institutions against ANZSRC FoRs and ASCED FoEs. While RIES currently focuses on citation metrics, future modifications will explore different analysis methods and their impacts on performance rankings and ratings.
The code for the RIES workflows is available at the COKI RIES GitHub project.
RIES assesses relative research performance and assigns rankings by comparing citation metrics against calculated benchmarks. Following the methods from the ERA 2018 guidelines and proposed ERA 2023 methodology, RIES calculates average citations per paper for different groups (local, world, and high performing institutions). The benchmarks vary based on the institutions included in the data used to compute them.
Following the construction of the benchmarks, the RIES workflow proceeds to analysis. The analysis phase builds a subset of indicators that are visualised in the RIES Dashboards, selected under the ‘Metric’ option. Users can select various options, such as Classification (FoR or FoE), Metric (performance ratings, relative citation impact, and research outputs), Evaluation Years (year ranges of 1, 3, 5, 6, or 10 years), Assignment Method (automatic or institutional assignment of research outputs and FoRs), and Min Publications (a low-volume threshold).
In contrast to the ERA approach, which credits works based on affiliation at the time of the census (a “census” approach), the RIES automatic approach credits works based on affiliation at the time of publication (a “by-line” approach). The RIES workflow does not have access to high-resolution FoR apportionment data (submitted to the Australian Research Council by individual Higher Education Providers (HEPs)). In its absence, FoR assignments are instead inferred (inherited) from the containing journal, with apportionment distributed uniformly. This is the default Assignment Method option in the dashboard.
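For illustration, the following minimal sketch (Python, with hypothetical function and field names rather than the actual RIES schema) shows how the uniform apportionment described above can be derived from the FoR codes attached to a paper’s containing journal:

```python
from typing import Dict, List

def apportion_from_journal(journal_fors: List[str]) -> Dict[str, float]:
    """Uniformly distribute a paper's credit across its journal's FoR codes.

    `journal_fors` holds the (up to three) FoR codes attached to the
    containing journal in the ERA journal list; names are illustrative only.
    """
    if not journal_fors:
        return {}
    share = 1.0 / len(journal_fors)
    return {for_code: share for for_code in journal_fors}

# A journal mapped to two 4-digit FoRs gives each code half of the credit.
print(apportion_from_journal(["3101", "3107"]))  # {'3101': 0.5, '3107': 0.5}
```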
RIES is also able to ingest and analyse institutional outputs via a “census” approach, whereby institutions provide their own manually curated list of institutional outputs (where works are credited based on affiliation at the time of the census, with FoR apportionment data also supplied). These are displayed under the ‘Institutional’ option (select ‘Institutional’ from the ‘Assignment Method’ drop-down); select ‘Automatic’ to return to the default automatic view.
3.1 WORKFLOW
The basic workflow of RIES involves the aggregation of publicly available external datasets, the calculation of performance benchmarks, and the execution of multiple analysis streams that result in the visualisation of indicators in the RIES Dashboards (Figure 1).
Figure 1. The high-level RIES workflow.
3.2 RIES DATASETS
The first stage of the RIES workflow is to collect raw data from external sources, apply some minor transformations, and load the data into the RIES data warehouse. These raw data tables are then transformed into a set of seven core datasets.
Dataset | Role | Source |
---|---|---|
ISSNs | A mapping between ISSN and ISSN-L values | ISSN International Centre |
FoRs | A list of ANZSRC field of research codes used in ERA (2008 and 2020 available, 2020 used by default in the workflow) | Australian Bureau of Statistics |
FoEs | A list of ASCED field of education codes | Mapped to FoRs |
RORs | A list of institutional identifiers | Research Organization Registry |
HEPs | A list of Australian Higher Education Providers used in ERA | Australian Research Council |
Journals | A list of journals used in ERA (2018 and 2023 journal lists available, 2023 journal list used by default in the workflow) | Australian Research Council |
Papers | A set of publication metadata, indexed by DOI | COKI (containing data from Crossref Metadata, Unpaywall, OpenCitations, and OpenAlex) |
Table 1. Core datasets used in the RIES workflows
3.3 GROUPING AND BENCHMARKING
In order to assess relative performance and assign rankings and performance categories, the RIES workflow compares grouped citation metrics against benchmarks. Guided by published ERA methods from 2018 (ERA 2018 Evaluation Handbook) and 2023 (ERA 2023 Benchmarking and Rating Scale – Consultation Paper), the system constructs sets of benchmarks, each computing average citations per paper as the metric. The benchmarks differ in the set of institutions from which outputs are drawn to compute them: local, world, and high-performing institutions.
For detailed information on benchmark calculations, please see Appendix II.
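As a minimal sketch of such a benchmark calculation (assuming a simple in-memory list of paper records with citation counts, FoR apportionments, publication years, and institutional RORs; this is illustrative and not the RIES implementation, which operates on a data warehouse), the average citations per paper for a given grouping might be computed as follows:

```python
def average_citations_per_paper(papers, for_code, year, institutions=None):
    """Apportionment-weighted average citations per paper for one FoR and year.

    `papers` is an iterable of dicts with keys 'year', 'citations',
    'apportionment' (FoR code -> share) and 'ror'; the record layout is
    hypothetical. Restricting `institutions` to Australian HEP RORs yields
    a local benchmark, while leaving it as None draws on all institutions
    for a world benchmark.
    """
    weighted_citations = 0.0
    weighted_papers = 0.0
    for paper in papers:
        if paper["year"] != year:
            continue
        if institutions is not None and paper["ror"] not in institutions:
            continue
        share = paper["apportionment"].get(for_code, 0.0)
        weighted_citations += share * paper["citations"]
        weighted_papers += share
    return weighted_citations / weighted_papers if weighted_papers else None
```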
3.4 COMPILATION OF INDICATORS
Following the construction of the benchmarks, the RIES workflow proceeds to build a subset of ‘Metric’ indicators (calculated following ERA 2018 and ERA 2023 methodology and currently limited to journal articles). The resulting indicators apply a qualitative activity rating to institutions for each FoR or FoE in which the institution is active, and additionally report aggregated statistics for institutions and for fields of research. Three broad types of indicators are available from the ‘Metric’ drop-down in the Options box: performance ratings, relative citation impact, and research outputs.
For detailed information on the compilation of the indicators, please see Appendix III.
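For orientation, the relative citation impact that underpins these indicators is an institution’s average citations per paper divided by the corresponding benchmark. The sketch below (a hypothetical helper following the published ERA definition) makes this explicit:

```python
def relative_citation_impact(institution_cpp, benchmark_cpp):
    """RCI = institution's average citations per paper / benchmark average.

    A value of 1 indicates performance in line with the benchmark, and
    values above 1 indicate performance above it. The surrounding data
    handling in RIES is more involved than this illustration.
    """
    if not benchmark_cpp:
        return None
    return institution_cpp / benchmark_cpp
```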
3.5 WORKFLOW LIMITATIONS
RIES uses the Research Organization Registry (ROR) identifiers assigned to institutions to identify journal articles with a recorded affiliation to the institution. The institutional ROR assignment to journal articles is obtained from OpenAlex data; if journal articles are misassigned by OpenAlex, they will not be correctly detected by RIES. Journal articles must also be identified by a Crossref Digital Object Identifier (DOI). The journal articles are then filtered to those linked to a journal in the ERA 2023 journal list. The ERA 2023 journal list uses ISSN-L identifiers to identify journals; where these identifiers do not map, due to imperfect intersection between the ERA 2023 journal list and the papers dataset collated by COKI (see Table 1), journal articles will not be detected by RIES.
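The journal-list filtering described above can be pictured with the following sketch (hypothetical field and variable names; the real matching is performed inside the RIES data warehouse), which checks whether a paper’s journal resolves to the ERA 2023 journal list via the ISSN to ISSN-L mapping:

```python
def in_era_journal_list(paper, issn_to_issnl, era_issnls):
    """Check whether a paper's journal resolves to the ERA 2023 journal list.

    `paper` carries the ISSNs recorded for its container journal,
    `issn_to_issnl` is the ISSN -> ISSN-L mapping (see Table 1), and
    `era_issnls` is the set of ISSN-L values in the ERA 2023 list.
    If no ISSN maps, the paper is not detected by RIES.
    """
    for issn in paper.get("issns", []):
        issnl = issn_to_issnl.get(issn)
        if issnl and issnl in era_issnls:
            return True
    return False
```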
In ERA, if an FoR was linked to fewer than 75 indexed papers (across the analysis period), a warning was generated suggesting that the user consider centile and RCI class analysis instead. This is not currently implemented in the RIES workflow. Additionally, in ERA, if a 4-digit FoR was linked to fewer than 250 articles across all Australian HEPs combined (across the analysis period), a low-volume warning was generated. This is also not currently implemented in the RIES workflow.
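Were these ERA warnings added to the workflow, the checks could be as simple as the sketch below (thresholds taken from the ERA rules above; the function and inputs are hypothetical, since RIES does not currently produce these warnings):

```python
def low_volume_warnings(institution_paper_count, all_hep_paper_count):
    """ERA-style low-volume warnings for one FoR over the analysis period.

    `institution_paper_count` is the indexed paper count for one institution
    and FoR; `all_hep_paper_count` is the combined count across all
    Australian HEPs for a 4-digit FoR.
    """
    warnings = []
    if institution_paper_count < 75:
        warnings.append("Fewer than 75 indexed papers: consider centile and RCI class analysis.")
    if all_hep_paper_count < 250:
        warnings.append("Fewer than 250 articles across all Australian HEPs: low volume.")
    return warnings
```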
As RIES uses formally recorded publication metadata to credit works to institutions (with uniform FoR apportionment), it cannot identify the same set of works as the traditional manual census approach that institutions may use (the latter requires internal knowledge of current staff identities). For the ‘Institutional’ assignment option, where institutions provide their own manually curated outputs and FoR apportionments, the apportionment supplied by the institution for its research outputs is used instead of the automatic uniform apportionment applied to all other global and Australian research outputs. This means that the results in the institutional dashboard must be interpreted with care.
The two dashboards also differ in how FoRs are apportioned (inherited from the ERA 2023 journal list for the automatic dashboard versus manually assigned by the institution for the institutional dashboard) and in how research outputs are credited (by-line for the automatic dashboard versus census for the institutional dashboard). The two assignment options therefore show the performance of different sets of research articles, both overall and within each field category.
As a result, the institutional assignment option generally displays higher values for the metric indicators than the automatic assignment option, because the results are strongly influenced by the selection of journal articles and their apportionment. For example, for the FoR ‘Indigenous Studies’ (45), very few journal articles are assigned to this FoR under the automatic method when calculating the global average citations per paper. An institution manually assigning journal articles to this FoR would therefore very likely be rated very highly under the institutional assignment.
4.1 CLASSIFICATION
Clicking on the Classification drop-down in the Options box presents the available classification systems that can be displayed in the Dashboards:
Fields of Research
The ANZSRC Fields of Research (FoR) classification system categorises research activities into a hierarchy of broad 2-digit fields and more specific 4-digit fields. Journal articles can be assigned to up to three FoRs. The 2-digit codes include journal articles published in journals mapped to 2-digit FoR codes as well as articles published in journals mapped to 4-digit FoR codes.
The 4-digit codes only include journal articles published in journals mapped to 4-digit FoR codes. Each journal article can be apportioned to up to three FoRs, with the apportionments being uniformly distributed in the RIES automatic by-line approach (‘Automatic’ Assignment Method Option). Click on the arrows to expand the 2-digit codes into the 4-digit codes (or select ‘expand all’ to expand all FoRs or FoEs at once).
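To make the 2-digit versus 4-digit behaviour concrete, the sketch below (illustrative names only, not the RIES schema) aggregates a paper’s apportionment so that each 2-digit code collects the shares of journals mapped directly to it as well as those mapped to any of its 4-digit children:

```python
from collections import defaultdict

def rollup_to_two_digit(apportionment):
    """Aggregate a paper's FoR apportionment to 2-digit codes.

    `apportionment` maps FoR codes (2- or 4-digit strings) to shares;
    a 4-digit code such as '3101' contributes to its parent '31'.
    """
    totals = defaultdict(float)
    for code, share in apportionment.items():
        totals[code[:2]] += share
    return dict(totals)

# A paper split across '31' and '4902' contributes to both '31' and '49'.
print(rollup_to_two_digit({"31": 0.5, "4902": 0.5}))  # {'31': 0.5, '49': 0.5}
```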
Fields of Education
The ASCED Fields of Education (FoE) classification system for Australia categorises educational activities into a hierarchy of broad 2-digit fields and more specific 4-digit fields. Journal articles are first assigned to FoRs, which are then mapped to the equivalent FoE.
4.2 METRIC
Clicking on the Metric drop-down in the Options box presents a list of available indicators that can be displayed in the Dashboards.
PERFORMANCE RATING OPTIONS
Rating (Standard) - Performance rating based on the historical ERA method (used in ERA 2018). It uses a five-point scale for international comparison based on the world static Relative Citation Impact (RCI) benchmarks.
Rating (Option A) - A five-point performance rating system proposed for ERA 2023 for international comparison, based on both the High Performance Indicator and world dynamic Relative Citation Impact (RCI) benchmarks.
Rating (Option B) - A six-point performance rating system proposed for ERA 2023, based on both the High Performance Indicator and world dynamic Relative Citation Impact (RCI) benchmarks. There is more detail at the top of the scale to distinguish extremely high performance.
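All three rating options map RCI-based measures onto an ordinal scale. As a purely illustrative sketch of that idea (the cut-off values below are placeholders, not the bands defined in the ERA 2018 Evaluation Handbook or the ERA 2023 consultation paper):

```python
def rating_from_rci(rci, cutoffs=(0.2, 0.6, 1.0, 1.6)):
    """Map a world RCI value onto a five-point rating scale.

    The cut-offs are placeholders for illustration; the actual rating bands
    are defined in the ERA documentation. Each threshold crossed raises the
    rating by one, giving a value from 1 (lowest) to 5 (highest).
    """
    rating = 1
    for threshold in cutoffs:
        if rci >= threshold:
            rating += 1
    return rating
```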
RELATIVE CITATION IMPACT OPTIONS
RCI values are grouped into three broad categories:
RESEARCH OUTPUTS OPTIONS
4.3 MINIMUM PUBLICATIONS
Users can set a minimum publication threshold (also known as the low-volume threshold). Under the ERA process, to be rated in a given field, an institution was required to produce a minimum number of outputs (50 over six years), referred to as the low volume threshold (LVT). Users may dynamically set a minimum publication threshold to apply over the selected evaluation year size.
4.4 EVALUATION YEARS
Accurate metrics are provided at one-year resolution, with benchmarks calculated per field (FoR or FoE) and per year. However, an evaluation year size (1, 3, 5, 6, or 10) can be selected to assess metrics across a longer time frame. For example, a year size of three will sum outputs and citations, recompute rankings, and apply the minimum publications threshold across a three-year sliding window. Caution: for the benchmarks used to calculate the rankings shown under the performance rating, relative citation impact, and research output ranking options, the current version of the dashboard uses weighted averages (determined from the number of outputs and their respective apportionments to different FoRs) to dynamically recompute values across the selected year size. This may differ from a full database recomputation of benchmarks at the chosen year size. A future version will provide accurate precomputed benchmarks for each of the available evaluation year size options.
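A minimal sketch of this sliding-window aggregation is shown below (with hypothetical per-year records; the dashboard’s own recomputation is more involved). The Min Publications threshold from Section 4.3 would then simply be compared against the aggregated output count:

```python
def window_metrics(yearly, start_year, window):
    """Aggregate per-year FoR metrics over a sliding evaluation window.

    `yearly` maps a year to a dict with 'outputs' (apportioned output count),
    'citations', and 'benchmark' (average citations per paper for that year);
    the record layout is illustrative. The window benchmark is the
    output-weighted average of the single-year benchmarks, approximating a
    full recomputation at the chosen year size.
    """
    years = [start_year + i for i in range(window) if (start_year + i) in yearly]
    outputs = sum(yearly[y]["outputs"] for y in years)
    if outputs == 0:
        return None
    citations = sum(yearly[y]["citations"] for y in years)
    benchmark = sum(yearly[y]["outputs"] * yearly[y]["benchmark"] for y in years) / outputs
    rci = (citations / outputs) / benchmark if benchmark else None
    return {"outputs": outputs, "rci": rci}
```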
4.5 ASSIGNMENT METHOD
Clicking on the Assignment Method drop-down in the Options box presents the available assignment methods that can be displayed in the Dashboards:
Automatic
The automatic assignment method displays the results for the automatic “by-line” approach implemented by RIES. Research outputs are assigned to institutions based on the ROR identifiers attached to author affiliations in OpenAlex data. As the RIES workflow does not have access to high-resolution FoR apportionment data (usually submitted to the ARC by individual HEPs), FoR assignments for research outputs are instead inferred (inherited) from the containing journal, with apportionment distributed uniformly.
Institutional
The institutional assignment method displays the results for the manually curated “census” approach, where institutions have provided their own curated list of institutional outputs (with works credited based on affiliation at the time of the census and FoR apportionment data also supplied). If an institution has not provided its own curated list of outputs, this option will remain unavailable.
4.6 INSTITUTIONS
When any of the three ‘Local Rank’ Metric indicator options is selected, users have the option to select individual Australian institutions to compare against. Multiple institutions can be selected, and the results will be updated to show the local rank for the selected subset of institutions.
4.7 CONSORTIA
When any of the three ‘Local Rank’ Metric indicator options is selected, users have the option to select university consortia to compare against. The results will be updated to show the local rank for the selected subset of institutions. Individual institutions may also be de-selected from a consortium.
4.8 DATA DOWNLOAD
Clicking on the ‘Download data’ button will download a CSV file of the data shown in the dashboard table. Note that the data in the plot is not included in this file.
4.9 PRINT
Clicking on the ‘Print’ button will download a PDF report of the data shown in the dashboard table. Note that there may be some compatibility issues when printing to PDF using the Firefox browser; Chrome or Edge is recommended.
5.1 OUTPUTS
The Outputs Table is displayed below the main Dashboard on the RIES Dashboards page. The individual research outputs included in the RIES Dashboards are listed here and can be filtered by:
The Outputs Table has the following functionality:
The Data Updated date below the Dashboard and Outputs Table refers to the creation date of the datasets used in the analysis.