| Title: | Speciation profiles for gases and aerosols |
|---|---|
| Description: | Access to the air pollutant emission profiles in US EPA SPECIATE (v5.2) and EU JRC SPECIEUROPE archives. More details in Simon et al (2010) doi:10.5094/APR.2010.026 and Pernigotti et al (2016) doi:10.1016/j.apr.2015.10.007, respectively. |
| Authors: | Sergio Ibarra-Espinosa [aut, cre] (ORCID: <https://orcid.org/0000-0002-3162-1905>), Karl Ropkins [aut] (ORCID: <https://orcid.org/0000-0002-0294-6997>) |
| Maintainer: | Sergio Ibarra-Espinosa <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.4.2 |
| Built: | 2026-06-06 05:54:02 UTC |
| Source: | https://github.com/atmoschem/respeciate |
Generic functions for use with respeciate object classes.
as.respeciate(x, ...)
## Default S3 method:
as.respeciate(x, ...)
## S3 method for class 'respeciate'
print(x, n = 6, ...)
## S3 method for class 'rsp_pls'
print(x, n = NULL, ...)
## S3 method for class 'respeciate'
plot(x, ...)
## S3 method for class 'rsp_pls'
plot(x, ...)
## S3 method for class 'respeciate'
summary(object, ...)
## S3 method for class 'respeciate'
merge(x, y, ...)
x |
the |
... |
any extra arguments, mostly ignored except by
|
n |
when plotting or printing a multi-profile object, the maximum number of profiles to report. |
object |
like |
y |
a second data set, typically a |
These generic functions/methods generate typical outputs for
respeciate data sets and models:
When supplied a data.frame or similar,
as.respeciate attempts to coerce it into a
respeciate object.
When supplied a respeciate object, print manages its
appearance.
When supplied a respeciate object, plot provides a
basic plot output. This is currently a wrapper for the respeciate
function rsp_plot_profile.
When supplied a respeciate object, summary generates
a summary table of profile information.
When supplied a respeciate object and a second respeciate-like
object, e.g. data.frame, respeciate object, etc,
merge attempts to merge them using common data columns. You
can refine the merge operation using additional arguments.
respeciate objects revert to
data.frames when not doing anything
package-specific, so you can still use them like data.frames
with other packages. This is useful if you have other ideas how to
plot more complex (multiple-profile, multiple-species)
data sets, and want to use graphics packages like lattice or
ggplot2.
Getting source profile(s) from the local respeciate archives.
rsp(..., include.refs = FALSE, source = "all")
rsp_profile(...)
... |
The function assumes all inputs (except |
include.refs |
logical, if profile reference information should be
included when extracting the requested profile(s) from the archive, default
|
source |
character, the local archive to request a profile from:
|
rsp_profile or the short-hand rsp return an object of
respeciate class, a data.frame containing one or more profiles
from the local respeciate archive.
The option include.refs adds profile source reference
information to the returned respeciate data set. The default option
is to not include these because some SPECIATE profiles have several
associated references and including these replicates records, once per
reference.
respeciate code is written to handle this, but if you are developing your
own methods or code and include references in any profile build, you may be
biasing some analyses in favor of those multiple-reference profiles unless
you check and account for such cases.
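The replication effect described above can be sketched in base R. This is an illustration only: the two toy tables, their column names and all values are invented stand-ins, not the structure of the archive itself.

```r
# Toy stand-ins for archive tables; all names and values are hypothetical.
profiles <- data.frame(
  profile = c("A", "A", "B", "B"),
  species = c("benzene", "toluene", "benzene", "toluene"),
  value   = c(10, 30, 20, 40)
)
# Profile A has two references, profile B has one.
refs <- data.frame(
  profile = c("A", "A", "B"),
  ref     = c("ref1", "ref2", "ref3")
)

# Joining on profile repeats every profile-A record once per reference.
with_refs <- merge(profiles, refs, by = "profile")

# A naive by-species mean now counts profile A twice ...
biased <- tapply(with_refs$value, with_refs$species, mean)

# ... so drop the reference column and de-duplicate before averaging.
clean <- unique(with_refs[, c("profile", "species", "value")])
unbiased <- tapply(clean$value, clean$species, mean)
```

Comparing biased and unbiased shows the by-species means shifting towards profile A, the profile with more than one reference.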
For SPECIATE:
Simon, H., Beck, L., Bhave, P.V., Divita, F., Hsu, Y., Luecken, D., Mobley, J.D., Pouliot, G.A., Reff, A., Sarwar, G. and Strum, M., 2010. The development and uses of EPA SPECIATE database. Atmospheric Pollution Research, 1(4), pp.196-206.
For SPECIEUROPE:
Pernigotti, D., Belis, C.A., Spano, L., 2016. SPECIEUROPE: The European data base for PM source profiles. Atmospheric Pollution Research, 7(2), pp.307-314. DOI: https://doi.org/10.1016/j.apr.2015.10.007
SPECIATE and SPECIEUROPE regarding
data sources; and, rsp_find_profile
and rsp_find_species regarding archive searching.
## Not run:
x <- rsp_profile(8833, 8850)
plot(x)
## End(Not run)
Functions to build composite respeciate profiles
rsp_average_profile generates an average composite
of a supplied multi-profile respeciate object.
rsp_average_profile(rsp, code = NULL, name = NULL, method = 1, ...)
rsp |
A |
code |
required character, the unique profile code to assign to the average profile. |
name |
character, the profile name to assign to the average profile. If not supplied, this defaults to a collapsed list of the codes of all the profiles averaged. |
method |
numeric, the averaging method to apply: Currently only 1 (default)
|
... |
additional arguments, currently ignored |
rsp_average_profile returns a single profile average
version of the supplied respeciate profile.
In development function; arguments and outputs likely to be subject to change.
This is one of the very few respeciate functions that modifies the
WEIGHT_PERCENT column of the respeciate data.frame.
rsp function(s) to reconfigure data.frames (and similar
object classes) for use with data and functions in respeciate.
rsp_build_x(x, profile_id, profile_name, species_name, species_id, value, ...)
rsp_build_simx(m, n = 1, ...)
x |
|
profile_name, profile_id
|
( |
species_name, species_id
|
( |
value |
( |
... |
(any other arguments) currently ignored. |
m |
|
n |
a numeric object, e.g. a |
rsp_builds attempt to build and return a respeciate-like
object that can be directly compared with data from respeciate.
rsp_build_x is the standard object builder.
rsp_build_simx builds a simulation of an x data set based on
the 'linear combination of profiles' model applied in conventional source
apportionment. (See below and rsp_pls_x)
If you want to compare your data with profiles in the respeciate archive,
you need to follow respeciate conventions when assigning species names and
identifiers. We are working on options to improve on this (and are
very happy to discuss if anyone has ideas), but the current best suggestion is:
(1) identify the respeciate species code for each of the species in
your data set, and (2) assign these as species_id when rsp_building.
The function will then associate the species_name from respeciate
species records.
Functions for studying similarities (or dissimilarities) within respeciate data sets
rsp_distance_profile calculates the statistical distance
between respeciate profiles, and clusters profiles according to nearness.
rsp_distance_profile(rsp, output = c("plot", "report"))
rsp |
A |
output |
Character vector, required function output: |
Depending on the output option, rsp_distance_profile returns
one or more of the following: the correlation matrix, a heat map of the
correlation matrix.
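The general idea of a between-profile correlation matrix followed by clustering can be sketched in base R. The toy matrix, the 1 - correlation distance and the average-linkage choice are assumptions made for illustration; they are not taken from the rsp_distance_profile implementation.

```r
# Hypothetical wide-form toy data: one row per profile, one column per species.
m <- rbind(
  P1 = c(10, 30, 40, 20),
  P2 = c(12, 28, 41, 19),
  P3 = c(45, 5, 10, 40)
)
colnames(m) <- c("s1", "s2", "s3", "s4")

# Between-profile correlation matrix (profiles are rows, so transpose first).
cor_mat <- cor(t(m))

# Convert correlation to a distance and cluster profiles by nearness.
d <- as.dist(1 - cor_mat)
hc <- hclust(d, method = "average")

# Cutting the tree into two groups pairs the similar profiles P1 and P2.
groups <- cutree(hc, k = 2)
```

A call such as heatmap(cor_mat) would then give a base-graphics heat map of the same matrix.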
Please note: function in development; structure and arguments may be subject to change.
Functions for combining respeciate data sets.
rsp_lbind binds two or more respeciate-like
objects. The default option is to stack the supplied data sets (e.g.
respeciate, data.frame, etc) like rbindlist
in data.table (or bind_rows in dplyr). This matches columns by name
before stacking the supplied data sets.
rsp_lbind(...)
... |
(various) This function is intended to be quite flexible. All
supplied arguments are tested and handled as follows: |
rsp_lbind attempts to return a single stacked version of the
supplied data sets. If it is successful, the (stacked) data set is typically
returned as a respeciate object or a data.frame with a warning
if it is missing columns respeciate expects.
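Stacking by matched column name, with gaps filled, can be sketched in base R as follows. The helper bind_fill is hypothetical and written only for this illustration; it is not part of respeciate, data.table or dplyr, and the toy data values are invented.

```r
# Base-R sketch of a fill-style row bind; bind_fill is a hypothetical helper.
bind_fill <- function(...) {
  dfs <- list(...)
  all_cols <- unique(unlist(lapply(dfs, names)))
  padded <- lapply(dfs, function(d) {
    # Add any columns this data set lacks, filled with NA.
    for (nm in setdiff(all_cols, names(d))) d[[nm]] <- NA
    # Put the columns in a common order before stacking.
    d[all_cols]
  })
  do.call(rbind, padded)
}

a <- data.frame(.profile.id = "A", .species = "benzene", .value = 10)
b <- data.frame(.species = "toluene", .value = 30, note = "extra")
stacked <- bind_fill(a, b)
```

Here stacked has one row per input record and the union of the input columns, with NA wherever an input did not supply a column.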
Dowle M, Srinivasan A (2023). data.table: Extension of 'data.frame'. R package version 1.14.8, https://CRAN.R-project.org/package=data.table.
Functions for studying relationships between species in respeciate data sets.
rsp_cor_species generates a by-species correlation
matrix of the supplied respeciate data sets.
rsp_cor_species(
  rsp,
  min.n = 3,
  cols = c("#80FFFF", "#FFFFFF", "#FF80FF"),
  na.col = "#CFCFCF",
  heatmap.args = TRUE,
  key.args = TRUE,
  report = "silent"
)
rsp |
|
min.n |
|
cols |
a series of |
na.col |
|
heatmap.args |
|
key.args |
|
report |
|
By default rsp_cor_species invisibly returns the calculated
correlation matrix and plots it as a heat map, but arguments including
heatmap.args and report can be used to modify function outputs.
rsp_eu and rsp_eu_ functions are
quick access wrappers to commonly requested SPECIEUROPE subsets.
rsp_eu()
rsp_eu_pm10()
rsp_eu_pm2.5()
rsp_eu and rsp_eu_ functions typically return
a respeciate data.frame of the requested profiles:
rsp_eu() returns all profiles in the local version of
SPECIEUROPE
rsp_eu_pm10 returns all SPECIEUROPE profiles classified as
PM10 (using Particle.Size=="PM10"), rsp_eu_pm2.5 for PM2.5
and so on...
rsp function(s) to export respeciate (and respeciate-like) objects to other software
rsp_export_esat(
  rsp,
  file.name = "file",
  index = "row.count",
  unc = 0.15,
  bad.values = "fill.1",
  output = c("con.csv", "unc.csv"),
  overwrite = FALSE,
  ...
)
rsp |
( |
file.name |
( |
index |
( |
unc |
(various), if |
bad.values |
( |
output |
( |
overwrite |
( |
... |
other arguments, currently ignored. |
rsp_export_esat makes files that can be used as inputs with ESAT.
output options: 'con.csv' and 'unc.csv' (both required by ESAT).
rsp_exports attempt to build and save files suitable for use
outside R.
Functions that provide respeciate
source information.
rsp_find_profile searches the currently installed respeciate
data sets for profile records.
rsp_find_species searches the currently installed respeciate
data sets for species records.
rsp_find_profile(
  ...,
  by = "keywords",
  partial = TRUE,
  source = "all",
  ref = NULL
)
rsp_profile_info(...)
rsp_find_species(
  ...,
  by = ".species",
  partial = TRUE,
  source = "all",
  ref = NULL
)
rsp_species_info(...)
... |
character(s), any search term(s) to use when searching
the local respeciate archive for relevant records using
|
by |
character, the section of the archive to
search, by default |
partial |
logical, if |
source |
character, the data set to search: |
ref |
any |
rsp_profile_info returns a data.frame of
profile information, as a respeciate object.
rsp_species_info returns a data.frame of
species information as a respeciate object.
For SPECIATE:
Simon, H., Beck, L., Bhave, P.V., Divita, F., Hsu, Y., Luecken, D., Mobley, J.D., Pouliot, G.A., Reff, A., Sarwar, G. and Strum, M., 2010. The development and uses of EPA SPECIATE database. Atmospheric Pollution Research, 1(4), pp.196-206.
For SPECIEUROPE:
Pernigotti, D., Belis, C.A., Spano, L., 2016. SPECIEUROPE: The European data base for PM source profiles. Atmospheric Pollution Research, 7(2), pp.307-314. DOI: https://doi.org/10.1016/j.apr.2015.10.007
SPECIATE and SPECIEUROPE
## Not run:
profile <- "Ethanol"
pr <- rsp_find_profile(profile)
pr
species <- "Ethanol"
sp <- rsp_find_species(species)
sp
## End(Not run)
rsp_id_ functions generate a vector of assignment
terms and can be used to subset or condition a supplied (re)SPECIATE
data.frame.
Most commonly, the rsp_id_ functions accept a single input, a
respeciate data.frame and return a logical vector of
length nrow(x), identifying species of interest as
TRUE. So, for example, they can be used when
subsetting in the form:
subset(rsp, rsp_id_nalkane(rsp))
... to extract just n-alkane records from a supplied respeciate
object rsp.
However, some accept additional arguments. For example, rsp_id_copy
also accepts a reference data set, ref, and a column identifier,
by, and tests rsp$by %in% unique(ref$by).
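The membership test behind rsp_id_copy can be sketched in base R with toy data. The column name follows the by = ".species.id" default; the values are invented, and the sketch uses [[by]] rather than $by so that by can be held in a variable.

```r
# Toy stand-ins; column names follow the by = ".species.id" default,
# the values are invented for illustration.
rsp <- data.frame(
  .species.id = c(1, 2, 3, 4),
  .species    = c("benzene", "toluene", "ethane", "propane"),
  .value      = c(5, 10, 60, 25)
)
ref <- data.frame(.species.id = c(1, 2, 2))

by <- ".species.id"
# Logical vector of length nrow(rsp): TRUE where the species is also in ref.
keep <- rsp[[by]] %in% unique(ref[[by]])

# Used in the same way as the subset(rsp, rsp_id_...(rsp)) form above.
matched <- subset(rsp, keep)
```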
rsp_id_copy(rsp, ref = NULL, by = ".species.id")
rsp_id_nalkane(rsp)
rsp_id_btex(rsp)
rsp_id_pah16(rsp)
rsp |
a |
ref |
( |
by |
( |
rsp_id_copy outputs can be modified but, by default, it
identifies all species in the supplied reference data set.
rsp_id_nalkane identifies (straight chain) C1 to C40 n-alkanes.
rsp_id_btex identifies the BTEX group of aromatic hydrocarbons
(benzene, toluene, ethylbenzene, and m-, o- and p-xylene).
Functions that provide respeciate
source information.
rsp_info generates a brief version report for the currently installed
respeciate data sets.
rsp_info()
rsp_info provides a brief version information report on the
currently installed respeciate archive.
For SPECIATE:
Simon, H., Beck, L., Bhave, P.V., Divita, F., Hsu, Y., Luecken, D., Mobley, J.D., Pouliot, G.A., Reff, A., Sarwar, G. and Strum, M., 2010. The development and uses of EPA SPECIATE database. Atmospheric Pollution Research, 1(4), pp.196-206.
For SPECIEUROPE:
Pernigotti, D., Belis, C.A., Spano, L., 2016. SPECIEUROPE: The European data base for PM source profiles. Atmospheric Pollution Research, 7(2), pp.307-314. DOI: https://doi.org/10.1016/j.apr.2015.10.007
SPECIATE and SPECIEUROPE
## Not run:
rsp_info()
## End(Not run)
rsp_match_profile compares a supplied respeciate
profile (or similar data set) and a reference set of supplied profiles
and attempts to identify nearest matches on the
basis of similarity.
rsp_match_profile(
  rsp,
  ref,
  matches = 10,
  rescale = 5,
  min.n = NULL,
  method = "sid * srd",
  self.test = FALSE,
  ...,
  output = "summary"
)
rsp |
A |
ref |
A |
matches |
Numeric (default 10), the maximum number of profile matches to report. |
rescale |
Numeric (default 5), the data scaling method to apply before
comparing |
min.n |
Numeric (or |
method |
Character (default 'sid * srd'), the ranking metric used to
rank profile matches. The function calculates several matching metrics:
'pd', the Pearson's Distance (1 - Pearson's correlation coefficient),
'srd', like pd but using the Spearman Ranked data correlation coefficient,
and 'sid', the Standardized Identity Distance (See References). All the
metrics tend to zero for better matches, and the |
self.test |
Logical (default FALSE). The match process self-tests by adding
|
... |
Additional arguments, typically ignored but sometimes used for
function development. Currently, testing |
output |
Character, output options, including: |
By default rsp_match_profile returns a fit report summary: a
data.frame of up to matches fit reports for the nearest
matches to profiles from the reference profile data set, ref. (See
also output above for other options). If several options are requested,
earlier options are reported (e.g. using print or plot) and only
the final option is returned.
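Two of the matching metrics named under method, 'pd' and 'srd', follow directly from their definitions and can be sketched in base R. The two toy profiles are invented, and this is not the rsp_match_profile code; the 'sid' metric is not reproduced here because its formula is given in the references rather than in this text.

```r
# Two toy profiles over the same five species (values are invented).
x   <- c(40, 25, 20, 10, 5)
ref <- c(38, 27, 18, 12, 5)

# 'pd': Pearson's Distance, 1 - Pearson's correlation coefficient.
pd <- 1 - cor(x, ref, method = "pearson")

# 'srd': the same form using the Spearman rank correlation coefficient.
srd <- 1 - cor(x, ref, method = "spearman")
```

Both values tend to zero for better matches. Here the two profiles rank their species identically, so srd is zero while pd stays slightly above it.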
Distance metrics are based on recommendations by Belis et al (2015) and as implemented in Mooibroek et al (2022):
Belis, C.A., Pernigotti, D., Karagulian, F., Pirovano, G., Larsen, B.R., Gerboles, M., Hopke, P.K., 2015. A new methodology to assess the performance and uncertainty of source apportionment models in intercomparison exercises. Atmospheric Environment, 119, 35–44. https://doi.org/10.1016/j.atmosenv.2015.08.002.
Mooibroek, D., Sofowote, U.M. and Hopke, P.K., 2022. Source apportionment of ambient PM10 collected at three sites in an urban-industrial area with multi-time resolution factor analyses. Science of The Total Environment, 850, p.157981. http://dx.doi.org/10.1016/j.scitotenv.2022.157981.
Functions for padding respeciate objects.
rsp_pad pads a supplied respeciate profile data set
with profile and species meta-data.
rsp_pad(rsp, pad = "standard", drop.nas = TRUE)
rsp |
A |
pad |
character, type of meta data padding, current options
|
drop.nas |
logical, discard any rows where the |
rsp_pad returns the supplied respeciate data set, with
requested additional profile and species meta-data added as additional
data.frame columns. See Note.
Some data handling can remove respeciate meta-data,
and rsp_pads provide a quick rebuild/repair. For example,
rsp_dcasting to a (by-species or by-profile) widened
form strips some meta-data, and padding is used as part of
rsp_melt_wide to re-add this meta-data
when returning the data set to its standard long form.
General plots for respeciate objects.
rsp_plot functions generate plots for supplied
respeciate data sets.
rsp_plot_profile(
  rsp,
  id,
  multi.profile = "group",
  order = TRUE,
  log = FALSE,
  ...,
  silent = FALSE,
  output = "default"
)
rsp_plot_species(
  rsp,
  id,
  multi.species = "group",
  order = FALSE,
  log = FALSE,
  ...,
  silent = FALSE,
  output = "default"
)
rsp_plot_match(
  rsp,
  ref = NULL,
  plot.type = 2,
  log = FALSE,
  ...,
  output = "plot"
)
rsp |
A |
id |
numeric, the indices of profiles or species to use when
plotting with |
multi.profile |
character, how |
order |
logical, order the species in the profile(s) by relative abundance before plotting. |
log |
logical, log y scale when plotting. |
... |
any additional arguments, typically passed on to the lattice plotting functions. |
silent |
logical, hide warnings when generating plots (default
|
output |
character, output method, one of: 'plot' to return just the requested plot; 'data' to return just the data; and, c('plot', 'data') to plot then return the data invisibly (default). |
multi.species |
character, like |
ref |
|
plot.type |
numeric, option if the |
rsp_plot functions return the requested graph, plot, etc., usually as a trellis object.
These functions are currently in development, so may change.
Most respeciate plots make extensive use of
lattice and latticeExtra code:
Sarkar D (2008). Lattice: Multivariate Data Visualization with R. Springer, New York. ISBN 978-0-387-75968-5, http://lmdvr.r-forge.r-project.org.
Sarkar D, Andrews F (2022). latticeExtra: Extra Graphical Utilities Based on Lattice. R package version 0.6-30, https://CRAN.R-project.org/package=latticeExtra.
They also incorporate ideas from loa:
Ropkins K (2023). loa: various plots, options and add-ins for use with lattice. R package version 0.2.48.3, https://CRAN.R-project.org/package=loa.
Functions for Positive Least Squares (PLS) fitting of respeciate profiles
rsp_pls_x builds PLS models for supplied profile(s) using
the nls function, the 'port' algorithm and a lower
limit of zero for all model outputs to enforce the positive fits. The
modeled profiles are typically from an external source, e.g. a
measurement campaign, and are fit as a linear additive series of reference
profiles, here typically from respeciate, to provide a measure of
source apportionment based on the assumption that the profiles in the
reference set are representative of the mix that make up the modeled
sample. The pls_ functions work with rsp_pls_x
outputs, and are intended to be used when refining and analyzing
these PLS models. See also pls_plots for PLS model plots.
rsp_pls_x(x, m, power = 1, ...)
pls_report(pls)
pls_test(pls)
pls_fit_species(
  pls,
  species,
  power = 1,
  refit.profile = TRUE,
  as.marker = FALSE,
  drop.missing = FALSE,
  ...
)
pls_refit_species(
  pls,
  species,
  power = 1,
  refit.profile = TRUE,
  as.marker = FALSE,
  drop.missing = FALSE,
  ...
)
pls_rebuild(
  pls,
  species,
  power = 1,
  refit.profile = TRUE,
  as.marker = FALSE,
  drop.missing = FALSE,
  ...
)
x |
A |
m |
A |
power |
A numeric, an additional factor to be added to
weightings when fitting the PLS model. This is applied in the form
|
... |
additional arguments, typically ignored or passed on to
|
pls |
A |
species |
for |
refit.profile |
(for |
as.marker |
for |
drop.missing |
for |
rsp_pls_x returns a list of nls models, one per
profile/measurement set in x. The pls_ functions work with
these outputs. pls_report generates a data.frame of
model outputs, and is used by several of the other pls_
functions. pls_fit_species, pls_refit_species and
pls_rebuild return the supplied rsp_pls_x output,
updated on the basis of the pls_ function action.
pls_plots (documented separately) produce various plots
commonly used in source apportionment studies.
This implementation of PLS applies the following modeling constraints:
1. It generates a model of x that is a positively constrained linear
product of the profiles in m, so outputs can only be
zero or more. Although the model is generated using nls,
which is a Nonlinear Least Squares (NLS) model, the fitting term applied
in this case is linear.
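2. The model is fit in the form:
$$X_{i,j} = \sum_{k=1}^{K} N_{i,k} M_{k,j} + e_{i,j}$$
Where X is the data set of measurements, input x in rsp_pls_x,
M (m) is the data set of reference profiles, and N is the data set of
source contributions, the source apportionment solution, to be solved by
minimising e, the error terms. Here i indexes the measurements, j the
species and k the K reference profiles.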
3. The number of species in x must be more than the number of
profiles in m to reduce the likelihood of over-fitting.
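The constraints listed above can be sketched as a single base R fit for one measurement set. The reference profiles, the true contributions (0.7 and 0.3), the noise level and the starting values are all invented for illustration; this is not the rsp_pls_x code, only the same nls, 'port' and zero lower limit combination.

```r
set.seed(1)
# Two hypothetical reference profiles (m) over six species,
# so there are more species than profiles.
m1 <- c(10, 30, 40, 15, 5, 0)
m2 <- c(50, 5, 10, 20, 10, 5)

# Simulated measurement: 0.7 of m1 plus 0.3 of m2, with a little noise.
x <- 0.7 * m1 + 0.3 * m2 + rnorm(6, sd = 0.5)

# The fitted term is linear; the 'port' algorithm allows the lower bound
# of zero that keeps the source contributions positive.
fit <- nls(
  x ~ n1 * m1 + n2 * m2,
  start = list(n1 = 0.5, n2 = 0.5),
  algorithm = "port",
  lower = c(n1 = 0, n2 = 0)
)
n <- coef(fit)
```

The estimates in n recover the simulated contributions to within the added noise. The noise is needed because nls is not intended for artificial zero-residual data.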
The pls_plot functions are intended for use with PLS models built
using rsp_pls_x (documented separately). They generate some
plots commonly used with source apportionment model outputs.
pls_plot(pls, plot.type = 1, ..., output = "default")
pls_plot_profile(pls, plot.type = 1, log = FALSE, ..., output = "default")
pls_plot_species(pls, id, plot.type = 1, ..., output = "default")
pls |
A |
plot.type |
numeric, the plot type if multiple options are available. |
... |
other arguments, typically passed on to the associated
|
output |
character, output method, one of: 'plot' to return just the requested plot; 'data' to return just the data; and, c('plot', 'data') to plot then return the data invisibly (default). |
log |
(for |
id |
numeric or character
identifying the species or profile to plot. If numeric, these are treated
as indices of the species or profile, respectively, in the PLS model; if
character, species is treated as the name of species and profile is treated
as the profile code. Both can be concatenated to produce multiple plots and
the special case |
pls_plots produce various plots commonly used in source
apportionment studies.
Functions for rescaling respeciate data sets
rsp_rescale rescales the percentage weight records in
a supplied respeciate profile data set. This can be by profile or species
subsets, and rsp_rescale_profile and rsp_rescale_species provide
short-cuts to these options.
rsp_rescale(rsp, method = 2, by = "species")
rsp_rescale_profile(rsp, method = 1, by = "profile")
rsp_rescale_species(rsp, method = 2, by = "species")
rsp |
A |
method |
numeric, the rescaling method to apply:
1 |
by |
character, when rescaling |
rsp_rescale and the rsp_rescale_ functions return the
respeciate profile with the percentage weight records rescaled using
the requested method. See Note.
Data sometimes needs to be normalised, e.g. when applying some
statistical analyses. Rather than modify source information in
SPECIATE and SPECIEUROPE, respeciate creates a
duplicate column .value which is modified by operations
like rsp_rescale_profile and rsp_rescale_species. This means
rescaling is always applied to the source information, rather than
rescaling an already rescaled value, and the EPA records are retained
unaffected. So, the original source information can be easily recovered.
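The point that rescaling always starts from the source column can be sketched in base R. The helper rescale_by_profile and its normalise-to-profile-total rule are hypothetical, chosen only to illustrate the idea; they are not one of the numbered respeciate methods.

```r
# Toy long-form data; WEIGHT_PERCENT stands in for the untouched source column.
d <- data.frame(
  .profile.id    = c("A", "A", "B", "B"),
  .species       = c("benzene", "toluene", "benzene", "toluene"),
  WEIGHT_PERCENT = c(10, 30, 20, 60)
)

# Hypothetical rescaling: each record as a fraction of its profile total.
# .value is always rebuilt from WEIGHT_PERCENT, never from a previous .value.
rescale_by_profile <- function(d) {
  d$.value <- d$WEIGHT_PERCENT /
    ave(d$WEIGHT_PERCENT, d$.profile.id, FUN = sum)
  d
}

once  <- rescale_by_profile(d)
twice <- rescale_by_profile(once)  # recomputed from the source column
```

Because the source column is never overwritten, once and twice are identical, and the original records can be read back from WEIGHT_PERCENT at any point.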
Dowle M, Srinivasan A (2023). data.table: Extension of 'data.frame'. R package version 1.14.8, https://CRAN.R-project.org/package=data.table.
Functions for reshaping respeciate profiles
rsp_dcast and rsp_melt_wide reshape supplied
respeciate profile(s). rsp_dcast converts these from their supplied
long form to a widened form, dcasting the data set by either species
or profiles depending on the widen setting applied.
rsp_dcast_profile, rsp_dcast_profile_id,
rsp_dcast_species and rsp_dcast_species_id are wrappers for
these options. rsp_melt_wide attempts to return a previously widened data
set to the original long form.
rsp_dcast(rsp, widen = "species")
rsp_dcast_profile(rsp, widen = "profile")
rsp_dcast_profile_id(rsp, widen = "profile.id")
rsp_dcast_species(rsp = rsp, widen = "species")
rsp_dcast_species_id(rsp = rsp, widen = "species.id")
rsp_melt_wide(rsp, pad = FALSE, drop.nas = FALSE)
rsp |
A |
widen |
character, when widening |
pad |
logical or character, when |
drop.nas |
logical, when |
rsp_dcast returns the wide form of the supplied
respeciate profile. rsp_melt_wide
returns the (standard) long form of a previously widened profile.
Conventional long-to-wide reshaping of data, or dcasting, can
be slow and memory inefficient. So, respeciate uses the
data.table::dcast
method. The rsp_dcast_species method,
applied using widen='species', is effectively:
dcast(..., .profile.id+.profile~.species, value.var=".value")
And, the alternative widen='profile':
dcast(..., .species.id+.species~.profile, value.var=".value")
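The shape of the widened output can be previewed without data.table using a base R stand-in. This sketch uses tapply on invented toy data that borrows the column names from the dcast calls above; it mimics the by-species widening only and is not how respeciate does it.

```r
# Toy long-form data using the column names from the dcast calls above.
long <- data.frame(
  .profile.id = c("A", "A", "B", "B"),
  .species    = c("benzene", "toluene", "benzene", "ethane"),
  .value      = c(10, 30, 20, 5)
)

# Base-R stand-in for widening by species: one row per profile,
# one column per species, NA where a profile has no record.
wide <- tapply(long$.value, list(long$.profile.id, long$.species), sum)
```

The result is a matrix rather than a data.frame, and it carries no extra meta-data columns, which is the kind of loss that padding is later used to repair.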
respeciate uses a local version of the SPECIATE and
SPECIEUROPE weight measurements .value, so the EPA and
JRC source information can easily be recovered. See also
rsp_rescale_profile.
Dowle M, Srinivasan A (2023). data.table: Extension of 'data.frame'. R package version 1.14.8, https://CRAN.R-project.org/package=data.table.
rsp_us_ functions are quick access wrappers to commonly
requested SPECIATE subsets.
rsp_us_gas()
rsp_us_other()
rsp_us_pm()
rsp_us_pm.ae6()
rsp_us_pm.ae8()
rsp_us_pm.cr1()
rsp_us_pm.simplified()
rsp_us_ functions typically return a respeciate
data.frame of the requested profiles.
For example:
rsp_us_gas() returns all gaseous profiles in SPECIATE
(PROFILE_TYPE == 'GAS').
rsp_us_pm returns all particulate matter (PM) profiles in SPECIATE
not classified as a special PM type (PROFILE_TYPE == 'PM').
The special PM types are subsets of profiles intended for special
applications, and these include rsp_us_pm.ae6 (type PM-AE6),
rsp_us_pm.ae8 (type PM-AE8), rsp_us_pm.cr1 (type
PM-CR1), and rsp_us_pm.simplified (type PM-Simplified).
rsp_us_other returns all profiles classified as other in SPECIATE
(PROFILE_TYPE == 'OTHER').
The SPECIATE data set is a local version of the EPA's SPECIATE repository of organic gas and particulate matter (PM) speciation profiles of air pollution sources.
Currently using version 5.4 as of 2025-11-18.
SPECIATE
A (13 long) 'list' object
The main data.frame of profile-specific meta-data,
with one row per profile, key term PROFILE_CODE.
The main data.frame of individual record meta-data,
with one row per species in each profile, key terms PROFILE_CODE
and SPECIES_ID linking PROFILES and
SPECIES_PROPERTIES.
The main data.frame of species-specific
meta-data, with one row per species, key term SPECIES_ID.
The data.frame linking profile and
reference meta-data, one row per reference per profile, key terms
PROFILE_CODE and REF_Code.
The main data.frame of references for profile
source meta-data, one row per reference, key term REF_Code.
Currently not documented.
https://www.epa.gov/air-emissions-modeling/speciate
Simon, H., Beck, L., Bhave, P.V., Divita, F., Hsu, Y., Luecken, D., Mobley, J.D., Pouliot, G.A., Reff, A., Sarwar, G. and Strum, M., 2010. The development and uses of EPA SPECIATE database. Atmospheric Pollution Research, 1(4), pp.196-206.
The SPECIEUROPE data set is a local version of the European Commission (EC) Joint Research Centre (JRC) repository of particulate matter (PM) speciation profiles of European air pollutant sources.
Currently using version 3.0 as of 2025-11-19.
SPECIEUROPE
A (3 long) 'list' object
The main SPECIEUROPE data set
The source citation, to be used whenever this data is used.
The SPECIEUROPE project website link
https://source-apportionment.jrc.ec.europa.eu/
Pernigotti, D., Belis, C.A., Spano, L., 2016. SPECIEUROPE: The European data base for PM source profiles. Atmospheric Pollution Research, 7(2), pp.307-314. DOI: https://doi.org/10.1016/j.apr.2015.10.007