*** PLEASE READ THIS BEFORE RUNNING SMOKE. ***

== 1. Introduction ==

These packages contain scripts, inventories, and ancillary files related to the 2014fa_nata_cb6cmaq_14j case, which serves as the basis for EPA's 2014v7.0 platform for air quality modeling for the National Air Toxics Assessment (NATA). CMAQ model-ready emissions generated using these packages with SMOKE v4.5 should be identical to those used in EPA's 2014v7.0 platform, with some exceptions as noted in this README. There may be additional emissions differences resulting from differences in the Linux operating system, hardware platform, and other system-specific factors.

Additional information on the 2014v7.0 platform is available here: [**link**]

The data files are divided into several directories:
- 2014emissions contains emission inventories for the year 2014, including national CEMS emissions, nonroad inventories and onroad activity data, and point and nonpoint inventories
- ancillary_inputs contains general ancillary files (ge_dat), including those related to speciation, spatial allocation (gridding), and temporalization, as well as gridded ocean chlorine and volcanic mercury emissions files, which are included in the final model-ready emissions
- smoke_2014v7_0_platform.zip contains scripts to run SMOKE and utility programs
- spatial_surrogates contains 12km spatial surrogates for the US, Canada, and Mexico, and 4km spatial surrogates for the US

Section 4 of this README includes information about the modeling sectors used in the platform.
Section 5 of this README includes information about the inventories provided.
Section 6 of this README includes information about the ancillary (non-inventory) files included.

== 2. Requirements for processing emissions for air quality modeling ==

If you are only reviewing inventories and not developing emissions for air quality modeling, you do not need to install SMOKE or to follow the instructions below.
Instead, unzip the files with the data of interest and examine those and the corresponding reports that are provided.

If you plan to develop emissions for air quality modeling, SMOKE v4.5 is REQUIRED to process this case. SMOKE v4.5 includes additional updates and bug fixes as compared to versions 3.7 and 4.0, and so SMOKE v4.5 should be used when processing the 2014v7.0 set of emissions cases.

Python: Python version 2.6 or later, along with select Python libraries, is strongly recommended and may be required, because many of the helper scripts included in this package use Python. The Python scripts within this package reference '#!/usr/bin/env python' or '#!/usr/bin/env python2.6'; you may need to change this on your computing platform.

== 3. Installation of data files and scripts ==

This README covers the installation of the SMOKE inventories, scripts, and ancillary files used for the 2014v7.0 platform.

Choose an install directory on your system; we will refer to this directory as "INSTALL_DIR". To review/reproduce emissions for all sectors, unzip all the .zip files into INSTALL_DIR. The packages have subdirectories embedded within them, so it is important that all files be unpacked in the same place in order for the scripts to know where to find the inputs.

If you are only interested in reproducing or examining emissions for specific sectors, you may download and unzip only the data for those sectors from the 2014emissions directory, but for SMOKE processing you should also include the files in the ancillary_inputs and spatial_surrogates directories, and also the contents of smoke_2014v7_0_platform.zip.

Precompiled SMOKE executables and I/O API utilities are available in the SMOKE zip. The SMOKE zip also includes additional scripts and files that are not necessary to run SMOKE, but are useful for developing SMOKE inputs or processing SMOKE outputs.
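As an illustration of the "unzip everything into INSTALL_DIR" step, the following is a minimal sketch only, not one of the scripts shipped in the package. It assumes Python 3 (the packaged helper scripts target Python 2.6 or later), and the function name unpack_all and the example paths are hypothetical.

```python
import zipfile
from pathlib import Path


def unpack_all(download_dir, install_dir):
    """Extract every .zip found under download_dir into install_dir.

    Returns the sorted list of archive file names that were extracted.
    Hypothetical helper for illustration; not part of the SMOKE package.
    """
    install_path = Path(install_dir)
    install_path.mkdir(parents=True, exist_ok=True)
    extracted = []
    # sorted() materializes the archive list before anything is extracted,
    # so files unpacked during the loop are never picked up as new archives.
    for archive in sorted(Path(download_dir).rglob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            # Each package carries its own subdirectories, so extracting
            # everything into one root keeps the relative paths intact.
            zf.extractall(install_path)
        extracted.append(archive.name)
    return extracted
```

For example, unpack_all("/data/downloads", "/data/2014v7_install") would unpack every archive found below the (hypothetical) download directory into a single install directory, which is the layout the run scripts expect.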
All SMOKE inventories, scripts, and ancillary files used for the 2014v7.0 platform are provided, except emission factor tables for onroad processing via SMOKE-MOVES. The full set of emission factor tables is too large to store permanently on the FTP server and is instead furnished upon request.

MCIP meteorology data, which is used for the afdust, onroad, onroad_ca_adj, and biogenics (beis) sectors, is not included in the package. See MET_ROOT instructions below.

Prior to running SMOKE, you will need to edit the INSTALL_DIR (your install directory) and MET_ROOT (location of MCIP meteorology data) environment variable definitions in the "directory_definitions.csh" script located in each CASE/scripts directory. This script is sourced by each of the individual run scripts for each sector. Regarding the MCIP data, SMOKE only uses the GRIDCRO2D, METCRO2D, and METCRO3D files.

As of 8 Jan 2017, the scripts have been updated so that you no longer need to edit the value of NAMEBREAK_HOURLY in the ptegu run script.

== 4. Case description and instructions for each sector ==

2014fa_nata_cb6cmaq_14j emissions were processed for a 12km national grid (12US2) and use CB6 speciation for CMAQ Multi-Pollutant version 5.2. The CB6 for CMAQ mechanism is slightly different from the CB6 mechanism for the CAMx model from the 2011 emissions platform: it includes a new tracer species (SOAALK), naphthalene is now explicit in the mechanism (NAPH), and the XYL species is replaced by XYLMN (XYL minus naphthalene). The emissions from this package will also work with "base" CMAQ v5.2 in addition to the Multi-Pollutant version. CMAQ added support for the CB6 mechanism starting with CMAQ v5.1.

Speciation profiles for CB05, SAPRC, and the CB6 for CMAQ mechanisms are posted on the FTP server under the 2011v6.2 platform and 2011v6.3 platform. Those GSPRO files do not include newer speciation profiles that debuted in the 2014 platform, however.
GSPRO files including all new 2014 platform speciation profiles for these other mechanisms may be posted in the future. Furthermore, emissions for the 2014 platform have thus far only been processed for CMAQ. Unlike the 2011 platform package, this package does not include tools for converting the emissions to the format needed for CAMx modeling.

Emissions processing is split into "sectors". Each sector has its own run scripts for processing, with one (or more) run scripts per case. (See section 7 for information about the run script zips.) All sectors are US-only unless otherwise noted. The sectors are:

AFDUST: Particulate emissions from fugitive dust sources. This sector is processed in two steps. The first (Annual_afdust_12US2_*) processes the annual inventory, and the second (Annual_afdust_adj_12US2*) applies adjustments - transportable fraction and meteorologically-based - and outputs the adjusted emissions under the sector name "afdust_adj". The afdust scripts must be run in that order.

AG: Agricultural ammonia emissions.

AGFIRE: Area source agricultural burning emissions for states that submitted their own agricultural burning emissions to the NEI.

PTAGFIRE: Point source agricultural burning emissions for states that did not submit agricultural burning emissions to the NEI. Unlike the agfire sector, which is an annual inventory processed as an area sector, the ptagfire sector uses a daily point source inventory. This is a 'point' sector, and like all 'point' sectors, is processed via two scripts: the 'onetime' script, and the 'daily' script. The 'onetime' script must be run first. All emissions in this sector are low-level (no elevated / inline contribution).

BEIS: Biogenic emissions generated using the BEIS model, version 3.6.1.

CMV: Emissions from C1, C2, and C3 commercial marine sources, including ports and navigable waterways. Includes offshore C1/C2 marine emissions in the Atlantic and Pacific Oceans, and the Gulf of Mexico.
Does NOT include offshore C3 marine emissions; those are in the othpt sector.

NONPT: Area source emissions not included in other sectors.

NONROAD: Off highway mobile source emissions.

NP_OILGAS: Area source oil and gas emissions.

ONROAD: On highway mobile source emissions, excluding California. This sector is processed using SMOKE-MOVES with multiple scripts as described in section 4B.

ONROAD_CA_ADJ: On highway mobile source emissions, California only. This sector is processed using SMOKE-MOVES with multiple scripts as described in section 4B.

OTHAFDUST: Particulate emissions from fugitive dust sources in Canada. Just like with afdust, this sector is processed in two steps. The first (Annual_othafdust_12US2_*) processes the annual inventory, and the second (Annual_othafdust_adj_12US2*) applies adjustments - transportable fraction and meteorologically-based - and outputs the adjusted emissions under the sector name "othafdust_adj". The othafdust scripts must be run in that order. Fugitive dust emissions in Mexico are included in the othar sector and do not need the same transportable fraction and meteorological adjustments that the Canada fugitive dust emissions in othafdust do.

OTHAR: Area source emissions from Canada and Mexico, including mobile nonroad.

ONROAD_CAN: Mobile onroad source emissions from Canada.

ONROAD_MEX: Mobile onroad source emissions from Mexico. The onroad Mexico emissions inventory includes pre-speciated VOC emissions for the CB6-for-CAMx mechanism, so there is an extra script for this sector to convert those emissions to the CB6 mechanism needed for CMAQ. This extra script is called Monthly_onroad_mex_12US2_2014fa_nata_part2_combine.csh and uses the combine utility to perform the CB6-for-CMAQ conversion. The combine program is included, pre-compiled, in the SMOKE package along with pre-compiled SMOKE executables and I/O API utilities.
To help make the distinction between CB6-for-CAMx and CB6-for-CMAQ emissions, CB6-for-CAMx emissions use the sector name "onroad_mex_cb6orig". The CB6-for-CMAQ post-processing step creates emissions files with the final sector name "onroad_mex".

OTHPT: Point source emissions from Canada, Mexico, and offshore areas. Includes all offshore C3 commercial marine emissions. This is a 'point' sector, and like all 'point' sectors, is processed via two scripts: the 'onetime' script, and the 'daily' script. The 'onetime' script must be run first. All emissions in this sector are elevated (no low-level contribution).

PTEGU: Electric generating unit emissions. This sector incorporates CEM (Continuous Emissions Monitoring) hourly emissions for a majority of sources. This is a 'point' sector, and like all 'point' sectors, is processed via a 'onetime' script first, followed by a 'daily' script. For ptegu there are two 'daily' scripts for different months of the year: 'summer' (May through September), and 'winter' (October through April). For sources without hourly CEM emissions, summer and winter use different hourly temporalization, and so they are run with separate inputs. All emissions in this sector are elevated (no low-level contribution).

PTNONIPM: Point source emissions from industrial activities. This is a 'point' sector, and like all 'point' sectors, is processed via two scripts: the 'onetime' script, and the 'daily' script. The 'onetime' script must be run first.

PTFIRE_F: Point source emissions from year specific controlled burning and wild fires, flaming emissions only. Flaming fires are processed in the 'inline' format for CMAQ, and are all elevated (no low-level contribution). This is a 'point' sector, and like all 'point' sectors, is processed via two scripts: the 'onetime' script, and the 'daily' script. The 'onetime' script must be run first.

PTFIRE_S: Point source emissions from year specific controlled burning and wild fires, smoldering emissions only.
Smoldering fires are forced into layer 1, and so for the ptfire_s sector, there is only a low-level gridded emissions file and no elevated / inline file. This is a 'point' sector, and like all 'point' sectors, is processed via two scripts: the 'onetime' script, and the 'daily' script. The 'onetime' script must be run first.

PTFIRE_MXCA: Point source emissions from year specific controlled burning and wild fires in Canada and Mexico. Canadian fires are provided by Environment Canada in some months, and FINN (https://www2.acom.ucar.edu/modeling/finn-fire-inventory-ncar) in other months. Mexico fires are all from FINN. For this sector fires are processed in the 'inline' format for CMAQ, and are all elevated (no low-level contribution). This is a 'point' sector, and like all 'point' sectors, is processed via two scripts: the 'onetime' script, and the 'daily' script. The 'onetime' script must be run first.

PT_OILGAS: Point source oil and gas emissions. This is a 'point' sector, and like all 'point' sectors, is processed via two scripts: the 'onetime' script, and the 'daily' script. The 'onetime' script must be run first.

RAIL: Area source railway emissions.

RWC: Area source residential wood combustion emissions.

== 4B. Notes regarding onroad ==

Onroad emissions are processed using SMOKE-MOVES. The processing is split into multiple run scripts.

In the 2014v7.0 platform, SMOKE-MOVES inputs were prepared using the latest version of MOVES: MOVES2014a. As in the 2011 platform, speciation of VOC emissions is handled within the MOVES model. At the time MOVES2014a was run for this emissions case, MOVES included the CB6 for CAMx mechanism, but not the CB6 for CMAQ mechanism needed for the 2014 NATA modeling effort. So, SMOKE-MOVES is run for the CB6-for-CAMx mechanism, and then converted to CB6-for-CMAQ in an additional post-processing step.
To help make the distinction between CB6-for-CAMx and CB6-for-CMAQ emissions, CB6-for-CAMx emissions from SMOKE-MOVES use the sector names "onroad_cb6orig" and "onroad_ca_adj_cb6orig" instead of "onroad" and "onroad_ca_adj". The CB6-for-CMAQ post-processing step creates emissions files with the final sector names "onroad" and "onroad_ca_adj".

As described in the SMOKE online documentation, SMOKE-MOVES handles onroad emissions separately for four types of processes:
- On-network emissions (RatePerDistance, or RPD)
- Off-network emissions, fuel vapor venting (RatePerProfile, or RPP)
- Off-network emissions, extended idling (RatePerHour, or RPH)
- Off-network emissions, non-venting, non-extended idle (RatePerVehicle, or RPV)

For each of the two onroad sectors (onroad, onroad_ca_adj), there are separate run scripts for RPD, RPP, RPH, and RPV, plus a merge script that combines emissions from RPD, RPP, RPH, and RPV into a single emissions file per day, and a "part2" merge script that converts the emissions from CB6-for-CAMx to CB6-for-CMAQ. The CB6-for-CMAQ conversion is performed using the combine utility, which is included, pre-compiled, in the SMOKE package along with pre-compiled SMOKE executables and I/O API utilities.

These scripts may take a particularly long time to run, especially RPD. Therefore, consider running multiple RPD jobs in parallel, such as one job per quarter.

Onroad has been split into two sectors - onroad and onroad_ca_adj - in order to match SMOKE-MOVES annual emission totals with those provided by California. To do this, we split California into a separate sector, and run SMOKE-MOVES with a control factor file (CFPRO) which nudges the emissions so that the annual totals post-SMOKE-MOVES equal those provided by CARB.
CARB provided updated onroad emissions inventories for use in the 2014v7.0 platform, and we match their provided emissions at the county/SCC level, except that the CARB inventory does not distinguish different on-network road types (but does distinguish on-network and off-network emissions).

For the onroad sector, a CFPRO is provided in the ancillary_inputs/ge_dat_for_2014fa_nata_other.zip package which serves two functions. First, the CFPRO zeroes out refueling emissions in 52 Colorado counties. We zero out these emissions in order to prevent a double count with the ptnonipm sector, since that sector includes refueling emissions in these counties. Second, the CFPRO zeroes out diesel PM emissions for all non-diesel SCCs; this is related to the diesel PM species needed for NATA modeling.

== 4B1. DAYS_PER_RUN ==

SMOKE-MOVES runs more efficiently when it processes more than one day at a time. For example, Movesmrg can create one 7-day emissions file more quickly than it can create seven individual 1-day emissions files. To turn on this feature, set the DAYS_PER_RUN variable in the onroad scripts to the number of days you wish to run in a single Movesmrg instance. The default value is 1; the recommended value is 7.

If DAYS_PER_RUN > 1, after Movesmrg is run, the run scripts will use the I/O API utility m3xtract to split up the multi-day emissions file into single day (25-hour) emissions files.

Multi-day Movesmrg runs will never cross months. For example, if DAYS_PER_RUN = 7, then the last Movesmrg run of January will start on January 29th and end on January 31st (3 days), and the first Movesmrg run of February will start on February 1st and end on February 7th.

Using the multi-day Movesmrg functionality requires multi-day MCIP files. For example, if DAYS_PER_RUN = 7, your METCRO2D files must also be 7 days (169 hours) long.
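The month-bounded splitting described above can be sketched as follows. This is an illustration only and not the logic used in the run scripts: the function name movesmrg_chunks is hypothetical, and the hour count of 24 * N + 1 is extrapolated from the 25-hour (1-day) and 169-hour (7-day) figures quoted in this section.

```python
import calendar
from datetime import date


def movesmrg_chunks(year, month, days_per_run):
    """Split one month into consecutive runs of at most days_per_run days.

    Returns a list of (start_date, end_date, n_days, n_hours) tuples.
    n_hours = 24 * n_days + 1 is an assumption generalized from the
    25-hour single-day and 169-hour 7-day file lengths.
    """
    if days_per_run < 1:
        raise ValueError("days_per_run must be at least 1")
    last_day = calendar.monthrange(year, month)[1]
    chunks = []
    start = 1
    while start <= last_day:
        # Never run past the end of the month: the final chunk is shortened.
        end = min(start + days_per_run - 1, last_day)
        n_days = end - start + 1
        chunks.append((date(year, month, start), date(year, month, end),
                       n_days, 24 * n_days + 1))
        start = end + 1
    return chunks
```

Calling movesmrg_chunks(2014, 1, 7) yields five runs, the last covering January 29th through January 31st (3 days), which matches the example above. How the run scripts handle MCIP inputs for the shortened final run of a month is not covered by this sketch.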
These multi-day MCIP files should be stored in MET_ROOT/../[]_Xday/, where X = DAYS_PER_RUN (e.g. _7day for DAYS_PER_RUN = 7), and [] can be any string. For example, if the single day MCIP files are in /foo/foo/mcip_dir/, then 7-day MCIP files should be in /foo/foo/mcip_dir_7day/.

The primary drawback to using this multi-day Movesmrg functionality is an increase in memory usage.

== 4C. Sector merge ==

After all sectors have been processed, the Sector_merge script merges the low-level emissions from all sectors into a single CMAQ-ready emissions file per day.

Merged model-ready emissions will be output to:
INSTALL_DIR/$CASE/smoke_out/$CASE/12US2/$SPC/

Inline emissions and stack_groups files will be output to the same directory, except in subdirectories by sector name (e.g. .../$SPC/ptnonipm/).

== 5. Description of inventory packages ==

Inventories for 2014fa_nata_cb6cmaq_14j are included in the following files, all of which should be unpackaged in INSTALL_DIR:
2014emissions/2014fa_nata_cb6cmaq_14j_inputs_biogenics.zip
2014emissions/2014fa_nata_cb6cmaq_14j_inputs_cem.zip
2014emissions/2014fa_nata_cb6cmaq_14j_inputs_nonpoint.zip
2014emissions/2014fa_nata_cb6cmaq_14j_inputs_nonroad_part1.zip
2014emissions/2014fa_nata_cb6cmaq_14j_inputs_nonroad_part2.zip
2014emissions/2014fa_nata_cb6cmaq_14j_inputs_nonroad_part3.zip
2014emissions/2014fa_nata_cb6cmaq_14j_inputs_nonroad_part4.zip
2014emissions/2014fa_nata_cb6cmaq_14j_inputs_onroad.zip
2014emissions/2014fa_nata_cb6cmaq_14j_inputs_oth_part1.zip
2014emissions/2014fa_nata_cb6cmaq_14j_inputs_oth_part2.zip
2014emissions/2014fa_nata_cb6cmaq_14j_inputs_point.zip
2014emissions/2014fa_nata_cb6cmaq_14j_inputs_ptfire.zip

The "biogenics" package includes gridded land use, BIOSEASON, and biogenic emission factor files for input to BEIS 3.6.1. The latest version of BEIS uses different land use and emission factor inputs compared to prior versions, such as BEIS 3.14. The 2014v7.0 platform includes updated land use: BELD 4.1.
This package also includes a b3grd file for 12US2. This file is output by the Normbeis program, and is included in the package for those who do not wish to run Normbeis and generate their own b3grd using the provided land use and emission factor files. This b3grd file is not speciation-specific and can be used with any speciation mechanism.

The "cem" package includes the hourly CEM (Continuous Emissions Monitoring) emissions used by the ptegu sector. This is the same data as is available on EPA's Air Markets Program Data website (ampd.epa.gov), except that we have split the data into months and days as needed for our scripts.

The "inputs_nonpoint" package includes inventories for the following sectors: afdust, ag, agfire, cmv, nonpt, np_oilgas, rail, rwc.

The four "inputs_nonroad" packages include the inventories for the nonroad sector. The inventory was split into four .zips in order to reduce the size of each individual .zip. The following states are included in each .zip:
- part1: all states from Alabama to Iowa, alphabetically (FIPS 01 through 19)
- part2: all states from Kansas to Missouri, alphabetically (FIPS 20 through 29)
- part3: Montana through Ohio, and Vermont through Wyoming (FIPS 30-39 and 50-56)
- part4: Oklahoma through Utah, Puerto Rico, U.S. Virgin Islands (FIPS 40-49, 72-78)

The "inputs_onroad" package includes the activity data for the onroad and onroad_ca_adj sectors. It does not include the emission factor tables also required to run SMOKE-MOVES; these are not stored on the FTP site due to their large size and instead are furnished upon request. Emission factors for CB6 are available.

The two "inputs_oth" packages include the inventories for Canada, Mexico, and offshore emissions, except fires. These inventories were split into two .zips in order to reduce the size of each individual .zip. Part1 includes inventories used by the othafdust, othar, onroad_can, and othpt sectors, and also part of the onroad_mex inventory.
Part2 includes the rest of the onroad_mex inventory.

The "inputs_point" package includes the inventories for the following sectors: ptnonipm, ptegu, pt_oilgas, ptagfire. The ptnonipm inventory included in this package is not exactly identical to the one used for EPA's 2014 NATA version 1 air quality modeling. The packaged inventory includes corrections to stack parameters for some coke oven facilities. The result of this correction is that the associated coke oven emissions are more correctly classified as "elevated" sources and thus subject to plume rise in CMAQ, instead of being forced into Layer 1.

The "inputs_ptfire" package includes the inventories for the ptfire_f, ptfire_s, and ptfire_mxca sectors.

The "inputs" packages also include sample list (.lst) files, which list the inventories used by each sector. The run scripts automatically create new .lst files with the correct paths, so there is no need to edit these unless you are running SMOKE with your own scripts.

See section 4 for a description of each modeling sector.

== 6. Description of ancillary file packages ==

The following packages should be unpacked in INSTALL_DIR:
ancillary_inputs/ge_dat_for_2014fa_nata_gridding.zip
ancillary_inputs/ge_dat_for_2014fa_nata_other.zip
ancillary_inputs/ge_dat_for_2014fa_nata_speciation.zip
ancillary_inputs/ge_dat_for_2014fa_nata_temporal.zip
ancillary_inputs/ocean_chlorine.zip
ancillary_inputs/volcanic_mercury.zip
spatial_surrogates/surrogates_CONUS12_2010_CAN_MEX.zip
spatial_surrogates/surrogates_CONUS12_2014_v1_10oct2016.zip

The "surrogates" packages contain the gridding surrogates for 12US2, for US and Canada+Mexico. Surrogates for 4km are also provided for the US only in the spatial_surrogates directory.

The "gridding" package includes all SMOKE inputs related to gridding other than the spatial surrogates, including cross-references, surrogate descriptions, and gridded transportable fractions used in afdust_adj and othafdust_adj.
The "speciation" package includes speciation profiles, cross-references, and VOC-to-TOG conversion factors. This .zip includes files for the CB6 mechanism for CMAQ v5.2.

The "temporal" package includes temporal profiles and cross-references, including daily and hourly temporal profiles developed by the SMOKE program Gentpro for use in the rwc and ag sectors, respectively.

The "other" ge_dat package includes all other SMOKE ancillary files not included in the above packages, including:
- Inventory tables (INVTABLE)
- Lists of sources to exclude from CAP/HAP integration (NHAPEXCLUDE)
- SMOKE-MOVES ancillary files, including the reference county (MCXREF) and fuel month (MFMREF) cross-references, pollutant (MEPROC) and emission factor table (MRCLIST) lists, activity SCC to full SCC cross-references (SCCXREF), hourly speed profiles (SPDPRO), daily temperature data (METMOVES), and control factors for onroad_ca_adj and Colorado refueling (CFPRO)
- Other miscellaneous SMOKE inputs, such as the ARTOPNT, COSTCY, HOLIDAYS, MACTDESC, NAICSDESC, ORISDESC, PELVCONFIG, PSTK, SCCDESC
- Smkreport configuration files (REPCONFIG, all in ge_dat/repconfig/default)

The ocean_chlorine.zip package contains gridded ocean chlorine emissions, which are included in the sector merge. Unzip in INSTALL_DIR.

The volcanic_mercury.zip package contains gridded mercury emissions from volcanoes, which are included in the sector merge. Unzip in INSTALL_DIR. This file only includes mercury and is only needed for multi-pollutant modeling.

The run scripts (see section 7) are already set up to use the proper ancillary files and inventories for each sector and case.

== 7. Description of script packages ==

The smoke_2014v7_0_platform.zip package should be unpacked in INSTALL_DIR. This includes scripts and precompiled executables for running SMOKE in general, and for running the 2014fa_nata emissions modeling cases in particular.
The scripts in the 2011ek_cb6v2_v6_11g, 2017ek_cb6v2_v6_11g, 2017ek_cntl_cb6v2_v6_11g, and 2017ek_ussa_cb6v2_v6_11g subdirectories are the scripts you run directly in order to replicate our emissions. Separate script(s) are provided for each sector. See section 4 for information pertinent to each sector. In general, you edit the directory_definitions.csh file, in particular INSTALL_DIR and MET_ROOT, and then run each sector. Sector scripts are organized into subdirectories within CASE/scripts by sector category: biogenics, nonpoint, onroad, point, and merge.

For afdust sectors, run afdust/othafdust first, then afdust_adj/othafdust_adj.
For point sectors, run "onetime" first, and then "daily".
For onroad sectors, run RPV/RPD/RPH/RPP first, then the merge.

The scripts, programs, and other inputs in the following subdirectories are all "helper" scripts and inputs, and generally never need to be run directly:
camx, edss_tools, filesetapi, ioapi, scripts, smoke3.7

The other subdirectories include additional scripts that are not part of the SMOKE "core". These are covered in Section 7B.

== 7B. Other miscellaneous programs ==

The smoke_2014v7_0_platform.zip package also includes the following scripts for various pre-SMOKE and post-SMOKE applications:

movesmrg_report_postproc/: Onroad processing with SMOKE-MOVES does not use an input emissions inventory with county/SCC-level emissions already defined. Instead, we need SMOKE-MOVES outputs to serve as the emissions "inventory" for report and summary purposes. Therefore, Movesmrg is set up to create full county/SCC-level daily emissions reports. A script to aggregate the Movesmrg daily reports to more digestible annual and monthly reports by state, state/SCC, county, and county/SCC is provided here.

The 2011v6.3 package also includes several SMOKE tools:
ftp://ftp.epa.gov/EmisInventory/2011v6/v3platform/README_2011v6.3_package.txt (see Section 7B)
ftp://ftp.epa.gov/EmisInventory/2011v6/v3platform/smoke_2011v6_3_platform.zip