4.2 Overview of Input Data Sources for 2023
EPA received 1,425 MOVES county database (CDB) submittals from S/L/T agencies, along with 2023 vehicle registration data from S&P Global Mobility (SPGM), which EPA adapted to compute vehicle populations, vehicle age distributions, and fuel type fractions. FHWA provided vehicle-miles traveled (VMT) data by county and road type. EPA downloaded 2022 Travel Monitoring Analysis System (TMAS) traffic volume data and transformed them into VMT distributions by month. EPA also received 2022 and 2023 vehicle telematics data from StreetLight Data, Inc. (StreetLight), which EPA transformed into MOVES- and SMOKE-ready input files describing the distributions of vehicle speeds and the fractions of VMT by hour and day of week. The 2023 S/L/T CDBs, along with the vehicle registration data, informed an analysis that identified counties with similar fleet characteristics in order to create representative county groups. As in the 2020 NEI, the age distribution for each representative county is a population-weighted average of the member county age distributions. The 2023-specific vehicle speed and VMT distributions were used directly in SMOKE at the individual county level; therefore, these data were not considered in the representative county selection for MOVES runs. The CDBs and representative county groups are discussed in Sections 4.5 and 4.6.2.1, respectively.
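The population-weighted averaging of member county age distributions can be sketched as follows. This is a minimal illustration in Python, not EPA's processing code; the function name `weighted_age_distribution` and the example populations and age fractions are hypothetical.

```python
def weighted_age_distribution(members):
    """Population-weighted average of age distributions.

    members: list of (population, age_fractions) tuples, one per member
    county, where age_fractions is a list of fractions summing to 1.
    """
    total_pop = sum(pop for pop, _ in members)
    if total_pop <= 0:
        raise ValueError("county group has no vehicles")
    n_ages = len(members[0][1])
    return [
        sum(pop * fracs[age] for pop, fracs in members) / total_pop
        for age in range(n_ages)
    ]


# Hypothetical two-county group with three age bins (illustrative values only).
group = [(1000, [0.5, 0.3, 0.2]), (3000, [0.1, 0.3, 0.6])]
rep = weighted_age_distribution(group)
```

Because the larger county carries three times the weight, the resulting distribution sits closer to that county's age profile than a simple average would.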
4.2.1 EPA Default 2023 Vehicle Populations, VMT, Age Distributions, and Fuel Type Mix
In areas where no acceptable S/L/T data were available, the 2023 NEI onroad sector is based on 2023 vehicle population data from SPGM and 2023 VMT data from FHWA. To develop 2023 vehicle population data, EPA purchased from SPGM a snapshot of vehicles in operation across the nation as of July 1, 2023. EPA processed the vehicle registration summary to develop inputs for both MOVES (i.e., the CDB tables for vehicle population, age distributions, and fuel type fractions) and SMOKE (vehicle populations by Source Classification Code (SCC) and county). SPGM receives the registration records from each state’s Department of Motor Vehicles (DMV) and decodes vehicle identification numbers (VINs) to assign each vehicle a MOVES source type code. The database SPGM provided to EPA did not identify individual vehicles; rather, it summarized the population in each county by parameters including vehicle make, model, model year, gross vehicle weight (GVW) class, and other descriptive information. EPA performed data cleanup on SPGM’s MOVES source type assignments of the vehicle populations by reclassifying improperly categorized vehicles as follows:
- Combination Unit Trucks changed to Single-Unit, where body style was “STRAIGHT TRUCK.”
- Single-Unit Trucks changed to Other Bus, where the body style was “BUS NON SCHOOL.”
- Single-Unit Trucks changed to Combination Unit, where the body style was “TRACTOR TRUCK” and the gross vehicle weight rating (GVWR) was Class 6 or higher.
- Source types 41 (Other Bus) and 42 (Transit Bus) were reclassified/reapportioned as the original distinctions for these two were not meaningful.
- Several Light-Duty Truck source type designations were changed to Passenger Car (source type 21) where the make/model was any of the following: Acura ZDX, Buick Encore, Chrysler PT Cruiser, Honda Element, Kia Soul, Nissan Juke, Suzuki XL7, or Toyota Scion XB.
- Vehicles with missing model year data were removed.
- Vehicles with missing fuel type data were assigned to the most-common fuel type for the class and model year.
- Vehicles with fuel types presenting combinations that cannot be modeled in MOVES were updated as follows:
  - CNG-fueled Light-Duty source types were reassigned to Diesel
  - E85-fueled Heavy-Duty source types were reassigned to Gasoline
  - Non-gasoline fueled Motorcycles were reassigned to Gasoline
  - Fuel cell (fuel type 9, engine technology ID 40) Light-Duty source types were reassigned to Battery Electric (fuel type 9, engine technology type 30)
- EPA compared total school bus counts to another data source (School Bus Fleet Fact Book) and found that the totals agreed well. Based on this check, EPA elected not to reassign vehicles with a body style of “BUS SCHOOL” to the MOVES source type for school buses (source type 43). In addition, some school buses are known to be used for work-day transportation (e.g., agricultural work), and these uses do not fit the MOVES definition of source type 43 (vehicles used for pupil transportation).
- EPA reapportioned Short-Haul vs. Long-Haul fractions for Single-Unit Trucks (source types 52 vs. 53) and Combination Unit Trucks (source types 61 vs. 62) according to Table 4.2, because this distinction is based on usage patterns and is not captured in registration data. The approach for Single-Unit Trucks was unchanged from the 2020 NEI and earlier. The approach for Combination Unit Trucks was updated for the 2023 NEI using the 2021 federal Vehicle Inventory and Use Survey (VIUS2021) field “CABDAY,” where a value of “Day Cab” was interpreted as short-haul and “Sleeper Cab” as long-haul.
| Truck Type | Region Name | Fraction Short haul | Fraction Long haul |
|---|---|---|---|
| Single-Unit (A) Source Types 52/53 | Midwest | 0.807 | 0.193 |
| Single-Unit (A) Source Types 52/53 | Northeast | 0.919 | 0.081 |
| Single-Unit (A) Source Types 52/53 | South | 0.860 | 0.140 |
| Single-Unit (A) Source Types 52/53 | West | 0.882 | 0.118 |
| Combination (B) Source Types 61/62 | Individual State-level (C) (National averages shown) | 0.523 (National Average) | 0.477 (National Average) |
- (A) Single-Unit fractions were the same as in the 2020 NEI and earlier (based on the Freight Analysis Framework).
- (B) Combination Unit fractions relied on state-level VIUS2021 data, new for the 2023 NEI.
- (C) State-level Combination Truck short-haul/long-haul fractions may be found in VIUS2021_state_cabday.csv.
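As an illustration of how the Table 4.2 fractions could be applied, the sketch below splits a county's Single-Unit Truck population into short-haul (source type 52) and long-haul (source type 53). The function name, the region lookup, and the example population are assumptions made for illustration. Combination Unit Trucks could be handled analogously using state-level fractions.

```python
# Short-haul fractions for Single-Unit Trucks by region, from Table 4.2.
SINGLE_UNIT_SHORT_HAUL = {
    "Midwest": 0.807,
    "Northeast": 0.919,
    "South": 0.860,
    "West": 0.882,
}


def split_single_unit(total_population, region):
    """Split a Single-Unit Truck total into source types 52 and 53."""
    short = total_population * SINGLE_UNIT_SHORT_HAUL[region]
    # The long-haul share is the remainder, so the total is preserved.
    return {52: short, 53: total_population - short}


# Hypothetical county in the South region with 1,000 single-unit trucks.
split = split_single_unit(1000, "South")
```

Computing long-haul as the remainder, rather than multiplying by the long-haul fraction separately, keeps the two source types summing exactly to the original total.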
Following the data cleanup steps to reassign MOVES source types, EPA assigned fuel type IDs and engine technology IDs according to SPGM’s fuel type records as shown in Table 4.3.
| SPGM Fuel Type | fuelTypeID | engTechID |
|---|---|---|
| GAS, GASOLINE, ELECTRIC AND GAS HYBRID, G | 1 | 1 |
| DIESEL, ELECTRIC AND DIESEL HYBRID, D | 2 | 1 |
| COMPRESSED NATURAL GAS, LIQUID NATURAL GAS | 3 | 1 |
| FLEXIBLE, ETHANOL | 5 | 1 |
| ELECTRIC | 9 | 30 |
| HYDROGEN FUEL CELL | 9 | 40 |
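The mapping in Table 4.3 is a direct lookup. The sketch below assumes that each comma-separated entry in the "SPGM Fuel Type" column is an exact label in the SPGM data; the function name and the handling of unrecognized labels (returning `None`) are illustrative choices, not part of the documented method.

```python
# SPGM fuel type label -> (fuelTypeID, engTechID), transcribed from Table 4.3.
SPGM_FUEL_MAP = {
    "GAS": (1, 1),
    "GASOLINE": (1, 1),
    "ELECTRIC AND GAS HYBRID": (1, 1),
    "G": (1, 1),
    "DIESEL": (2, 1),
    "ELECTRIC AND DIESEL HYBRID": (2, 1),
    "D": (2, 1),
    "COMPRESSED NATURAL GAS": (3, 1),
    "LIQUID NATURAL GAS": (3, 1),
    "FLEXIBLE": (5, 1),
    "ETHANOL": (5, 1),
    "ELECTRIC": (9, 30),
    "HYDROGEN FUEL CELL": (9, 40),
}


def assign_fuel_ids(spgm_fuel_type):
    """Return (fuelTypeID, engTechID), or None if the label is not in Table 4.3."""
    return SPGM_FUEL_MAP.get(spgm_fuel_type.strip().upper())
```

For example, `assign_fuel_ids("Electric and Gas Hybrid")` returns `(1, 1)`, reflecting that hybrids are grouped with their combustion fuel.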
EPA also assigned a regulatory class ID, based mostly on SPGM’s reported GVWR. EPA assigned all motorcycles to regClassID 10 and all passenger cars to regClassID 21, and assigned light-duty trucks to regClassID 30 if they had either (1) a GVWR of 1, 1a, 2, 2a, or blank, or (2) a fuel type of E-85 (fuelTypeID 5). EPA assigned heavy-duty regulatory class IDs according to Table 4.4.
| sourceTypeID | fuelTypeID | GVWR | Assigned regClassID |
|---|---|---|---|
| 54 | Any | 1, 2, 3 | 41 |
| 54 | Any | 4, 5 | 42 |
| 54 | Any | 6, 7 | 46 |
| 54 | Any | 8, 9 | 47 |
| 41, 42, 43, 51, 52 | Any | 3 | 41 |
| 41, 42, 43, 51, 52 | Any | 4, 5 | 42 |
| 41, 42, 43, 51, 52, 61, 62 | Any | 6, 7 | 46 |
| 41, 43, 51, 52, 61, 62 | Any | 8 | 47 |
| 42 | 1 | 8 | 47 |
| 42 | 2, 3, 9 | 8 | 48 |
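Table 4.4 can be read as an ordered rule list. The sketch below encodes each row as a tuple and returns the first match. It assumes GVWR classes are represented as integers and returns `None` for combinations the table does not list. Both choices, and the function name, are illustrative.

```python
# Each rule: (source types, fuel types or None for "Any", GVWR classes, regClassID).
REG_CLASS_RULES = [
    ({54}, None, {1, 2, 3}, 41),
    ({54}, None, {4, 5}, 42),
    ({54}, None, {6, 7}, 46),
    ({54}, None, {8, 9}, 47),
    ({41, 42, 43, 51, 52}, None, {3}, 41),
    ({41, 42, 43, 51, 52}, None, {4, 5}, 42),
    ({41, 42, 43, 51, 52, 61, 62}, None, {6, 7}, 46),
    ({41, 43, 51, 52, 61, 62}, None, {8}, 47),
    ({42}, {1}, {8}, 47),
    ({42}, {2, 3, 9}, {8}, 48),
]


def assign_heavy_duty_reg_class(source_type, fuel_type, gvwr):
    """Return the regClassID from Table 4.4, or None if no row matches."""
    for source_types, fuel_types, gvwr_classes, reg_class in REG_CLASS_RULES:
        fuel_ok = fuel_types is None or fuel_type in fuel_types
        if source_type in source_types and gvwr in gvwr_classes and fuel_ok:
            return reg_class
    return None
```

Note that transit buses (source type 42) at GVWR class 8 are the only case where the fuel type changes the result: gasoline maps to regClassID 47, while diesel, CNG, and electric map to 48.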
After assigning fuelTypeID, engTechID, and regClassID, EPA transformed the data into MOVES-ready, county-level input tables: SourceTypeYear, SourceTypeAgeDistribution (in MOVES5 format, covering ageID 0 to 40), and AVFT (Alternative Vehicle and Fuel Technology). EPA compared state-level population totals to the FHWA MV-1 table and to state submittals and made further adjustments to the SPGM populations for Motorcycles and Light-Duty Vehicles.
Of the 30 S/L/T agencies that participated in the data submittal process, 24 provided both light-duty vehicle (LDV) populations (MOVES ‘SourceTypeYear’ table) and age distributions (MOVES ‘SourceTypeAgeDistribution’ table) based on 2023 registration data, which was a requirement for comparison with the 2023 SPGM data. The other agencies were excluded from the adjustment factor analysis because they provided only one type of local data (e.g., population but no age distribution) or data with outdated (e.g., year 2020) or unknown registration data draw dates. For the 24 areas that could be included in the analysis, EPA first combined the populations of passenger cars (source type 21) and light-duty trucks (source types 31 and 32) at the county level to remove the uncertainty in how VIN decoding classifies personal passenger vehicles as cars vs. light-duty trucks. EPA then allocated each county’s total LDV population to vehicle model years for comparison with SPGM and found that the SPGM populations for 2023 were 10.7 percent higher than the state data. As in prior years’ comparisons, EPA found that the discrepancies between the SPGM and state data for 2023 are larger for older vehicles. Table 4.5 shows the LDV adjustments EPA made to the 2023 SPGM data prior to its use in the NEI. EPA calculated adjustment factors representing the fraction of the population remaining in each model year, with two exceptions: model years 2015 through 2023 received no adjustment, and model years 1993 and earlier received a capped adjustment equal to the adjustment for model year 1994. The adjustment factors in Table 4.5 were applied to the 2023 SPGM data to create the EPA Default set of populations and age distributions for the NEI.
| Model Year | LDV Adjustment Factor |
|---|---|
| pre-1994 | 0.703 |
| 1994 | 0.703 |
| 1995 | 0.725 |
| 1996 | 0.755 |
| 1997 | 0.751 |
| 1998 | 0.779 |
| 1999 | 0.789 |
| 2000 | 0.792 |
| 2001 | 0.804 |
| 2002 | 0.811 |
| 2003 | 0.818 |
| 2004 | 0.823 |
| 2005 | 0.836 |
| 2006 | 0.850 |
| 2007 | 0.854 |
| 2008 | 0.856 |
| 2009 | 0.906 |
| 2010 | 0.892 |
| 2011 | 0.909 |
| 2012 | 0.919 |
| 2013 | 0.919 |
| 2014 | 0.937 |
| 2015 - 2023 | 1 |
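Applying Table 4.5 amounts to multiplying each model year's SPGM LDV population by its factor, with the cap for model years 1993 and earlier and no adjustment for 2015 and later. The sketch below transcribes the table. The function names and the example populations are hypothetical.

```python
# LDV adjustment factors by model year, transcribed from Table 4.5.
LDV_ADJUSTMENT = {
    1994: 0.703, 1995: 0.725, 1996: 0.755, 1997: 0.751, 1998: 0.779,
    1999: 0.789, 2000: 0.792, 2001: 0.804, 2002: 0.811, 2003: 0.818,
    2004: 0.823, 2005: 0.836, 2006: 0.850, 2007: 0.854, 2008: 0.856,
    2009: 0.906, 2010: 0.892, 2011: 0.909, 2012: 0.919, 2013: 0.919,
    2014: 0.937,
}


def ldv_adjustment_factor(model_year):
    """Return the Table 4.5 factor for a model year."""
    if model_year >= 2015:
        return 1.0                    # 2015-2023: no adjustment
    if model_year <= 1994:
        return LDV_ADJUSTMENT[1994]   # pre-1994 capped at the 1994 value
    return LDV_ADJUSTMENT[model_year]


def adjust_ldv_population(population_by_model_year):
    """Scale SPGM LDV populations (dict: model year -> count) by Table 4.5."""
    return {
        model_year: count * ldv_adjustment_factor(model_year)
        for model_year, count in population_by_model_year.items()
    }


# Hypothetical county LDV counts by model year (illustrative values only).
adjusted = adjust_ldv_population({1985: 100, 2000: 1000, 2020: 5000})
```

Because older model years are scaled down more than newer ones, the adjustment changes the shape of the age distribution as well as the total population.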
EPA also compared motorcycle (MC) populations between SPGM and the pooled submittals from 24 agencies and found that the agency submittals were approximately half (54 percent) of the population reported by SPGM for the same group of counties. EPA then compared the SPGM MC population summarized by state to FHWA’s MV-1 table for all states except Idaho and New York, which had data quality problems in MV-1 at the time of analysis (early June 2025). Like the pooled NEI submittals, the MV-1 table showed lower MC populations than SPGM. The national total MC population from MV-1 (excluding ID and NY) was just 63 percent of the SPGM value, which would have meant a 37 percent reduction if EPA had used a national adjustment to correct motorcycles in the SPGM set. Instead, EPA opted to use the state resolution of MV-1 to correct SPGM MC populations by state; the resulting reductions ranged from zero to 60 percent. EPA opted to use MV-1 for the correction not only because it provides complete data for all states, but also because the comparison by model year to the 24 agencies showed no trend in the differences by age. EPA attempted to compare MV-1 populations to SPGM for heavy-duty vehicles (HDVs), but it was not possible to remove light-duty trucks from the MV-1 “Truck” category, and therefore no comparison was possible. States without any CDB submittals received EPA Default populations and age distributions based on the adjusted SPGM data, and some state submittals were overridden on a case-by-case basis. Section 4.3 lists the submitted data that were accepted vs. replaced with EPA age distribution data for the 2023 NEI.
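The state-level motorcycle correction can be thought of as scaling each state's SPGM population by the ratio of the MV-1 total to the SPGM total. The sketch below assumes the ratio is capped at 1.0 (no upward adjustment). That cap is consistent with the reported range of zero to 60 percent reductions but is not stated explicitly in the text. The function name and example values are hypothetical.

```python
def motorcycle_state_factor(mv1_population, spgm_population):
    """Ratio of MV-1 to SPGM motorcycle totals for a state, capped at 1.0."""
    if spgm_population <= 0:
        return 1.0  # nothing to scale; leave the state unchanged
    return min(1.0, mv1_population / spgm_population)


# Hypothetical state totals: MV-1 reports 63 motorcycles per 100 in SPGM.
factor = motorcycle_state_factor(63_000, 100_000)
reduction_percent = (1.0 - factor) * 100.0
```

With these illustrative totals the factor is about 0.63, which corresponds to the roughly 37 percent reduction cited for the national comparison.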
In addition to removing the MCs and older LDVs from the SPGM data, EPA also removed outlier age distributions that showed excessively “new” fleets. In past inventories, registration data have reflected an unusually young fleet when the headquarters of a leasing or rental company owns a large fraction of the vehicles in the county. EPA dealt with these cases by preferentially excluding them from the representative county calculation of age distribution. For counties that were the only county in their representative county group, EPA substituted an age distribution for the same source type from another county in the same metropolitan statistical area (MSA). EPA believes that these new vehicles do not represent the county’s operating vehicle fleet, and this clean-up step avoids regions of artificially low LDV emissions in the NEI. The list of counties receiving this substitution was the same as for the 2022v2 platform and is shown in Table 4.6.
| County ID | County Name, State | 2023 NEI Substitution |
|---|---|---|
| 8035 | Douglas County, CO | Substitute source types 21 and 32 age distributions from county 8031 in the same MSA (Denver-Aurora-Lakewood, CO) |
| 40109 | Oklahoma County, OK | Substitute source type 21 age distribution from county 40027 in the same MSA (Oklahoma City, OK) |
| 40143 | Tulsa County, OK | Substitute source types 21 and 32 age distributions from county 40131 in the same MSA (Tulsa, OK) |
In some states where submitted vehicle population data were accepted for the NEI, the relative populations of cars vs. light-duty trucks were reapportioned using the county-specific percentages from the SPGM data, while retaining each county’s total light-duty vehicle population from the submittals. This keeps the categorization of cars versus light trucks consistent with EPA Default methods and avoids incorrectly categorizing the light-duty fleet as mostly passenger cars. The S/L agencies receiving reapportioned cars vs. light-duty trucks included TX, NH, PA, and RI.
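The car vs. light-duty truck reapportionment can be sketched as follows: the submitted county total is kept, and the SPGM shares determine how it is divided among source types. Treating source types 31 and 32 as separate shares, the fallback when SPGM has no light-duty vehicles, and the function name are all assumptions made for illustration.

```python
LDV_SOURCE_TYPES = (21, 31, 32)  # passenger cars and light-duty trucks


def reapportion_ldv(submitted, spgm):
    """Redistribute a submitted LDV total using SPGM source type shares.

    submitted, spgm: dicts mapping source type ID -> county population.
    """
    submitted_total = sum(submitted.get(st, 0.0) for st in LDV_SOURCE_TYPES)
    spgm_total = sum(spgm.get(st, 0.0) for st in LDV_SOURCE_TYPES)
    if spgm_total <= 0:
        # No SPGM shares available: keep the submitted split (assumed fallback).
        return {st: submitted.get(st, 0.0) for st in LDV_SOURCE_TYPES}
    return {
        st: submitted_total * spgm.get(st, 0.0) / spgm_total
        for st in LDV_SOURCE_TYPES
    }


# Hypothetical county: the submittal is car-heavy, SPGM is truck-heavy.
result = reapportion_ldv(
    {21: 700, 31: 250, 32: 50},
    {21: 4000, 31: 5000, 32: 1000},
)
```

In this example the submitted total of 1,000 vehicles is unchanged, but the split moves from 70 percent cars to 40 percent cars, matching the SPGM shares.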
4.2.2 EPA Default 2023 Month VMT Distributions
In areas where S/L/T data were not available, month VMT distributions were derived from Travel Monitoring Analysis System (TMAS) traffic volume data for the 2022v1 platform and carried over to both the 2022v2 platform and the 2023 NEI. EPA downloaded the TMAS dataset for 2022 from https://geodata.bts.gov/datasets/travel-monitoring-analysis-system-class/about. The data contain station-level, hourly traffic volume counts with detail available for the 13 FHWA vehicle classes, which EPA mapped to MOVES source types as described in the 2022v1 documentation. The TMAS data analysis removed incomplete stations as needed and transformed traffic volumes into state-level month VMT fractions.
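The final step, turning traffic volumes into state-level month fractions, can be sketched as below. This simplified illustration treats summed station volumes as a proxy for VMT and omits the station completeness screening and vehicle class mapping described above. The record layout and function name are hypothetical.

```python
from collections import defaultdict


def month_vmt_fractions(records):
    """Compute month fractions by state from (state, month, volume) records."""
    totals = defaultdict(lambda: defaultdict(float))
    for state, month, volume in records:
        totals[state][month] += volume

    fractions = {}
    for state, by_month in totals.items():
        annual = sum(by_month.values())
        if annual > 0:
            # Each month's share of the state's annual volume.
            fractions[state] = {
                month: volume / annual
                for month, volume in sorted(by_month.items())
            }
    return fractions


# Hypothetical volumes from two stations in one state (illustrative only).
example = month_vmt_fractions([
    ("CO", 1, 100.0),
    ("CO", 2, 300.0),
    ("CO", 1, 100.0),
])
```

In practice the input would cover all twelve months for every retained station, so that each state's fractions sum to one over the full year.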
4.2.3 EPA Default 2023 Vehicle Speeds and Hour/Day VMT Distributions
EPA purchased county-level telematics data from StreetLight to characterize vehicle speed profiles and VMT temporal distributions for 2022 and for the months of 2023 (January through May) that were available at the time of purchase. EPA purchased these data because the prior data from StreetLight were specific to 2020 and reflected strong variation in monthly speeds that tracked with stay-at-home orders. The 2022 and 2023 data reflect the “new normal” patterns of VMT and hourly distributions of speeds by road type and county across the US. EPA transformed the summaries of vehicle distance, time, and speed to create temporal profiles of VMT and speed distributions by road type, month, day of week, and hour. Vehicle types included personal, commercial medium-duty, and commercial heavy-duty. Although EPA received and examined the data by month, the distributions did not vary significantly by month; EPA therefore opted to prepare annual average speed distributions by hour, day type, source type, and road type, which was standard practice before the 2020 NEI.
StreetLight’s data sources have evolved over time; currently, StreetLight uses connected vehicle data for personal vehicles and in-vehicle Global Positioning System (GPS) data for medium- and heavy-duty commercial trucks. These data are aggregated such that personal information is not revealed. StreetLight performs data processing in-house to pin the location and time data to a roadway network and calculate summaries of travel distance, travel time, and speed.
In areas where S/L/T data were not available or were of lower quality or resolution than the StreetLight data, EPA used the telematics-based StreetLight data for the hour and day VMT fractions as well as the speed distributions. Because the StreetLight dataset did not cover Alaska, Hawaii, the U.S. Virgin Islands, or Puerto Rico, EPA made substitutions using data from other areas: Montana statewide averages were assigned to Alaska, and StreetLight national averages were assigned to Hawaii, the U.S. Virgin Islands, and Puerto Rico. In addition to the substitutions for these outlying areas, EPA performed other gap filling described in Section 4.5.4.
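The substitutions for areas outside StreetLight coverage can be expressed as a simple fill step. In the sketch below, the two-letter area codes, the profile representation (a list of fractions), and the function name are assumptions for illustration; the additional gap filling in Section 4.5.4 is not covered.

```python
def fill_outlying_areas(state_profiles, national_profile):
    """Return a copy of state_profiles with substitutes for uncovered areas."""
    filled = dict(state_profiles)
    filled["AK"] = state_profiles["MT"]   # Alaska uses the Montana statewide average
    for area in ("HI", "VI", "PR"):       # these areas use the national average
        filled[area] = national_profile
    return filled


# Hypothetical two-bin profiles (illustrative values only).
profiles = {"MT": [0.2, 0.8], "CO": [0.3, 0.7]}
national = [0.25, 0.75]
filled = fill_outlying_areas(profiles, national)
```

The input dictionary is left unmodified, so the original StreetLight-derived profiles remain available for comparison with the filled set.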