Reverse FF10 scripts
README

activityUpdates.py is a Python script that calls three MySQL scripts to load VMT, population, and hoteling data from FF10 files and populate NEI CDBs with them. The three MySQL scripts are Load_FF10_datasets.sql, Populate_CDBs_from_FF10.sql, and hotellinghours.sql. The Python script also performs some basic QA on the updated CDBs to confirm that the data were loaded correctly.

To run this script, Python and the libraries referred to at the top of the script need to be installed on the machine. The os, re, and glob libraries are built-in Python libraries, while the others would need to be obtained and installed, e.g., using pip install. The mysql.connector library is provided directly by Oracle (from https://www.mysql.com/products/connector/ at the time of writing).
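
Before running the script, it can help to confirm that the required libraries are importable. The following is a minimal sketch, not part of activityUpdates.py: the helper name missing_modules is hypothetical, and the list of module names is illustrative only (the authoritative list is the set of imports at the top of activityUpdates.py).

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except ModuleNotFoundError:
            # Raised for dotted names (e.g. "mysql.connector") when the
            # parent package itself is not installed.
            found = False
        if not found:
            missing.append(name)
    return missing


# Illustrative list only: replace it with the imports listed at the top of
# activityUpdates.py.
required = ["os", "re", "glob", "mysql.connector"]
print("Missing modules:", missing_modules(required))
```

Any names that are reported as missing would then need to be installed, e.g., using pip install.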

This script was designed to run on a *nix platform, such as Linux or macOS, so it would probably need to be modified slightly to work on a Windows machine. Specifically, it is designed to work with CDB folders stored on an external hard drive, relying on symbolic links to point the MySQL server to the folders. Symbolic links work differently in Windows, so the code associated with this would likely need to be modified or, alternatively, removed to work with folders stored directly in the MySQL data folder.
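
The symbolic-link approach described above can be sketched as follows. This illustrates the general technique on a *nix platform and is not the code from activityUpdates.py: the function name link_cdbs and the example paths are hypothetical, and the sketch assumes that the MySQL server is allowed to follow symbolic links and to read the external folders.

```python
import os


def link_cdbs(cdb_dir, mysql_data_dir):
    """Create one symbolic link in mysql_data_dir for each CDB folder in cdb_dir.

    Returns the names of the folders that were newly linked.
    """
    linked = []
    for name in sorted(os.listdir(cdb_dir)):
        source = os.path.abspath(os.path.join(cdb_dir, name))
        target = os.path.join(mysql_data_dir, name)
        if not os.path.isdir(source):
            continue  # skip stray files; only folders are treated as CDBs
        if os.path.lexists(target):
            continue  # leave any existing database folder or link untouched
        os.symlink(source, target, target_is_directory=True)
        linked.append(name)
    return linked


# Hypothetical example paths; adjust them to the local setup.
# link_cdbs("/Volumes/External/CDBs", "/usr/local/mysql/data")
```

On Windows, creating symbolic links typically requires elevated privileges or developer mode, which is one reason this part of the script would likely need to be changed there.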

In addition, this script was written to work with a CDB naming convention which includes the modification date at the end of the database folder name, e.g., c01001y2014_20160620. If this convention is not followed, the part of the script that recognizes and updates the date stamp would need to be removed or modified.
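
As an illustration of the naming convention, the sketch below recognizes a trailing date stamp and replaces it with a new one. It is not the code used in activityUpdates.py: the pattern assumes the stamp is an underscore followed by eight digits (YYYYMMDD), as in the example name above, and the names DATE_SUFFIX and restamp are hypothetical.

```python
import re
from datetime import date

# Assumed convention: "<name>_<YYYYMMDD>", e.g. c01001y2014_20160620.
DATE_SUFFIX = re.compile(r"^(?P<base>.+)_(?P<stamp>\d{8})$")


def restamp(folder_name, new_date):
    """Return folder_name with its trailing date stamp replaced by new_date.

    Returns None when the name does not follow the assumed convention.
    """
    match = DATE_SUFFIX.match(folder_name)
    if match is None:
        return None
    return f"{match.group('base')}_{new_date:%Y%m%d}"


print(restamp("c01001y2014_20160620", date(2017, 1, 5)))  # c01001y2014_20170105
print(restamp("c01001y2014", date(2017, 1, 5)))           # None
```

Folder names that return None here are the ones for which the date-stamp logic in the script would need to be removed or adapted.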

The steps to run the script are:

1) Update the paths in Load_FF10_datasets.sql to point to the FF10 files that contain the data to be imported.
2) Place the CDBs to be updated in their own folder external to the MySQL data directory (assuming that the code hasn't been modified to work with data stored directly in the MySQL directory).
3) Update the paths in the "Constants" section of activityUpdates.py to match the locations of the MySQL scripts, the hoteling FF10 file (used for QA purposes), the folder containing the CDBs to be updated, and the MySQL data directory. If desired, update the prefix used to name the QA output file (summaryStatsPrefix).
4) Verify that the MySQL username and password in the "Constants" section of activityUpdates.py are correct.
5) If necessary, update the expected number of files per CDB. When this script was originally run, each CDB was expected to contain 138 files. If the expected number is different, change the numFiles!=138 check on line 197 of activityUpdates.py. (This step is optional; the check is for QA purposes only and doesn't affect the creation of the CDBs.)
6) Run the script, e.g., using "python activityUpdates.py" from the command line or by launching from an IPython console.
7) Review the *_Merged.csv QA file to confirm that the values in the CDBs match the corresponding values in the FF10 files.
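
The file-count check mentioned in step 5 can be approximated outside the script with a sketch like the one below. It is an assumption about how such a check might look, not the code on line 197 of activityUpdates.py: the function name, the EXPECTED_NUM_FILES constant, and the example path are hypothetical, and 138 is simply the value from the original run.

```python
import os

EXPECTED_NUM_FILES = 138  # value from the original run; may differ in future


def cdbs_with_unexpected_file_count(cdb_dir, expected=EXPECTED_NUM_FILES):
    """Map each CDB folder name in cdb_dir to its file count when the count
    differs from `expected`. An empty result means every folder matched."""
    unexpected = {}
    for name in sorted(os.listdir(cdb_dir)):
        path = os.path.join(cdb_dir, name)
        if not os.path.isdir(path):
            continue  # only folders are treated as CDBs
        with os.scandir(path) as entries:
            num_files = sum(1 for entry in entries if entry.is_file())
        if num_files != expected:
            unexpected[name] = num_files
    return unexpected


# Hypothetical example path; adjust it to the folder that holds the CDBs.
# print(cdbs_with_unexpected_file_count("/Volumes/External/CDBs"))
```

Any folder reported here is worth inspecting by hand before relying on the QA output, since a missing or extra table file may indicate an incomplete update.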


