Cleaning NAIC Data from SNL

I work extensively with data from the National Association of Insurance Commissioners (NAIC) and this data is provided via a subscription to SNL Financial. If you also use the data from SNL, you are likely familiar with the struggles of making this data ready for analysis. I have written a series of Excel Add-Ins that automate the data collection, manipulation, and formatting over multiple years and exports the data into CSV format.

A ZIP file is linked on the right. This file includes:

  • User Guide: how to install and use the Add-Ins

  • NAIC Cleaner: the macro file that executes all the cleaning

  • naic_cleaner_annual: a Stata .do file that allows you to append annual financials into "long" format

  • naic_cleaner_quarterly: a Stata .do file that allows you to append quarterly financials into "long" format

Begin with the User Guide, which should guide you through the installation and use of each of the other files. I won't be providing any tech support, but if you discover bugs and/or manage to improve the code, please share with me!

DOWNLOAD NAIC DATA CLEANING ZIP FILE HERE

Current version: 3.1 (September 2020)

  • Adjusted data collection form to automatically update years (no need to manually update macros each time the new NAIC data comes out).

  • Added a custom toolbar which should appear in the "Add-ins" tab in the ribbon.

  • Abandoned the "Append data" (within Excel) macro. CSV files can be appended with the included Stata code.

  • Improvements and bug fixes for auto-renaming of variables.

Previous versions:

3.0 (April 2019)

  • Added 2017 and 2018 to data collection form.

  • Added an option not to automatically rename variables in the data collection form.

  • Wrote code identifying when two variables are given the same name, and asks the user to choose a new variable name.

2.4 (September 2017)

  • Added 2016 to data collection form.

  • Automatically saves the file before collecting the additional years.

  • Added automatic variable naming, by shortening and concatenating the variable names and details from SNL.

  • Creates a "Codebook" to map auto-generated variable names to the original SNL naming convention (saved in a folder the user specifies).

  • Amended Stata annual code to delete temporary .dta files (have not done this for quarterly).

2.3 (November 2016)

  • Added macro to reset years - if an error occurs the global variables are lost. Click the "Reset Years" button to set these global variables again.

  • Deleted custom ribbon from v2.1 as it was mapped to my file structure. Added instructions for creating your own custom ribbon to README file. Highly recommended that you do this.

  • Added Stata .do files to append both annual and quarterly data from CSV files in Stata.

  • Append bug from 2.2 remains.

2.2 (March 2016)

  • Included version tracking.

  • Improved appending capability (previously would only append up to ~32k rows in Master sheet).

  • Bug: Encounters 'Error 1004' when appending large datasets. The append still occurs properly, there's just an error before the copied sheets can be deleted. Manually delete all sheets except "Master" and then run the "Save as CSV" with only the Master sheet in the workbook.

2.1 (February 2016)

  • Added user form with pull-down menus for years to collect.

  • Added ability to collect quarterly data.

  • Added ability to append years in a "Master" sheet.

  • Added optional custom ribbon.

1.x (2015 and earlier)

  • Annual data only.

  • Enter start and end years via Message Box prompts.

  • No option to append years in Excel - must append using SAS, Stata, etc.