As of August 21, 2018, the dataset created for this project consists of 9289 records of degrees conferred by the University of New Mexico between 1912 and 1953. This time period was selected because lists of degrees conferred were available for these years from the UNM course catalogs and Board of Regents meeting minutes. The course catalog scans were processed using PDFMiner, Notepad ++, and MS Excel. Degree recipients' sex was coded based on first and middle names, and the presence of suffixes (Jr., II, III etc.); if the names were ambiguous or unknown, the sex was coded as "U."
In the future we hope to expand the dataset by adding data for other years.
The dataset can be downloaded from UNM's Digital Repository and may be used by other researchers.
One question that many people have asked about this project is, what about data on race and ethnicity for UNM graduates?
Unfortunately, the course catalogs used as the primary source for our dataset did not contain this information in a form that could be easily converted into current terms. UNM's Office of Institutional Analytics has interactive data visualizations for degrees conferred from 2013 to 2017 (as well as statistics on enrollment, financial aid, and faculty & staff) on their web page.
Available in the course catalogs in most of the years covered by our dataset are two tables of aggregate geographical data for currently enrolled students; Table A shows the distribution of students from US states and foreign countries in 1938, and Table B shows the distribution of students by New Mexico county. There is also an enrollment summary broken out by sex.
Potential sources of error in the data:
1. The original lists of degree recipients were manually typed by various individuals beginning over 100 years ago, and their accuracy was not verified.
2. The sex of degree recipients was coded for this project based on first and middle names. These judgments could be in error in an unknown percentage of cases.
3. The optical character recognition of PDFs likely impacts the recorded spelling of recipient names, particularly surnames, in an unknown percentage of cases.
4. Mining of PDF data. Due to the script's interpretation of text block placement on pages, majors or minors could have been assigned to the wrong individual in an unknown percentage of cases.
Every effort was made to check for and correct these errors as far as possible. If you find data errors, please let us know and we will be happy to correct them and update our materials to reflect the changes.