Summary: In this notebook, we create the BUNMD file by unifying the cleaned and "condensed" Numident files (deaths, applications, and claims) into a single file.
The function unify_numident combines the information from the deaths, applications, and claims files into a file with one record per person: the Berkeley Unified Numident Mortality Database (BUNMD).
The "create_weights" functions (create_weights_bunmd and create_weights_bunmd_complete) construct post-stratification weights to the Human Mortality Database (HMD) for BUNMD Sample 1 and Sample 2.
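library(censocdev)
library(ipumsr)
library(data.table) ## provides fread() and fwrite()
library(dplyr)      ## provides %>%, left_join(), and select()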
## read in cleaned and "condensed" numident files
claims <- fread("/censoc/data/numident/3_numident_files_cleaned_and_condensed/claims_condensed.csv")
deaths <- fread("/censoc/data/numident/3_numident_files_cleaned_and_condensed/deaths_cleaned.csv")
applications <- fread("/censoc/data/numident/3_numident_files_cleaned_and_condensed/apps_condensed.csv")

## combine records into one file
bunmd <- create_bunmd(claims = claims, applications = applications, deaths = deaths)

## construct weights for BUNMD
bunmd <- create_weights_bunmd(bunmd)
bunmd <- create_weights_bunmd_complete(bunmd)
## create string variables
ddi_extract <- read_ipums_ddi("/data/josh/CenSoc/censoc_data/ipums_1940_extract/fullcount.ddi.xml")

## extract geo codes
geo_codes <- ipums_val_labels(ddi_extract, BPLD)

## join geocodes for birthplace
bunmd <- bunmd %>%
  left_join(geo_codes %>% select(bpl = val, bpl_string = lbl)) %>%
  left_join(geo_codes %>% select(socstate = val, socstate_string = lbl))

## write out BUNMD file
fwrite(bunmd, "/censoc/data/numident/4_berkeley_unified_mortality_database/bunmd.csv")
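To illustrate what a post-stratification weight is, here is a minimal base R sketch. It is not the implementation of create_weights_bunmd: the toy data, the column names (dyear, sex, age, hmd_deaths), and the choice of cells are all assumptions made for illustration. The idea is that each record in a cell receives the ratio of the HMD death count to the sample count in that cell, so the weighted sample reproduces the HMD totals.

```r
# Hypothetical toy data: person-level sample records and HMD death counts.
# Column names and cell definitions are assumptions, not the package's own.
bunmd_toy <- data.frame(
  dyear = c(1990, 1990, 1990, 1991, 1991),
  sex   = c(1, 1, 2, 1, 2),
  age   = c(70, 70, 70, 71, 71)
)
hmd_toy <- data.frame(
  dyear      = c(1990, 1990, 1991, 1991),
  sex        = c(1, 2, 1, 2),
  age        = c(70, 70, 71, 71),
  hmd_deaths = c(10, 6, 4, 8)
)

# Count sample records in each (death year, sex, age) cell.
cell_counts <- aggregate(
  n_sample ~ dyear + sex + age,
  data = transform(bunmd_toy, n_sample = 1),
  FUN = sum
)

# Weight for a cell = HMD deaths in the cell / sample records in the cell.
cells <- merge(cell_counts, hmd_toy, by = c("dyear", "sex", "age"))
cells$weight <- cells$hmd_deaths / cells$n_sample

# Attach the cell weight to every person record.
bunmd_weighted <- merge(
  bunmd_toy,
  cells[, c("dyear", "sex", "age", "weight")],
  by = c("dyear", "sex", "age"),
  all.x = TRUE
)
```

By construction, the weights in this toy example sum to the total number of HMD deaths across the cells that appear in the sample.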
Recap of the decision rules we used to create the BUNMD:
sex: When available, select the person’s sex from their death record. If unavailable in their death record, select sex from their most recent application record. If unavailable in their death record and their application records, select sex from their most recent claim record.
race_first: Select the person’s race_first from their first application record.
race_last: Select the person’s race_last from their most recent application record.
bpl: Select the person’s birthplace from their most recent application record. If unavailable in their application records, select from their most recent claim record.
father_fname: Select the longest (by number of characters) first name recorded for the person’s father across their application records.
father_mname: Select the longest (by number of characters) middle name recorded for the person’s father across their application records.
father_lname: Select the longest (by number of characters) last name recorded for the person’s father across their application records.
mother_fname: Select the longest (by number of characters) first name recorded for the person’s mother across their application records.
mother_mname: Select the longest (by number of characters) middle name recorded for the person’s mother across their application records.
mother_lname: Select the longest (by number of characters) last name recorded for the person’s mother across their application records.
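Two of the rules above can be sketched in base R: the source-priority fallback used for sex, and the longest-string rule used for parents' names. This is only an illustration under stated assumptions, not the code inside create_bunmd. The toy data frames, the ssn key, and the helper names first_non_missing and longest_value are hypothetical, and the sex values are assumed to be already reduced to the most recent record within each source.

```r
# Hypothetical toy inputs, one row per person and source (NA = unavailable).
deaths_sex <- data.frame(ssn = c(1, 2, 3, 4), sex_death = c(1, NA, NA, NA))
apps_sex   <- data.frame(ssn = c(1, 2, 3, 4), sex_app   = c(2, 2, NA, NA))
claims_sex <- data.frame(ssn = c(1, 2, 3, 4), sex_claim = c(2, 1, 1, NA))

persons <- merge(
  merge(deaths_sex, apps_sex, by = "ssn", all = TRUE),
  claims_sex, by = "ssn", all = TRUE
)

# Hypothetical helper: take the first non-missing value in the order given.
first_non_missing <- function(...) {
  Reduce(function(a, b) ifelse(is.na(a), b, a), list(...))
}

# Priority order for sex: death record, then application, then claim.
persons$sex <- first_non_missing(
  persons$sex_death, persons$sex_app, persons$sex_claim
)

# Hypothetical toy application records with several spellings of a name.
apps_names <- data.frame(
  ssn          = c(1, 1, 1, 2, 2),
  father_fname = c("WM", "WILLIAM", NA, "JOS", "JOSEPH"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the value with the most characters.
longest_value <- function(x) {
  x <- x[!is.na(x) & nzchar(x)]
  if (length(x) == 0) return(NA_character_)
  x[which.max(nchar(x))]  # ties resolve to the first occurrence
}

# One selected name per person, named by ssn.
father_fname <- tapply(apps_names$father_fname, apps_names$ssn, longest_value)
```

Choosing the longest string favors full names over abbreviations or initials when a person's applications record the same parent in different ways.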