Summary: In this notebook, we create the BUNMD file by unifying the cleaned and "condensed" Numident files (deaths, applications, and claims) into a single file.

The function create_bunmd combines the information from the deaths, applications, and claims files into a file with one record per person: the Berkeley Unified Numident Mortality Database (BUNMD).
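
The one-record-per-person idea can be illustrated with a minimal base R sketch. It is not the package's implementation: the column names (ssn, dyear, byear_app, byear_claim) and the rule that prefers the application value over the claims value are assumptions made for this toy example only.

```r
# Toy inputs; all column names and values here are hypothetical.
deaths       <- data.frame(ssn = c(1, 2), dyear = c(1990, 2001))
applications <- data.frame(ssn = c(1, 3), byear_app = c(1910, 1925))
claims       <- data.frame(ssn = c(2, 3), byear_claim = c(1915, 1926))

# Full outer joins keep every person who appears in at least one source.
unified <- Reduce(
  function(x, y) merge(x, y, by = "ssn", all = TRUE),
  list(deaths, applications, claims)
)

# Illustrative decision rule (an assumption, not the BUNMD rule):
# take the birth year from applications, fall back to claims if missing.
unified$byear <- ifelse(is.na(unified$byear_app),
                        unified$byear_claim,
                        unified$byear_app)

# One row per person: no duplicated identifiers.
stopifnot(!anyDuplicated(unified$ssn))
```

A person missing from one source simply gets NA in that source's columns, so the decision rules can be applied column by column afterwards.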

The functions create_weights_bunmd and create_weights_bunmd_complete create post-stratification weights, calibrated to the Human Mortality Database (HMD), for BUNMD Sample 1 and Sample 2.
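
As a rough illustration of post-stratification, the sketch below weights each record by the ratio of a reference death count to the sample death count in its cell. The cells (death year by sex), the data, and the hmd_deaths column are all hypothetical; the weighting functions in the package may define cells and handle edge cases differently.

```r
# Toy sample of death records; the cell definition is an assumption.
sample_deaths <- data.frame(
  dyear = c(1990, 1990, 1990, 1991, 1991),
  sex   = c("m", "m", "f", "m", "f")
)

# Hypothetical reference death counts per cell (a stand-in for HMD totals).
hmd <- data.frame(
  dyear      = c(1990, 1990, 1991, 1991),
  sex        = c("m", "f", "m", "f"),
  hmd_deaths = c(100, 80, 90, 70)
)

# Count sample records in each cell.
cell_counts <- aggregate(n ~ dyear + sex,
                         data = transform(sample_deaths, n = 1),
                         FUN = sum)

# Weight = reference count / sample count within the cell.
cells <- merge(cell_counts, hmd, by = c("dyear", "sex"))
cells$weight <- cells$hmd_deaths / cells$n

# Attach the cell weight to every individual record.
weighted <- merge(sample_deaths,
                  cells[, c("dyear", "sex", "weight")],
                  by = c("dyear", "sex"))
```

By construction, the weights within each cell sum to that cell's reference count, so the weighted sample reproduces the reference totals.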

library(censocdev)
library(ipumsr)
library(data.table) # fread(), fwrite()
library(dplyr)      # %>%, left_join(), select()

## read in cleaned and "condensed" numident files
claims <- fread("/censoc/data/numident/3_numident_files_cleaned_and_condensed/claims_condensed.csv")
deaths <- fread("/censoc/data/numident/3_numident_files_cleaned_and_condensed/deaths_cleaned.csv")
applications <- fread("/censoc/data/numident/3_numident_files_cleaned_and_condensed/apps_condensed.csv")

## combine records into one file
bunmd <- create_bunmd(claims = claims, applications = applications, deaths = deaths)

## construct weights for BUNMD
bunmd <- create_weights_bunmd(bunmd)
bunmd <- create_weights_bunmd_complete(bunmd)

## create string variables
ddi_extract <- read_ipums_ddi("/data/josh/CenSoc/censoc_data/ipums_1940_extract/fullcount.ddi.xml")

## extract geo codes
geo_codes <- ipums_val_labels(ddi_extract, BPLD) 

## join geocode labels for bpl and socstate
bunmd <- bunmd %>%
  left_join(geo_codes %>%
              select(bpl = val, bpl_string = lbl),
            by = "bpl") %>%
  left_join(geo_codes %>%
              select(socstate = val, socstate_string = lbl),
            by = "socstate")

## write out BUNMD file
fwrite(bunmd, "/censoc/data/numident/4_berkeley_unified_mortality_database/bunmd.csv")
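
The label lookup used for bpl_string and socstate_string can be tried on toy data. This sketch uses base R's match() in place of left_join(); the codes and labels are made up and only mimic the val/lbl layout of the value-label table.

```r
# Hypothetical lookup table with the same val/lbl layout as the value labels.
geo_codes <- data.frame(
  val = c(1, 2, 3),
  lbl = c("State A", "State B", "State C"),
  stringsAsFactors = FALSE
)

# Toy records; code 99 is deliberately absent from the lookup.
toy <- data.frame(
  ssn      = 1:3,
  bpl      = c(2, 1, 99),
  socstate = c(3, 2, 1)
)

# match() returns NA for codes missing from the lookup, so unmatched
# records keep their row and get an NA label, as in a left join.
toy$bpl_string      <- geo_codes$lbl[match(toy$bpl, geo_codes$val)]
toy$socstate_string <- geo_codes$lbl[match(toy$socstate, geo_codes$val)]
```

The same lookup table serves both variables; only the code column it is matched against changes.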

Recap of the decision rules we used to create the BUNMD:



caseybreen/wcensoc documentation built on Nov. 21, 2024, 5:15 a.m.