This project is part of the Data Science and Visualization Bootcamp at UCSD Extension.
The aim of this project is to perform ETL (extract, transform, load) on two datasets - PPP loans and COVID-19 cases for California - and merge them so that researchers can evaluate a possible link between COVID-19 infection cases and receipt of PPP loans. This project was co-engineered by four people: Stephen Hong, Laura Paakh May, Nghia Nguyen, and Sagar Patel.
Step 1 - Extract the data: Searched Kaggle and downloaded four datasets:
A. PPP loans
B. US county demographics
C. US counties and COVID-19 cases
D. US counties and corresponding zip codes
Step 2 - ERD: Created an entity relationship diagram (ERD).
Step 3 - Transform: Cleaned the data using Python and Jupyter Notebook.
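The cleaning and merging described so far can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual notebook code. The column names (`zip`, `loan_amount`, `county`, `cases`) and the helper functions are hypothetical, and the real Kaggle files may be laid out differently.

```python
from collections import defaultdict


def clean_county(name):
    """Normalize a county name: trim, drop a trailing 'County', title-case."""
    name = name.strip().lower()
    if name.endswith(" county"):
        name = name[: -len(" county")]
    return name.title()


def loans_by_county(ppp_rows, zip_to_county):
    """Sum loan amounts per county, skipping rows whose zip code is unknown.

    ppp_rows: dicts with hypothetical keys 'zip' and 'loan_amount'.
    zip_to_county: dict mapping a 5-digit zip code to a county name.
    """
    totals = defaultdict(float)
    for row in ppp_rows:
        # Keep only the 5-digit part of ZIP+4 codes such as '92101-1234'.
        county = zip_to_county.get(row["zip"].strip()[:5])
        if county is None:
            continue
        totals[clean_county(county)] += float(row["loan_amount"])
    return dict(totals)


def merge_with_cases(loan_totals, case_rows):
    """Inner-join county loan totals with county case counts.

    case_rows: dicts with hypothetical keys 'county' and 'cases'.
    """
    merged = []
    for row in case_rows:
        county = clean_county(row["county"])
        if county in loan_totals:
            merged.append(
                {
                    "county": county,
                    "cases": int(row["cases"]),
                    "ppp_total": loan_totals[county],
                }
            )
    return merged
```

Normalizing county names before joining matters here because the four source files come from different publishers and may spell the same county differently (for example with or without the word "County").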
Step 4 - SQL: Created SQL tables using PostgreSQL.
Step 5 - Load: Connected to PostgreSQL and uploaded the data to the tables.
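The table-creation and load steps might look something like the sketch below. To stay runnable without a database server, it uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL. The table and column names (`county`, `covid_cases`, `report_date`) and the `load_cases` helper are assumptions made for illustration and do not reflect the project's actual ERD.

```python
import sqlite3

# Hypothetical two-table schema: one row per county, one row per county per day.
SCHEMA = """
CREATE TABLE IF NOT EXISTS county (
    county_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS covid_cases (
    county_id   INTEGER NOT NULL REFERENCES county(county_id),
    report_date TEXT    NOT NULL,
    cases       INTEGER NOT NULL,
    PRIMARY KEY (county_id, report_date)
);
"""


def load_cases(conn, rows):
    """Create the tables if needed, then insert (county, date, cases) tuples."""
    conn.executescript(SCHEMA)
    with conn:  # commits on success, rolls back if an insert fails
        for county, report_date, cases in rows:
            # Insert the county once; later rows reuse its id.
            conn.execute(
                "INSERT OR IGNORE INTO county (name) VALUES (?)", (county,)
            )
            county_id = conn.execute(
                "SELECT county_id FROM county WHERE name = ?", (county,)
            ).fetchone()[0]
            conn.execute(
                "INSERT INTO covid_cases (county_id, report_date, cases) "
                "VALUES (?, ?, ?)",
                (county_id, report_date, cases),
            )
```

Against PostgreSQL the same pattern would go through a driver such as psycopg2, which uses `%s` placeholders instead of `?`. The SQL would also need small dialect changes, for example `ON CONFLICT DO NOTHING` in place of `INSERT OR IGNORE`.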
- data exploration/descriptive statistics
- data processing/cleaning
- statistical modeling
- ERD diagram modeling
- database loading/table design
- writeup/reporting
