Pivot Tables
1. Since I am majoring in Interdisciplinary Science with a concentration in the data science of public health, I want to explore a topic related to public health issues that affect minorities across the tri-state area.
To start, I want to explore the annual HIV/AIDS report for New York City. This dataset, published on NYC Open Data, shows the number of people affected by HIV/AIDS in NYC. Its variables include gender, age, race, and diagnosis and death rates.
I then want to take this a step further and explore a second dataset on people affected by HIV/AIDS in each borough and neighborhood. This will allow me to connect the two datasets and better understand a public health issue that continues to affect people in NYC.
Questions I want to ask:
- Which borough has the highest percentage of people affected by HIV/AIDS?
- Which borough has the lowest percentage of people affected by HIV/AIDS?
- Which gender has the highest percentage of people affected by HIV/AIDS?
- Which race has the highest percentage of people affected by HIV/AIDS?
- How many people in each borough have died from HIV/AIDS?
Sources: NYC Open Data:
https://data.cityofnewyork.us/Health/DOHMH-HIV-AIDS-Annual-Report/fju2-rdad/data
https://data.cityofnewyork.us/Health/HIV-AIDS-Diagnoses-by-Neighborhood-Sex-and-Race-Et/ykvb-493p
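A minimal sketch of how the borough questions above might be answered with a pandas pivot table. The column names (`borough`, `gender`, `diagnoses`, `deaths`), the placeholder borough labels, and every number are made up for illustration; they are assumptions, not the schema or figures of the NYC Open Data files, so the real columns would need to be checked after download.

```python
import pandas as pd

# Toy rows, invented purely for illustration (NOT real HIV/AIDS figures).
# Column names and borough labels are hypothetical placeholders.
records = pd.DataFrame({
    "borough": ["Borough A", "Borough A", "Borough B",
                "Borough B", "Borough C", "Borough C"],
    "gender": ["Female", "Male", "Female", "Male", "Female", "Male"],
    "diagnoses": [12, 33, 15, 25, 5, 10],
    "deaths": [2, 6, 3, 5, 1, 3],
})

# Pivot: one row per borough, one column per gender, summing diagnoses.
by_borough_gender = pd.pivot_table(
    records,
    index="borough",
    columns="gender",
    values="diagnoses",
    aggfunc="sum",
)

# Each borough's share of all diagnoses, as a percentage.
borough_totals = by_borough_gender.sum(axis=1)
borough_share = (borough_totals / borough_totals.sum() * 100).round(1)

# Boroughs with the highest and lowest share.
highest_borough = borough_share.idxmax()
lowest_borough = borough_share.idxmin()

# Total deaths per borough.
deaths_by_borough = records.groupby("borough")["deaths"].sum()

print(by_borough_gender)
print(borough_share)
print(highest_borough, lowest_borough)
print(deaths_by_borough)
```

Swapping `columns="gender"` for a race column would give the same kind of table for the race question.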
2. For my second option, I want to explore a dataset (also public health oriented) on the Ebola crisis that originated and rapidly spread in African countries in 2014. (Note: Ebola was first identified in 1976, but re-emerged in a major outbreak in 2014.)
Questions I want to ask:
- Which country has the highest number of Ebola cases?
- Which country has the lowest number of Ebola cases?
- On which date did Ebola first appear in each country?
Source: Kaggle
https://www.kaggle.com/imdevskp/ebola-outbreak-20142016-complete-dataset
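A minimal sketch of how the Ebola questions above might be approached with a pandas pivot table. It assumes the data arrive as dated reports with a running (cumulative) case count per country; the column names (`country`, `date`, `cumulative_cases`), the placeholder country labels, and all numbers are invented for illustration and are not taken from the Kaggle file.

```python
import pandas as pd

# Toy rows, invented purely for illustration (NOT real Ebola figures).
# Column names and country labels are hypothetical placeholders.
reports = pd.DataFrame({
    "country": ["Country A", "Country A", "Country B",
                "Country B", "Country C", "Country C"],
    "date": ["2014-08-29", "2014-09-05", "2014-08-29",
             "2014-09-05", "2014-09-05", "2014-09-12"],
    "cumulative_cases": [100, 150, 200, 300, 50, 80],
})
reports["date"] = pd.to_datetime(reports["date"])

# Pivot: one row per report date, one column per country.
# Dates on which a country has no report become NaN.
wide = reports.pivot_table(
    index="date",
    columns="country",
    values="cumulative_cases",
    aggfunc="max",
)

# Because counts are assumed cumulative, the column maximum is the total.
latest_totals = wide.max()
most_affected = latest_totals.idxmax()
least_affected = latest_totals.idxmin()

# First date on which each country appears in the reports.
first_report = wide.apply(lambda col: col.first_valid_index())

print(wide)
print(most_affected, least_affected)
print(first_report)
```

If the real file reports new cases per day rather than running totals, `wide.sum()` would replace `wide.max()`.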