Join me on my explorations through data and software engineering. I hope these data-driven analyses will help us tell real facts from fake news. My aim is to show how even very quick analyses can uncover hidden patterns and useful information. I will also share my experience coding the infrastructure, since I'm very interested in data engineering as well. If you have an interesting question about the world, just send me a message through the contact page and I'll give it a shot.
Heart disease is the leading cause of death worldwide. I found the dataset in the UCI repository, which hosts a bunch of nice, clean datasets that are useful for learning and practicing data science. My initial goal was to surpass the result of 78% accuracy, which I managed to do using commonly used classification algorithms.
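For readers who want a feel for that workflow, here is a minimal sketch of training and scoring a classifier with scikit-learn. Everything in it is an assumption for illustration: the data is synthetic (generated by `make_classification`, not the UCI heart disease data), and logistic regression stands in as one example of a commonly used algorithm, since the teaser does not name the ones used in the post.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): NOT the UCI heart disease dataset.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Hold out a quarter of the rows so accuracy is measured on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Logistic regression is one common baseline classifier (assumed choice).
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Fraction of held-out rows whose label was predicted correctly.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Held-out accuracy: {accuracy:.2f}")
```

Swapping in another estimator, such as a random forest, only changes the `model = ...` line, which makes it easy to compare several algorithms against the same accuracy target.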
With all the current events surrounding Trump as president of the United States, it seems worthwhile to use data to fact-check his statements and policies, given his track record of alternative facts. Taking data from the US Border Patrol on the Customs and Border Protection webpage, re-posted on Kaggle, I did a very rough analysis. Some data cleaning and manipulation using pandas allowed for a useful visualization of immigrants caught at the borders of the United States.
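As a rough sketch of what that kind of pandas cleaning can look like, the snippet below reshapes a small wide-format table into tidy rows and totals it per year. The column names (`Sector`, `Apprehensions`), the sector labels, and all the numbers are hypothetical placeholders, not the actual Border Patrol data or the layout of the Kaggle file.

```python
import pandas as pd

# Hypothetical wide-format table: one row per sector, one column per year.
# All values are placeholders for illustration, NOT real Border Patrol figures.
raw = pd.DataFrame({
    "Sector": ["Sector A", "Sector B", "Sector C"],
    "2015": ["1,200", "800", None],
    "2016": ["1,500", "950", "300"],
})

# Reshape to long format so each row is one (sector, year) observation.
tidy = raw.melt(id_vars="Sector", var_name="Year", value_name="Apprehensions")

# Clean: drop missing counts, strip thousands separators, convert to integers.
tidy = tidy.dropna(subset=["Apprehensions"])
tidy["Apprehensions"] = (
    tidy["Apprehensions"].str.replace(",", "", regex=False).astype(int)
)
tidy["Year"] = tidy["Year"].astype(int)

# Total per year across all sectors, a convenient shape for plotting.
yearly = tidy.groupby("Year")["Apprehensions"].sum()
print(yearly)
```

With matplotlib installed, `yearly.plot(kind="bar")` would turn the resulting series into a simple bar chart of totals per year.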