
24 Feb 2022 dsai

Data Analysis for Medical Students


Last December I held a workshop at my university on basic data analysis using the Python programming language. It was pitched at undergraduate medical students, specifically those who had never written any code before. Over the course of one afternoon we ran through a crash course on using Python in Google Colaboratory and took a quick tour of the popular pandas, scipy and seaborn Python libraries. We also worked through a mock scenario in which we analysed spirometry data (a test involving breathing into a tube to see how good your lungs are) from the NHANES dataset.

Though the slides and worksheets were intended to be used as part of a guided online (Zoom) workshop, it should be possible to read the slides and try out the worksheets as a self-guided exercise. I’m uploading the links here for anyone who wants to try them out.

To use these materials, I would suggest viewing one set of slides and then completing the accompanying Colab worksheet before moving on to the next slide deck. “Extra credit” sections of the worksheets are optional. Some of the slides use animations to explain code and how it works under the hood. View these in “slideshow preview” mode to get the most out of the slides.

The exercises also assume some prior knowledge of statistical testing; they cover only the execution of these tests and not the motivation behind them (which is better taught in a statistics class anyway).
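To give a rough sense of what "execution" means here, below is a minimal sketch of running a two-sample test with pandas and scipy. It is not taken from the worksheets: the column names (`group`, `fev1_litres`) and the numbers are made up purely for illustration and are not NHANES values, and the choice of Welch's t-test is an assumption for the sake of the example.

```python
import pandas as pd
from scipy import stats

# Synthetic, made-up values for illustration only (NOT real NHANES data).
# "fev1_litres" is a hypothetical column name for a spirometry measurement.
df = pd.DataFrame({
    "group": ["smoker"] * 5 + ["non_smoker"] * 5,
    "fev1_litres": [2.9, 3.1, 2.7, 3.0, 2.8, 3.6, 3.4, 3.8, 3.5, 3.7],
})

# Split the measurements into the two groups being compared.
smokers = df.loc[df["group"] == "smoker", "fev1_litres"]
non_smokers = df.loc[df["group"] == "non_smoker", "fev1_litres"]

# Welch's t-test: a two-sample t-test that does not assume equal variances.
result = stats.ttest_ind(smokers, non_smokers, equal_var=False)
t_stat, p_value = result.statistic, result.pvalue

print(f"mean (smokers):     {smokers.mean():.2f} L")
print(f"mean (non-smokers): {non_smokers.mean():.2f} L")
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```

Running the test takes only a line or two; deciding whether it is the right test for the data is the part a statistics class covers.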

If you are curious and want to learn more about data analysis, I highly recommend Kaggle Courses as a more comprehensive tour of the topic. If you want to dive head-first into AI and want to get something up and running fast, I also recommend the Course.


Part 1: Python + Colab 101 (Slides, Worksheet)
Part 2: Data manipulation with pandas (Slides, Worksheet)
Part 3: Statistical testing with scipy (Slides, Worksheet)
Part 4: Data visualization with seaborn (Slides, Worksheet)

Happy coding!