Category: Data Visualization

  • Using spaCy for Natural Language Processing

    What is spaCy? spaCy is an open-source natural language processing (NLP) library originally developed by the company Explosion (explosion.ai). Its large community has extended the library's initial NLP capabilities to many languages (now 72+), broadened it to additional use cases (such as clinical NLP with medspaCy), and integrated it with other ML/DL frameworks (such…

  • Getting to better plots: the process for making figures

    Part of being a data scientist is understanding data, generating insights from it, and presenting your results in a clear and concise way. Showing data tables will not reveal the game-changing sexy insights you found. Tables alone may just bore, confuse, or even anger your bosses and colleagues, or even worse, key stakeholders who are paying…

  • Why I chose MATLAB for learning data science

    Hint: it wasn’t for the pretty plots. Note: This article was originally published in Towards Data Science on December 26, 2020. At the time of this writing, I am a fourth-year Ph.D. student at the University of Michigan, where I use machine learning and statistical modeling on biological datasets to study cancer metabolism and infer…