What is spaCy? spaCy is an open-source natural language processing (NLP) library initially developed by the company explosion.ai. It has a large community that has extended its initial NLP capabilities to multiple languages (now 72+), added support for additional use cases (such as clinical NLP with medspaCy), and integrated it with other ML/DL frameworks (such…
Test-driven development is a common software development approach that facilitates test automation, code refactoring, and validation of code functionality. This article goes over pytest, a popular framework for writing tests in Python. This GitHub repo contains the code snippets presented in this article. Motivation for using pytest unittest and the Arrange-Act-Assert model The Arrange-Act-Assert…
What is Big Data? Big Data can be described as a rich dataset with several data types (think tabular, image, audio, etc.) that arrives in large volumes at high velocity. For the data scientist, Big Data poses several problems and opportunities. The Attributes That Describe Big Data From a practical standpoint, there…
What is ChatGPT? ChatGPT is an artificial intelligence language model produced by OpenAI that generates human-like text responses in real time. It uses machine learning to generate responses based on the input it receives, allowing it to converse with its users naturally and intuitively. ChatGPT is trained using two methods: Supervised Fine-Tuning and Reinforcement Learning…
Part of being a data scientist is understanding data, generating insights from it, and presenting your results in a clear and concise way. Showing data tables will not reveal the game-changing sexy insights you found. Tables alone may just bore, confuse, or even anger your bosses and colleagues or, even worse, key stakeholders who are paying…
A practical guide to getting started in Deep Learning Note: This article was originally published in Towards Data Science on December 30, 2020. What is Deep Learning? Deep learning is a subset of machine learning algorithms that use neural networks to learn complex patterns from large amounts of data. Due to advances in computing and…