Thoughts and Theory

Delving a bit into what’s happening under the hood when you manipulate your Excel files or data frames.

Photo by Christian Wiediger on Unsplash

I consider myself an applied programmer: I use high-level programming languages like Python to manipulate data frames and build simple pipelines that facilitate data analysis. However, at the lowest level, the computer itself doesn’t understand what “pandas.read_csv()” means or what the “write.table()” function actually does. That made me wonder about the intermediate steps involved in converting our commands into binary 0/1 bits, and it prompted me to write this article.

In this article, I hope to give you a rough, intuitive picture of what the disk actually looks like, how your data is laid out…


A teaser of useful Shell commands that can drastically increase your productivity

Photo by Sai Kiran Anagani on Unsplash

When it comes to file parsing or data preprocessing, what would be the first programming language that comes to your mind?

It might be Python, R, or some other similar scripting language. Granted, these modern, high-level languages are very powerful and usually let us achieve our goals in a few dozen lines of code or fewer. However, Linux Shell commands seem to be a forgotten pearl, because their syntax is relatively old and the tutorials available online are less intuitive.

In this article, I am going to give you a flavor of how Shell commands can be super powerful…


ScPyT

A thorough Linear Algebra Bootcamp for Machine Learning Practitioners

Photo by Antoine Dautry on Unsplash

As a data scientist or machine learning practitioner, how good is your linear algebra?

Whether your answer to this question is positive or negative, hopefully, after reading this post and practicing a bit, you will be able to grasp most of the linear algebra you need for your day-to-day work.

Well… I know it may sound a bit exaggerated. How is that possible? The reason is that we do not need to know or actually calculate everything by hand. Off-the-shelf packages like Python’s NumPy or MATLAB have already done a lot of the hard work for us under the hood…


ScPyT

Lesser-known but useful knowledge and tricks about the ndarray object and the NumPy package

Photo by Faris Mohammed on Unsplash

I recently started a hard task: reading the whole NumPy documentation, in particular the API reference of the latest release of the NumPy module (1.19). The reason is simple: I am a bioinformatics PhD student, and my research focuses on developing computational tools for a wide spectrum of biological problems. As my project progressed, I found that my lack of NumPy knowledge greatly hindered my ability to find optimal solutions quickly and accurately, and I wasted a huge amount of time searching for certain commands on, for example, Stack Overflow. …


A compendium of useful, interesting, and inspirational uses of the Python Pandas library

Photo by Tolga Ulkan on Unsplash

Let’s talk about the Pandas package.

When you browse Stack Overflow or read blogs on Towards Data Science, have you ever encountered a super elegant solution (maybe just one line) that can replace dozens of lines of your code (for loops, functions)?

I encounter that kind of situation a lot, and my reaction is often, “Wow, I didn’t know this function could be used this way, TRULY amazing!” Different people get excited about different things, for sure, but I bet you have had these moments before if you have ever worked in the applied data science field.

However, one thing…


python-visualization-tutorial

When should I use Seaborn versus matplotlib, and how do I use them?

Photo by Donald Giannatti on Unsplash

This is the last tutorial in my Python visualization series:

  1. Tutorial I: Fig and Ax object
  2. Tutorial II: Line plot, legend, color
  3. Tutorial III: box plot, bar plot, scatter plot, histogram, heatmap, colormap
  4. Tutorial IV: violin plot, dendrogram
  5. Tutorial V: Plots in Seaborn (cluster heatmap, pair plot, dist plot, etc)

You don’t need to read all the previous posts; this one is somewhat separate from my last four articles. I am going to show you a head-to-head comparison between the matplotlib library and the Seaborn library in Python.

As I alluded to in Tutorial I, I…


python-visualization-tutorial

Drawing a violin plot and a dendrogram from scratch: a step-by-step guide

Photo by Myriam Jessier on Unsplash

This is the fourth tutorial in my Python visualization series:

  1. Tutorial I: Fig and Ax object
  2. Tutorial II: Line plot, legend, color
  3. Tutorial III: box plot, bar plot, scatter plot, histogram, heatmap, colormap
  4. Tutorial IV: violin plot, dendrogram
  5. Tutorial V: Plots in Seaborn (cluster heatmap, pair plot, dist plot, etc)

Let’s keep the intro short this time: this article simply picks up where Tutorial III left off. The reason I want to write a separate article for the violin plot and dendrogram is that they are a little more involved than the previously covered plot types. …


Hands-On Tutorials

Walking you through how to understand the mechanisms behind these widely-used figure types

Photo by Myriam Jessier on Unsplash

This is my third tutorial. Here’s a list of all my previous posts and the ones I am going to post very soon:

  1. Tutorial I: Fig and Ax object
  2. Tutorial II: Line plot, legend, color
  3. Tutorial III: box plot, bar plot, scatter plot, histogram, heatmap, colormap
  4. Tutorial IV: violin plot, dendrogram
  5. Tutorial V: Plots in Seaborn (cluster heatmap, pair plot, dist plot, etc)

Why did I make this tutorial? Why should you spend your precious time reading this article? I want to share the most critical thing about learning matplotlib, which is understanding the building block of…


python-visualization-tutorial

Learning how to make a line plot and understanding legends and colors

Photo by Cookie the Pom on Unsplash

This is the second article in my Python visualization tutorial series on making publication-quality figures. Here is a list of the articles I have posted so far and will post soon:

  1. Tutorial I: Fig and Ax object
  2. Tutorial II: Line plot, legend, color
  3. Tutorial III: box plot, bar plot, scatter plot, histogram, heatmap, colormap
  4. Tutorial IV: violin plot, dendrogram
  5. Tutorial V: Plots in Seaborn (cluster heatmap, pair plot, dist plot, etc)

If you haven’t read Tutorial I, I recommend doing that first before proceeding with this article. …


python-visualization-tutorial

How to fully understand and control all the plotting parameters by yourself

Photo by Michael Dziedzic on Unsplash

In this series, I will share how I usually make publication-quality figures in Python. I really want to convey how to gain full control of every element in a Python plot. By the end of this tutorial, you will be able to fully understand the philosophy of making figures in Python.

Since this is a huge topic, I decided to break it down into several parts. In this Part I tutorial, I will focus on the very first step: understanding the canvas (Fig object) you will be drawing on and the boundary…

Guangyuan(Frank) Li

Bioinformatics PhD student at Cincinnati Children's Hospital Medical Center; GitHub: https://github.com/frankligy
