I love beautiful visualizations. For the last ~2 years, I have spent some time looking for the best data visualization tools in Python. I have tested the most well-known libraries in this large and diverse ecosystem, but I have yet to find the perfect, definitive one. Here are some thoughts about practicing data visualization in Python. For a deeper dive on the subject, you will find some links at the end of this post.
- Matplotlib, Seaborn and Pandas are here to stay. They offer rich static visualizations, are easy to install and to use everywhere. Their API might not be the most innovative, but they are fast and convenient for exploratory data analysis. A bit of tweaking can make the plots beautiful. The main drawback is their static nature.
- Plotly is very good: rich, dynamic, and built on top of D3. The recent addition of Plotly Express, a shorter API similar to Seaborn, was clever (no more need for Cufflinks, which emulated the Pandas API for Plotly). The documentation is quite good, making it accessible to many. It is a very solid choice for dynamic visualizations.
- Altair is complex but very powerful. It is mainly a Python API for Vega-Lite, so it is sane but requires some learning. Its user guide and examples are great, but the detailed documentation is automatically generated, so it is not that useful. Altair is capable of remarkable advanced interactions that I have not been able to achieve easily with any other tool. Sadly, the graphs it generates are too heavy because they carry the whole dataset inside the chart itself. However, they are gorgeous out of the box (the most beautiful I have seen).
- Bokeh and its wrappers HoloViews and hvplot are powerful too. They are capable of many things, but in my experience they are not as flexible as Plotly and not as beautiful as Altair.
- Chartify by Spotify is similar to Seaborn or Plotly Express, but only offers static plots using Bokeh as a backend. The API and the plots look nice, but the scope seems limited to data scientists building standard visualizations for communication.
- bqplot is a bit different: it lives in the Jupyter ecosystem and offers vast possibilities for interaction. The documentation consists primarily of examples, so it can be difficult to use, and its syntax is complex. The defaults don’t look great. Its use is restricted to specific cases: advanced interactive notebooks, notably paired with Jupyter’s Voilà.
- I haven’t played much with plotnine, the Python library that is probably the closest to R’s very respected ggplot2. Its promise is great, but it is built on Matplotlib, so it only produces static plots. The poor documentation (which invites users to consult ggplot2’s documentation as a complement) makes it hard to get started with complex plots.
- For building dashboards in Python, Dash by Plotly is becoming the king. It is powerful, immediately looks good, and offers a plethora of components and possibilities for interaction and customization. I have only very briefly used the alternative, Panel, made by the people behind HoloViews. It looks a bit rough but promises a lot, namely being able to directly use any visualization library. The way its callbacks are defined seemed a bit difficult to grasp.
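
To make the remark about Altair's heavy graphs more concrete, here is a minimal sketch that uses only the standard library. It does not call Altair: the spec below is a hand-written, simplified Vega-Lite-style dictionary, and the helper names `vega_lite_spec` and `spec_size` are hypothetical, chosen only for illustration. It shows how the serialized chart grows with the number of rows when the data is inlined.

```python
import json


def vega_lite_spec(rows):
    """Build a simplified Vega-Lite-style scatter spec with the data inlined.

    Hand-written for illustration; this is an assumption about the general
    shape of such a spec, not the exact output of Altair.
    """
    return {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "data": {"values": rows},  # the whole dataset travels with the chart
        "mark": "point",
        "encoding": {
            "x": {"field": "x", "type": "quantitative"},
            "y": {"field": "y", "type": "quantitative"},
        },
    }


def spec_size(n_rows):
    """Return the length in characters of the JSON spec for n_rows points."""
    rows = [{"x": i, "y": i * i} for i in range(n_rows)]
    return len(json.dumps(vega_lite_spec(rows)))


if __name__ == "__main__":
    for n in (10, 1_000, 100_000):
        print(f"{n:>7} rows -> {spec_size(n):>9} characters of JSON")
```

The encoding part of the spec stays tiny; almost all of the size comes from `data.values`. Vega-Lite also accepts `"data": {"url": ...}`, which points to an external file instead of embedding the rows, so the weight is a consequence of inlining rather than of the grammar itself.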
Some resources on the Python data visualization ecosystem: