The first two weeks with the Apple M1
Apple recently released new computers built around its new M1 processors. I was quite excited about them, both because of the performance and energy consumption promised by various benchmarks and because it is a new platform. Most things won’t work there, and some assumptions about how we work today have to change if you want to use...
Fast JDBC access in Python using pyarrow.jvm (2020 edition)
About a year ago, I benchmarked accessing databases through JDBC in Python. Recently, the maintainer of jpype gave me a heads-up that they had significantly improved performance on their side. While this is actually the library I’m comparing my pyarrow.jvm-based approach to, I have a high appreciation for any performance tuning that is...
Calculating Levenshtein distances with fletcher
The Levenshtein distance is a common measure for comparing two strings. It gives you the minimal number of single-character insert, delete and replace operations needed to turn one string into the other.
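To make the definition concrete, here is a minimal pure-Python sketch of the classic dynamic-programming approach. It is not the fletcher implementation from the post; the function name `levenshtein` and the two-row layout are choices made for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimal number of single-character inserts, deletes and replaces
    needed to turn ``a`` into ``b`` (illustrative sketch, not fletcher's code)."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] is the distance between the processed prefix of a and b[:j].
    # For the empty prefix of a, that is j inserts.
    previous = list(range(len(b) + 1))

    for i, char_a in enumerate(a, start=1):
        # Turning a[:i] into the empty string takes i deletes.
        current = [i]
        for j, char_b in enumerate(b, start=1):
            replace_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,                 # delete char_a
                    current[j - 1] + 1,              # insert char_b
                    previous[j - 1] + replace_cost,  # replace (or keep) the character
                )
            )
        previous = current

    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```

Only two rows of the distance table are held at a time, so memory grows with the length of the shorter string, while the running time is still proportional to the product of both lengths.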
Trimming down pyarrow’s conda footprint (Part 2 of X)
We have again reduced the footprint of creating a conda environment with pyarrow. This time we did some detective work on the package contents and removed files from pyarrow that are definitely not needed at runtime.
Removing Python as a dependency of R
Surprisingly, Python was a runtime dependency of R on conda-forge. As R doesn’t need Python to run, this was a bit odd. We got rid of the dependency by splitting up the GLib package.