Ziheng Wang in LanceDB | Open Vector Data Lakes | Why DataFrame libraries need to understand vector embeddings | May 22, 2023
Ziheng Wang | This is a very interesting rebuttal as I ponder replacing Kafka with just an S3 bucket in my… | Jul 23, 2022
Ziheng Wang in Artificial Intelligence in Plain English | Large Scale Deep Learning with Spark — An Opinionated Guide | If you are reading this post, I assume you know what Spark is and have heard a thing or two about deep learning. I won’t waste any more… | May 18, 2022
Ziheng Wang | How to do fast sparse int8 multiplications on CPUs to speed up deep learning inference | This medium blog post is going to be highly technical — and though it has quite a lot to do with deep learning (if recent industry… | Oct 23, 2021
Ziheng Wang | Extremely Fast and Cheap Decision Trees | Gradient-boosted trees (GBT) dominate most data science applications in industry (and Kaggle) due to their superior accuracy and… | Jul 14, 2021
Ziheng Wang in Towards Data Science | Speeding up BERT inference: different approaches | Intro | Jan 21, 2021
Ziheng Wang in Towards Data Science | Speeding up deep learning inference via unstructured sparsity | Serving large neural networks can be expensive. | Sep 8, 2020