Introduction
You’d like to visualize some stock data using Go, but after looking at the Go ecosystem you see very little for charting. You find gonum, which has some plotting capabilities, but it generates static charts. It’s 2022, and you’d like to have interactive features such as zooming, panning, and more. You turn to the HTML landscape, see many more options, and decide to take this path. After a short survey, you decide to use plotly.

Introduction
You are jogging and want to show off your route to your friends. Let’s imagine the data you have for your route is a CSV file in the following format:
Listing 1: track.csv
time,lat,lng,height
2015-08-20 03:48:07.235,32.519585,35.015021,136.1999969482422
2015-08-20 03:48:24.734,32.519606,35.014954,126.5999984741211
2015-08-20 03:48:25.660,32.519612,35.014871,123.0
2015-08-20 03:48:26.819,32.519654,35.014824,120.5
2015-08-20 03:48:27.828,32.519689,35.014776,118.9000015258789
2015-08-20 03:48:29.720,32.519691,35.014704,119.9000015258789
2015-08-20 03:48:30.669,32.519734,35.014657,120.9000015258789

Listing 1 shows the first few lines of track.csv. Each line contains a time stamp in UTC, latitude, longitude and height above sea level in meters.
Introduction
You are about to visit Boston, and would like to taste some good food. You ask a friend who lives there where the good places to eat are. They reply with “Everything is good, you can’t go wrong”. That makes you think: maybe I should check where not to eat.
The data geek in you awakens, and you find out that the city of Boston has a dataset of food violations.

The Question
When you work on data science problems, you always start with a question you’re trying to answer. This question will affect the data you pick, your exploration process, and how you interpret the results.
The question for this article is: How much (in percentage) should you tip your taxi driver?
To answer the question, we’ll use a portion of the NYC Taxi dataset. The data file we’ll be using is taxi-01-2020-sample.

Introduction
You write a server for a massively multiplayer online role-playing game (MMORPG).
In the game, players collect keys and you want to design how to store the set of keys each player has.
As an example, imagine the set of keys is copper, jade and crystal. You consider the following options for storing a player’s key set:
[]string
map[string]bool
Both options will work, but did you consider a third option of using a bitmask?
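As a rough sketch of the bitmask option, each key can take one bit of a small unsigned integer. The KeySet type and its method names are assumptions made for this illustration, not necessarily what the full post uses.

```go
package main

import "fmt"

// KeySet stores a player's keys, one bit per key (hypothetical type).
type KeySet uint8

// Each constant occupies its own bit: 1, 2, 4.
const (
	Copper KeySet = 1 << iota
	Jade
	Crystal
)

// Add sets the bit for key.
func (ks *KeySet) Add(key KeySet) { *ks |= key }

// Remove clears the bit for key.
func (ks *KeySet) Remove(key KeySet) { *ks &^= key }

// Has reports whether the bit for key is set.
func (ks *KeySet) Has(key KeySet) bool { return *ks&key != 0 }

func main() {
	var keys KeySet
	keys.Add(Copper)
	keys.Add(Crystal)

	fmt.Println("has jade:", keys.Has(Jade))
	fmt.Println("has crystal:", keys.Has(Crystal))
}
```

With a uint8 the whole key set fits in a single byte, at the cost of supporting at most eight distinct keys.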

Introduction
If you can write a for-loop, you can do statistics. - Jake Vanderplas
A lot of developers shy away from problems that involve statistics or probability. That’s a shame, since in today’s data-rich environment you can gain a lot of insight from data.
In this blog post, I’ll show you how to write a simulation tool that requires no knowledge of statistics or probability. Simulations are easy to write and can be a very effective tool in research.

Introduction
I prefer to use relational (SQL) databases in general since they provide several features that are very useful when working with data. SQLite is a great choice since the database is a single file, which makes it easier to share data. Even though it’s a single file, SQLite can handle up to 281 terabytes of data. SQLite also comes with a command line client called sqlite3 which is great for quick prototyping.

Introduction
Every single company I’ve worked at and talked to has the same problem without a single exception so far - poor data quality, especially tracking data. Either there’s incomplete data, missing tracking data, duplicative tracking data. - DJ Patil
I spend a lot of my time digging into data at various companies. Most of the time I’m surprised by what I see and so are the engineers and analysts that work at these companies.

Series Index
Generics Part 01: Basic Syntax
Generics Part 02: Underlying Types
Generics Part 03: Struct Types and Data Semantics
Introduction
In the previous post, I showed you how to declare a user-defined type based on an underlying type. I did this by writing progressive versions of the same type using concrete types, the empty interface, and finally generics. I also explained how the compiler is limited in its ability to infer the substitution for the generic type during zero value construction, while it can infer it during initialized construction.

Series Index
Python and Go: Part I - gRPC
Python and Go: Part II - Extending Python With Go
Python and Go: Part III - Packaging Python Code
Python and Go: Part IV - Using Python in Memory
Introduction
In a previous post we used gRPC to call Python code from Go. gRPC is a great framework, but there is a performance cost to it. Every function call needs to marshal the arguments using protobuf, make a network call over HTTP/2, and then unmarshal the result using protobuf.