
# Julia Evans


Today I organized the front page of this blog (jvns.ca) into CATEGORIES! Now it is actually possible to make some sense of what is on here!! There are 28 categories (computer networking! learning! “how things work”! career stuff! many more!) I am so excited about this.

How it works: Every post is in only 1 category. Obviously the categories aren’t “perfect” (there is a “how things work” category and a “kubernetes” category and a “networking” category, and so for a “how container networking works in kubernetes” I need to just pick one) but I think it’s really nice and I’m hoping that it’ll make the blog easier for folks to navigate.

If you’re interested in more of the story of how I’m thinking about this: I’ve been a little dissatisfied for a long time with how this blog is organized. Here’s where I started, in 2013, with a pretty classic blog layout (this is Octopress, which was a Jekyll Wordpress-lookalike theme that was cool back then and which served me very well for a long time):

### problem with “show the 5 most recent posts”: you don’t know what the person’s writing is about!

This is a super common way to organize a blog: on the homepage, you display maybe the 5 most recent posts, and then maybe have a “previous” link. But when you land on somebody’s blog organized this way:

1. it’s hard to hunt through their back catalog to find cool things they’ve written
2. it’s SO HARD to get an overall sense for the body of a person’s work by reading 1 blog post at a time

### next attempt: show every post in chronological order

My next attempt at blog organization was to show every post on the homepage in chronological order. This was inspired by Dan Luu’s blog, which takes a super minimal approach. I switched to this (according to the internet archive) sometime in early 2016. Here’s what it looked like (with some CSS issues :))

The reason I like this “show every post in chronological order” approach more is that when I discover a new blog, I like to obsessively binge read through the whole thing to see all the cool stuff the person has written. Rachel by the bay also organizes her writing this way, and when I found her blog I was like OMG WOW THIS IS AMAZING I MUST READ ALL OF THIS NOW and being able to look through all the entries quickly and start reading ones that caught my eye was SO FUN.

Will Larson’s blog also has a “list of all posts” page which I find useful because it’s a good blog, and sometimes I want to refer back to something he wrote months ago and can’t remember what it was called, and being able to scan through all the titles makes it easier to do that.

I was pretty happy with this and that’s how it’s been for the last 3 years.

### problem: a chronological list of 390 posts still kind of sucks

As of today, I have 390 posts here (360,000 words! that’s, like, four 300-page books! eep!). This is objectively a lot of writing and I would like people new to the blog to be able to navigate it and actually have some idea what’s going on.

And this blog is not actually just a totally disorganized group of words! I have a lot of specific interests: I’ve written probably 30 posts about computer networking, 15ish on ML/statistics, 20ish career posts, etc. And when I write a new Kubernetes post or whatever, it’s usually at least sort of related to some ongoing train of thought I have about Kubernetes. And it’s totally obvious to me what other posts that post is related to, but obviously to a new person it’s not at all clear what the trains of thought are in this blog.

### solution for now: assign every post 1 (just 1) category

My new plan is to assign every post a single category. I got this idea from Itamar Turner-Trauring’s site.

Here are the initial categories:

• Cool computer tools / features / ideas
• Computer networking
• How a computer thing works
• Kubernetes / containers
• Zines / comics
• On writing comics / zines
• Conferences
• Organizing conferences
• Statistics / machine learning / data analysis
• Year in review
• Infrastructure / operations engineering
• Career / work
• Working with others / communication
• Remote work
• Talks transcripts / podcasts
• On blogging / speaking
• On learning
• Rust
• Linux debugging / tracing tools
• Debugging stories
• Fan posts about awesome work by other people
• Inclusion
• rbspy
• Performance
• Open source
• Linux systems stuff
• Recurse Center (my daily posts during my RC batch)

I guess you can tell this is a systems-y blog because there are 8 different systems-y categories (kubernetes, infrastructure, linux debugging tools, rust, debugging stories, performance, linux systems stuff, and how a computer thing works) :).

But it was nice to see that I also have this huge career / work category! And that category is pretty meaningful to me, it includes a lot of things that I struggled with and were hard for me to learn. And I get to put all my machine learning posts together, which is an area I worked in for 3 years and am still super interested in and every so often learn a new thing about!

### How I assign the categories: a big text file

I came up with a scheme for assigning the categories that I thought was really fun! I knew immediately that coming up with categories in advance would be impossible (how was I supposed to know that “fan posts about awesome work by other people” was a substantial category?)

So instead, I took kind of a Marie Kondo approach: I wrote a script to just dump all the titles of every blog post into a text file, and then I just used vim to organize them roughly into similar sections. Seeing everything in one place (a la marie kondo) really helped me see the patterns and figure out what some categories were.

Here’s the final result of that text file. I think having a lightweight way of organizing the posts all in one file made a huge difference and that it would have been impossible for me to see the patterns otherwise.

### How I implemented it: a hugo taxonomy

Once I had that big text file, I wrote a janky python script to assign the categories in that text file to the actual posts.
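I don’t know exactly what her script or text file looked like, but the idea can be sketched in a few lines: treat unindented lines as category names and indented lines as post titles under them. (The file format and the helper name here are my assumptions, not her actual code.)

```python
def parse_sections(text):
    """Map each category heading to the list of post titles under it.

    Assumed format: an unindented line starts a category; indented
    lines are post titles belonging to the current category.
    """
    categories = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith((" ", "\t")) and current is not None:
            categories.setdefault(current, []).append(line.strip())
        else:
            current = line.strip()
    return categories

example = """Computer networking
    a hypothetical networking post title
Rust
    a hypothetical rust post title
"""
sections = parse_sections(example)
```

A second pass over the blog’s posts could then look each title up in this dict and write the category into the post’s front matter.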

I use Hugo for this blog, so I also needed to tell Hugo about the categories. This blog already technically has tags (they’re woefully underused, but I didn’t want to delete them), and it turns out that in Hugo you can define arbitrary taxonomies. So I defined a new taxonomy for these sections (right now it’s called, unimaginatively, juliasections).
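For reference, defining a custom taxonomy in Hugo is just a site config change; something like this in `config.toml` (the taxonomy name `juliasections` is from the post, but the exact config is my sketch, not her actual file):

```toml
# config.toml: declare taxonomies as singular = "plural"
[taxonomies]
  tag = "tags"
  juliasection = "juliasections"
```

Each post’s front matter can then set `juliasections: ["Computer networking"]`, and Hugo makes the taxonomy available to templates (plus generates list pages for it).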

The details of how I did this are pretty boring but here’s the hugo template that makes it display on the homepage. I used this Hugo documentation page on taxonomies a lot.

### organizing my site is cool! reverse chronology maybe isn’t the best possible thing!

Amy Hoy has this interesting article called how the blog broke the web about how the rise of blog software made people adopt a site format that maybe didn’t serve what they were writing the best.

I don’t personally feel that mad about the blog / reverse chronology organization: I like blogging! I think it was nice for the first 6 years or whatever to be able to just write things that I think are cool without thinking about where they “fit”. It’s worked really well for me.

But today, 360,000 words in, I think it makes sense to add a little more structure :).

### what it looks like now!

Here’s what the new front page organization looks like! These are the blogging / learning / rust sections! I think it’s cool how you can see the evolution of some of my thinking (I sure have written a lot of posts about asking questions :)).

### I ❤ the personal website

This is also part of why I love having a personal website that I can organize any way I want: for both of my main sites (jvns.ca and now wizardzines.com) I have total control over how they appear! And I can evolve them over time at my own pace if I decide something a little different will work better for me. I’ve gone from a jekyll blog to octopress to a custom-designed octopress blog to Hugo and made a ton of little changes over time. It’s so nice.

I think it’s fun that these 3 screenshots are each 3 years apart – what I wanted in 2013 is not the same as 2016 is not the same as 2019! This is okay!

And I really love seeing how other people choose to organize their personal sites! Please keep making cool different personal sites.

### !!Con 2019: submit a talk!

As some of you might know, for the last 5 years I’ve been one of the organizers for a conference called !!Con. This year it’s going to be held on May 11-12 in NYC.

The submission deadline is Sunday, March 3 and you can submit a talk here.

(we also expanded to the west coast this year: !!Con West is next week!! I’m not on the !!Con West team since I live on the east coast but they’re doing amazing work, I have a ticket, and I’m so excited for there to be more !!Con in the world)

### !!Con is about the joy, excitement, and surprise of computing

Computers are AMAZING. You can make programs that seem like magic, computer science has all kind of fun and surprising tidbits, there are all kinds of ways to make really cool art with computers, the systems that we use every day (like DNS!) are often super fascinating, and sometimes our computers do REALLY STRANGE THINGS and it’s very fun to figure out why.

!!Con is about getting together for 2 days to share what we all love about computing. The only rule of !!Con talks is that the talk has to have an exclamation mark in the title :)

We originally considered calling !!Con ExclamationMarkCon but that was too unwieldy so we went with !!Con :).

### !!Con is inclusive

The other big thing about !!Con is that we think computing should include everyone. To make !!Con a space where everyone can participate, we

• have open captioning for all talks (so that people who can’t hear well can read the text of the talk as it’s happening). This turns out to be great for LOTS of people – if you just weren’t paying attention for a second, you can look at the live transcript to see what you missed!
• pay our speakers & pay for speaker travel
• have a code of conduct (of course)
• use the RC social rules
• make sure our washrooms work for people of all genders
• let people specify on their badges if they don’t want photos taken of them
• do a lot of active outreach to make sure our set of speakers is diverse

### past !!Con talks

I think maybe the easiest way to explain !!Con if you haven’t been is through the talk titles! Here are a few arbitrarily chosen talks from past !!Cons:

If you want to see more (or get an idea of what !!Con talk descriptions usually look like), here’s every past year of the conference:

### this year you can also submit a play / song / performance!

One difference from previous !!Cons is that if you want to submit a non-talk-talk to !!Con this year (like a play!), you can! I’m very excited to see what people come up with. For more of that see Expanding the !!Con aesthetic.

### all talks are reviewed anonymously

One big choice that we’ve made is to review all talks anonymously. This means that we’ll review your talk the same way whether you’ve never given a talk before or if you’re an internationally recognized public speaker. I love this because many of our best talks are from first time speakers or people who I’d never heard of before, and I think anonymous review makes it easier to find great people who aren’t well known.

### writing a good outline is important

We can’t rely on someone’s reputation to determine if they’ll give a good talk, but we do need a way to see that people have a plan for how to present their material in an engaging way. So we ask everyone to give a somewhat detailed outline explaining how they’ll spend their 10 minutes. Some people do it minute-by-minute and some people just say “I’ll explain X, then Y, then Z, then W”.

Lindsey Kuper wrote some good advice about writing a clear !!Con outline, with some examples of really good outlines.

!!Con is pay-what-you-can (if you can’t afford a $300 conference ticket, we’re the conference for you!). Because of that, we rely on our incredible sponsors (companies who want to build an inclusive future for tech with us!) to help make up the difference so that we can pay our speakers for their amazing work, pay for speaker travel, have open captioning, and everything else that makes !!Con the amazing conference it is. If you love !!Con, a huge way you can help support the conference is to ask your company to sponsor us! Here’s our sponsorship page and you can email me at julia@jvns.ca if you’re interested.

### hope to see you there ❤

I’ve met so many fantastic people through !!Con, and it brings me a lot of joy every year. The thing that makes !!Con great is all the amazing people who come to share what they’re excited about every year, and I hope you’ll be one of them.

### Networking tool comics!

Hello! I haven’t been blogging too much recently because I’m working on a new zine project: Linux networking tools!

I’m pretty excited about this one – I LOVE computer networking (it’s what I spent a big chunk of the last few years at work doing), but getting started with all the tools was originally a little tricky!

For example – what if you have the IP address of a server and you want to make an https connection to it and check that it has a valid certificate? But you haven’t changed DNS to resolve to that server yet (because you don’t know if it works!) so you need to use the IP address?

If you do `curl https://1.2.3.4/`, curl will tell you that the certificate isn’t valid (because it’s not valid for 1.2.3.4). So you need to know to do `curl https://jvns.ca --resolve jvns.ca:443:104.198.14.52`.

I know how to use `curl --resolve` because my coworker told me how. And I learned the same way that to find out when a cert expires you can do `openssl x509 -in YOURCERT.pem -text -noout`.
So the goal with this zine is basically to be “your very helpful coworker who gives you tips about how to use networking tools” in case you don’t have that person. And as we know, a lot of these tools have VERY LONG man pages and you usually only need to know like 5 command line options to do 90% of what you want to do. For example I only ever do maybe 4 things with openssl even though the openssl man pages together have more than 60,000 words.

There are a few things I’m also adding (like ethtool and nmap and tc) which I don’t personally use super often but I think are super useful to people with different jobs than me. And I’m a big fan of mixing more advanced things (like tc) with basic things (like ssh) because then even if you’re learning the basic things for the first time, you can learn that the advanced thing exists!

Here’s some work in progress:

It’s been super fun to draw these: I didn’t know about ssh-copy-id or ~. before I made that ssh comic and I really wish I’d known about them earlier!

As usual I’ll announce the zine when it comes out here, or you can sign up for announcements at https://wizardzines.com/mailing-list/.

### A few early marketing thoughts

At some point last month I said I might write more about business, so here are some very early marketing thoughts for my zine business (https://wizardzines.com!). The question I’m trying to make some progress on in this post is: “how do I do marketing in a way that feels good?”

### what’s the point of marketing?

Okay! What’s marketing? What’s the point? I think the ideal way marketing works is:

1. you somehow tell a person about a thing
2. you explain somehow why the thing will be useful to them / why it is good
3. they buy it and they like the thing because it’s what they expected

(or, when you explain it they see that they don’t want it and don’t buy it, which is good too!!)

So basically as far as I can tell good marketing is just explaining what the thing is and why it is good in a clear way.
### what internet marketing techniques do people use?

I’ve been thinking a bit about internet marketing techniques I see people using on me recently. Here are a few examples of internet marketing techniques I’ve seen:

1. word of mouth (“have you seen this cool new thing?!”)
2. twitter / instagram marketing (build a twitter/instagram account)
3. email marketing (“build a mailing list with a bajillion people on it and sell to them”)
4. email marketing (“tell your existing users about features that they already have that they might want to use”)
5. social proof marketing (“jane from georgia bought a sweater”), eg fomo.com
6. cart notifications (“you left this sweater in your cart??! did you mean to buy it? maybe you should buy it!”)
7. content marketing (which is fine but whenever people refer to my writing as ‘content’ I get grumpy :))

### you need some way to tell people about your stuff

Something that is definitely true about marketing is that you need some way to tell new people about the thing you are doing. So for me when I’m thinking about running a business it’s less about “should I do marketing” and more like “well, obviously I have to do marketing, how do I do it in a way that I feel good about?”

### what’s up with email marketing?

I feel like every single piece of internet marketing advice I read says “you need a mailing list”. This is advice that I haven’t really taken to heart – technically I have 2 mailing lists:

1. the RSS feed for this blog, which sends out new blog posts to a mailing list for folks who don’t use RSS (which 3000 of you get)
2. https://wizardzines.com's list, for comics / new zine announcements (780 people subscribe to that! thank you!)

but definitely neither of them is a Machine For Making Sales and I’ve put almost no effort in that direction yet. Here are a few things I’ve noticed about marketing mailing lists:

• most marketing mailing lists are boring, but some marketing mailing lists are actually interesting!
For example, I kind of like amy hoy’s emails.
• Someone told me recently that they have 200,000 people on their mailing list (?!!), which made the “a mailing list is a machine for making money” concept make a lot more sense to me. I wonder if people who make a lot of money from their mailing lists all have huge 10k+ person mailing lists like this?

### what works for me: twitter

Right now for my zines business I’d guess maybe 70% of my sales come from Twitter. The main thing I do is tweet pages from zines I’m working on (for example: yesterday’s comic about ss). The comics are usually good and fun, so invariably they get tons of retweets, which means that I end up with lots of followers, which means that when I later put up the zine for sale lots of people will buy it.

And of course people don’t have to buy the zines – I post most of what ends up in my zines on twitter for free – so it feels like a nice way to do it. Everybody wins, I think.

(side note: when I started getting tons of new followers from my comics I was actually super worried that it would make my experience of Twitter way worse. That hasn’t happened! The new followers all seem totally reasonable and I still get a lot of really interesting twitter replies, which is wonderful ❤)

I don’t try to hack/optimize this really: I just post comics when I make them and I try to make them good.

### a small Twitter innovation: putting my website on the comics

Here’s one small marketing change that I made that I think makes sense! In the past, I didn’t put anything about how to buy my comics on the comics I posted on Twitter, just my Twitter username. Like this:

After a while, I realized people were asking me all the time “hey, can I buy a book/collection? where do these come from? how do I get more?”! I think a marketing secret is “people actually want to buy things that are good; it is useful to tell people where they can buy things that are good”.
So just recently I’ve started adding my website and a note about my current project to the comics I post on Twitter. It doesn’t say much: just “❤ these comics? buy a collection! wizardzines.com” and “page 11 of my upcoming bite size networking zine”. Here’s what it looks like:

I feel like this strikes a pretty good balance between “julia, you need to tell people what you’re doing, otherwise how are they supposed to buy things from you” and “omg too many sales pitches everywhere”? I’ve only started doing this recently so we’ll see how it goes.

### should I work on a mailing list?

It seems like the same thing that works on twitter would work by email if I wanted to put in the time (email people comics! when a zine comes out, email them about the zine and they can buy it if they want!).

One thing I LOVE about Twitter though is that people always reply to the comics I post with their own tips and tricks that they love and I often learn something new. I feel like email would be nowhere near as fun :)

But I still think this is a pretty good idea: keeping up with twitter can be time consuming and I bet a lot of people would like to get occasional emails with programming drawings. (would you?)

One thing I’m not sure about is – a lot of marketing mailing lists seem to use somewhat aggressive techniques to get new emails (a lot of popups on a website, or adding everyone who signs up to their service / buys a thing to a marketing list), and while I’m basically fine with that (unsubscribing is easy!), I’m not sure that it’s what I’d want to do. Maybe less aggressive techniques will work just as well? We’ll see.

### should I track conversion rates?

A piece of marketing advice I assume people give a lot is “be data driven, figure out what things convert the best, etc”. I don’t do this almost at all – gumroad used to tell me that most of my sales came from Twitter, which was good to know, but right now I have basically no idea how it works.
Doing a bunch of work to track conversion rates feels bad to me: it seems like it would be really easy to go down a dumb rabbit hole of “oh, let’s try to increase conversion by 5%” instead of just focusing on making really good and cool things. My guess is that what will work best for me for a while is to have some data that tells me in broad strokes how the business works (like “about 70% of sales come from twitter”) and just leave it at that.

### should I do advertising?

I had a conversation with Kamal about this post that went:

• julia: “hmm, maybe I should talk about ads?”
• julia: “wait, are ads marketing?”
• kamal: “yes, ads are marketing”

So, ads! I don’t know anything about advertising except that you can advertise on Facebook or Twitter or Google. Some non-ethical questions I have about advertising:

• how do you choose what keywords to advertise on?
• are there actually cheap keywords, like is ‘file descriptors’ cheap?
• how much do you need to pay per click? (for some weird linux keywords, google estimated 20 cents a click?)
• can you use ads effectively for something that costs $10?

This seems nontrivial to learn about and I don’t think I’m going to try soon.

### other marketing things

a few other things I’ve thought about:

• I learned about “social proof marketing” sites like fomo.com yesterday which makes popups on your site like “someone bought COOL THING 3 hours ago”. This seems like it has some utility (people are actually buying things from me all the time, maybe that’s useful to share somehow?) but those popups feel a bit cheap to me and I don’t really think it’s something I’d want to do right now.
• similarly a lot of sites like to inject these popups like “HELLO PLEASE SIGN UP FOR OUR MAILING LIST”. similar thoughts. I’ve been putting an email signup link in the footer which seems like a good balance between discoverable and annoying. As an example of a popup which isn’t too intrusive, though: nate berkopec has one on his site which feels really reasonable! (scroll to the bottom to see it)

Maybe marketing is all about “make your things discoverable without being annoying”? :)

### that’s all!

Hopefully some of this was interesting! Obviously the most important thing in all of this is to make cool things that are useful to people, but I think cool useful writing does not actually sell itself!

If you have thoughts about what kinds of marketing have worked well for you / you’ve felt good about I would love to hear them!

### Some nonparametric statistics math

I’m trying to understand nonparametric statistics a little more formally. This post may not be that intelligible because I’m still pretty confused about nonparametric statistics, there is a lot of math, and I make no attempt to explain any of the math notation. I’m working towards being able to explain this stuff in a much more accessible way but first I would like to understand some of the math!

There’s some MathJax in this post so the math may or may not render in an RSS reader.

Some questions I’m interested in:

• what is nonparametric statistics exactly?
• what guarantees can we make? are there formulas we can use?
• why do methods like the bootstrap method work?

since these notes are from reading a math book and math books are extremely dense this is basically going to be “I read 7 pages of this math book and here are some points I’m confused about”

### what’s nonparametric statistics?

Today I’m looking at “all of nonparametric statistics” by Larry Wasserman. He defines nonparametric inference as:

a set of modern statistical methods that aim to keep the number of underlying assumptions as weak as possible

Basically my interpretation of this is that – instead of assuming that your data comes from a specific family of distributions (like the normal distribution) and then trying to estimate the parameters of that distribution, you don’t make many assumptions about the distribution (“this is just some data!!“). Not having to make assumptions is nice!

That’s not to say there are no assumptions, though – he says:

we assume that the distribution $F$ lies in some set $\mathfrak{F}$ called a statistical model. For example, when estimating a density $f$, we might assume that $$f \in \mathfrak{F} = \left\{ g : \int(g^{\prime\prime}(x))^2dx \leq c^2 \right\}$$ which is the set of densities that are not “too wiggly”.

I have not too much intuition for the condition $\int(g^{\prime\prime}(x))^2dx \leq c^2$. I calculated that integral for the normal distribution on wolfram alpha and got 4, which is a good start. (4 is not infinity!)

• what’s an example of a probability density function that doesn’t satisfy that $\int(g^{\prime\prime}(x))^2dx \leq c^2$ condition? (probably something with an infinite number of tiny wiggles, and I don’t think any distribution i’m interested in in practice would have an infinite number of tiny wiggles?)
• why does the density function being “too wiggly” cause problems for nonparametric inference? very unclear as yet.

### we still have to assume independence

One assumption we won’t get away from is that the samples in the data we’re dealing with are independent. Often data in the real world actually isn’t really independent, but I think what people do a lot of the time is make a good effort at something approaching independence and then close their eyes and pretend it is?

### estimating the density function

Okay! Here’s a useful section! Let’s say that I have 100,000 data points from a distribution. I can draw a histogram like this of those data points:

If I have 100,000 data points, it’s pretty likely that that histogram is pretty close to the actual distribution. But this is math, so we should be able to make that statement precise, right?

For example suppose that 5% of the points in my sample are more than 100. Is the probability that a point is greater than 100 actually 0.05? The book gives a nice formula for this:

$$\mathbb{P}(|\widehat{P}_n(A) - P(A)| > \epsilon ) \leq 2e^{-2n\epsilon^2}$$

(by “Hoeffding’s inequality” which I’ve never heard of before). Fun aside about that inequality: here’s a nice jupyter notebook by henry wallace using it to identify the most common Boggle words.

here, in our example:

• n is 100,000 (the number of data points we have)
• $A$ is the set of points more than 100
• $\widehat{P}_n(A)$ is the empirical probability that a point is more than 100 (0.05)
• $P(A)$ is the actual probability
• $\epsilon$ is how far off we’re willing to be (the width of the error we’ll tolerate)

So, what’s the probability that the real probability is outside the range 0.04 to 0.06? With $\epsilon = 0.01$, it’s at most $2e^{-2 \times 100{,}000 \times (0.01)^2} \approx 4 \times 10^{-9}$ (according to wolfram alpha)

here is a table of how sure we can be:

• 100,000 data points: 4e-9 (TOTALLY CERTAIN that 4% - 6% of points are more than 100)
• 10,000 data points: 0.27 (27% probability that we’re wrong! that’s… not bad?)
• 1,000 data points: 1.6 (we know the probability we’re wrong is less than… 160%? that’s not good!)
• 100 data points: lol

so basically, in this case, using this formula: 100,000 data points is AMAZING, 10,000 data points is pretty good, and 1,000 is much less useful. If we have 1000 data points and we see that 5% of them are more than 100, we DEFINITELY CANNOT CONCLUDE that 4% to 6% of points are more than 100. But (using the same formula) we can use $\epsilon = 0.04$ and conclude that with 92% probability 1% to 9% of points are more than 100. So we can still learn some stuff from 1000 data points!
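Plugging numbers into the bound is a one-liner; this little sketch (my code, not from the book) reproduces the table above:

```python
import math

def hoeffding_bound(n, eps):
    # Hoeffding: P(|empirical prob - true prob| > eps) <= 2 * exp(-2 * n * eps^2)
    return 2 * math.exp(-2 * n * eps ** 2)

# the table: how likely is it that the true probability is NOT within
# 0.01 of the empirical 0.05?
for n in (100_000, 10_000, 1_000):
    print(n, hoeffding_bound(n, 0.01))

# with 1,000 points, widening epsilon to 0.04 gives a useful bound again
print(hoeffding_bound(1_000, 0.04))  # ~0.08, i.e. right with ~92% probability
```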

This intuitively feels pretty reasonable to me – like it makes sense to me that if you have NO IDEA what your distribution is, with 100,000 points you’d be able to make quite strong inferences, and with 1000 you can do a lot less!

### more data points are exponentially better?

One thing that I think is really cool about this formula is that how sure you can be of your inferences scales exponentially with the size of your dataset (this is the $e^{-2n\epsilon^2}$). And also exponentially with the square of the precision you want (so wanting to be sure within 0.01 is VERY DIFFERENT than within 0.04). So 100,000 data points isn’t 10x better than 10,000 data points, it’s actually like 10000000000000x better.

Is that true in other places? If so that seems like a super useful intuition! I still feel pretty uncertain about this, but having some basic intuition about “how much more useful is 10,000 data points than 1,000 data points?“) feels like a really good thing.

### some math about the bootstrap

The next chapter is about the bootstrap! Basically the way the bootstrap works is:

1. you want to estimate some statistic (like the median) of your distribution
2. the bootstrap lets you get an estimate and also the variance of that estimate
3. you do this by repeatedly sampling with replacement from your data and then calculating the statistic you want (like the median) on your samples

I’m not going to go too much into how to implement the bootstrap method because it’s explained in a lot of places on the internet. Let’s talk about the math!
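(That said, the three steps above fit in a few lines. Here’s a minimal sketch for the median – my illustration with made-up normally-distributed data, standard library only:)

```python
import random
import statistics

def bootstrap_median(data, n_resamples=1000, seed=0):
    rng = random.Random(seed)
    # step 3: resample with replacement, recompute the median each time
    medians = [
        statistics.median(rng.choices(data, k=len(data)))
        for _ in range(n_resamples)
    ]
    # step 2: the estimate itself, plus the variance of the resampled
    # medians as an estimate of the variance of that estimate
    return statistics.median(data), statistics.variance(medians)

rng = random.Random(42)
data = [rng.gauss(0, 1) for _ in range(500)]
estimate, variance = bootstrap_median(data)
```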

I think in order to say anything meaningful about bootstrap estimates I need to learn a new term: a consistent estimator.

### What’s a consistent estimator?

Wikipedia says:

In statistics, a consistent estimator or asymptotically consistent estimator is an estimator — a rule for computing estimates of a parameter $\theta_0$ — having the property that as the number of data points used increases indefinitely, the resulting sequence of estimates converges in probability to $\theta_0$.

This includes some terms where I forget what they mean (what’s “converges in probability” again?). But this seems like a very good thing! If I’m estimating some parameter (like the median), I would DEFINITELY LIKE IT TO BE TRUE that if I do it with an infinite amount of data then my estimate works. An estimator that is not consistent does not sound very useful!
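(For reference, the standard definition – this is textbook material, not from the Wikipedia excerpt above: $\widehat{\theta}_n$ converges in probability to $\theta_0$ if for every $\epsilon > 0$,

$$\lim_{n \rightarrow \infty} \mathbb{P}\left(|\widehat{\theta}_n - \theta_0| > \epsilon\right) = 0$$

i.e. the chance that the estimate is off by more than any fixed amount shrinks to zero as the dataset grows.)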

### why/when are bootstrap estimators consistent?

spoiler: I have no idea. The book says the following:

Consistency of the bootstrap can now be expressed as follows.

3.19 Theorem. Suppose that $\mathbb{E}(X_1^2) < \infty$. Let $T_n = g(\overline{X}_n)$ where $g$ is continuously differentiable at $\mu = \mathbb{E}(X_1)$ and $g^{\prime}(\mu) \neq 0$. Then,

$$\sup_u \left| \mathbb{P}_{\widehat{F}_n} \left( \sqrt{n} \left( T(\widehat{F}_n^*) - T(\widehat{F}_n) \right) \leq u \right) - \mathbb{P}_{F} \left( \sqrt{n} \left( T(\widehat{F}_n) - T(F) \right) \leq u \right) \right| \xrightarrow{\text{a.s.}} 0$$

3.21 Theorem. Suppose that $T(F)$ is Hadamard differentiable with respect to $d(F,G) = \sup_x |F(x) - G(x)|$ and that $0 < \int L^2_F(x) dF(x) < \infty$. Then,

$$\sup_u \left| \mathbb{P}_{\widehat{F}_n} \left( \sqrt{n} \left( T(\widehat{F}_n^*) - T(\widehat{F}_n) \right) \leq u \right) - \mathbb{P}_{F} \left( \sqrt{n} \left( T(\widehat{F}_n) - T(F) \right) \leq u \right) \right| \xrightarrow{\text{P}} 0$$

things I understand about these theorems:

• the two formulas they’re concluding are the same, except I think one is about convergence “almost surely” and one about “convergence in probability”. I don’t remember what either of those mean.
• I think for our purposes of doing Regular Boring Things we can replace “Hadamard differentiable” with “differentiable”
• I think they don’t actually show the consistency of the bootstrap, they’re actually about consistency of the bootstrap confidence interval estimate (which is a different thing)

I don’t really understand how they’re related to consistency, and in particular the $\sup_u$ thing is weird, like if you’re looking at $\mathbb{P}(something < u)$, wouldn’t you want to minimize $u$ and not maximize it? Maybe it’s a typo and it should be $\inf_u$?

it concludes:

there is a tendency to treat the bootstrap as a panacea for all problems. But the bootstrap requires regularity conditions to yield valid answers. It should not be applied blindly.

### this book does not seem to explain why the bootstrap is consistent

In the appendix (3.7) it gives a sketch of a proof for showing that estimating the median using the bootstrap is consistent. I don’t think this book actually gives a proof anywhere that bootstrap estimates in general are consistent, which was pretty surprising to me. It gives a bunch of references to papers. Though I guess bootstrap confidence intervals are the most important thing?

### that’s all for now

This is all extremely stream of consciousness and I only spent 2 hours trying to work through this, but some things I think I learned in the last couple hours are:

1. maybe having more data is exponentially better? (is this true??)
2. “consistency” of an estimator is a thing, not all estimators are consistent
3. understanding when/why nonparametric bootstrap estimators are consistent in general might be very hard (the proof that the bootstrap median estimator is consistent already seems very complicated!)
4. bootstrap confidence intervals are not the same thing as bootstrap estimators. Maybe I’ll learn the difference next!

### 2018: Year in review

I wrote these in 2015 and 2016 and 2017 and it’s always interesting to look back at them, so here’s a summary of what went on in my side projects in 2018.

### ruby profiler!

At the beginning of this year I wrote rbspy (docs: https://rbspy.github.io/). It inspired a Python version called py-spy and a PHP profiler called phpspy, both of which are excellent. I think py-spy in particular is probably better than rbspy which makes me really happy.

Writing a program that does something innovative (top for your Ruby program’s functions!) and inspiring other people to make amazing new tools is something I’m really proud of.

A very surprising thing that happened in 2018 is that I started a business! This is the website: https://wizardzines.com/, and I sell programming zines.

It’s been astonishingly successful (it definitely made me enough money that I could have lived on just the revenue from the business this year), and I’m really grateful to everyone who’s supported that work. I hope the zines have helped you. I always thought that it was impossible to make anywhere near as much money teaching people useful things as I can as a software developer, and now I think that’s not true. I don’t think that I’d want to make that switch (I like working as a programmer!), but now I actually think that if I was serious about it and was interested in working on my business skills, I could probably make it work.

I don’t really know what’s next, but I plan to write at least one zine next year. I learned a few things about business this year, mainly from reading business books.

I used to think that sales / marketing had to be gross, but reading some business books made me think that it’s actually possible to run a business by being honest & just building good things.

### work!

this is mostly about side projects, but a few things about work:

• I still have the same manager (jay). He’s been really great to work with. The help! i have a manager! zine is secretly largely things I learned from working with him.
• my team made some big networking infrastructure changes and it went pretty well. I learned a lot about proxies/TLS and a little bit about C++.
• I mentored another intern, and the intern I mentored last year joined us full time!

When I go back to work I’m going to switch to working on something COMPLETELY DIFFERENT (writing code that sends messages to banks!) for 3 months. It’s a lot closer to the company’s core business, and I think it’ll be neat to learn more about how financial infrastructure works.

I struggled a bit with understanding/defining my job this year. I wrote What’s a senior engineer’s job? about that, but I have not yet reached enlightenment.

### talks!

I gave 4 talks in 2018:

• So you want to be a wizard at StarCon
• Building a Ruby profiler at the Recurse Center’s localhost series
• Build Impossible Programs in May at Deconstruct.
• High Reliability Infrastructure Migrations at Kubecon. I’m pretty happy about this talk because I’ve wanted to give a good talk about what I do at work for a long time and I think I finally succeeded. Previously when I gave talks about my work I think I fell into the trap of just describing what we do (“we do X Y Z” … “okay, so what?”). With this one, I think I was able to actually say things that were useful to other people.

In past years I’ve mostly given talks which can mostly be summarized “here are some cool tools” and “here is how to learn hard things”. This year I changed focus to giving talks about the actual work I do – there were two talks about building a Ruby profiler, and one about what I do at work (I spend a lot of time on infrastructure migrations!)

I’m not sure whether I’ll give any talks in 2019. I travelled more than I wanted to in 2018, and to stay sane I ended up having to cancel a talk I was planning to give on relatively short notice, which wasn’t good.

### podcasts!

I also experimented a bit with a new format: the podcast! The podcasts I did were basically all really fun! They don’t take that long (about 2 hours total?).

what I learned about doing podcasts:

• It’s really important to give the hosts a list of good questions to ask, and to be prepared to give good answers to those questions! I’m not a super polished podcast guest.
• you need a good microphone. At least one podcast host told me I actually couldn’t be on their podcast unless I had a good enough microphone, so I bought a medium fancy microphone. It wasn’t too expensive and it’s nice to have a better quality microphone! Maybe I will use it more to record audio/video at some point!

### !!Con

I co-organized !!Con for the 4th time – I ran sponsorships. It’s always such a delight and the speakers are so great.

!!Con is expanding to the west coast in 2019 – I’m not directly involved with that but it’s going to be amazing.

### blog posts!

I apparently wrote 54 blog posts in 2018. A few of my favourites are What’s a senior engineer’s job?, How to teach yourself hard things, and batch editing files with ed.

There were basically 4 themes in blogging for 2018:

• progress on the rbspy project while I was working on it (this category)
• computer networking / infrastructure engineering (basically all I did at work this year was networking, though I didn’t write about it as much as I might have)
• musings about zines / business / developer education, for instance why sell zines? and who pays to educate developers?
• a few of the usual “how do you learn things” / “how do you succeed at your job” posts as I figure things out about that, for instance working remotely, 4 years in

### a tiny inclusion project: a guide to performance reviews

Last year in addition to my actual job, I did a couple of projects at work towards helping make sure the performance/promotion process works well for folks – I collaborated with the amazing karla on the idea of a “brag document”, and redid our engineering levels.

This year, in the same vein, I wrote a document called the “Unofficial guide to the performance reviews”. A lot of folks said it helped them but probably it’s too early to celebrate. I think explaining to folks how the performance review process actually works and how to approach it is really valuable and I might try to publish a more general version here at some point.

I like that I work at a place where it’s possible/encouraged to do projects like this. I spend a relatively small amount of time on them (maybe I spent 15 hours on this one?) but it feels good to be able to make tiny steps towards building a better workplace from time to time. It’s really hard to judge the results though!

### conclusions?

some things that worked in 2018:

• setting boundaries around what my job is
• doing open source work while being paid for it
• doing small inclusion projects at work
• writing zines is very time consuming but I feel happy about the time I spent on that
• blogging is always great

### New talk: High Reliability Infrastructure Migrations

On Tuesday I gave a talk at KubeCon called High Reliability Infrastructure Migrations. The abstract was:

For companies with high availability requirements (99.99% uptime or higher), running new software in production comes with a lot of risks. But it’s possible to make significant infrastructure changes while maintaining the availability your customers expect! I’ll give you a toolbox for derisking migrations and making infrastructure changes with confidence, with examples from our Kubernetes & Envoy experience at Stripe.

## video

### slides

Here are the slides:

since everyone always asks, I drew them in the Notability app on an iPad. I do this because it’s faster than trying to use regular slides software and I can make better slides.

## a few notes

Here are a few links & notes about things I mentioned in the talk

### skycfg: write functions, not YAML

I talked about how my team is working on non-YAML interfaces for configuring Kubernetes. The demo is at skycfg.fun, and it’s on GitHub here. It’s based on Starlark, a configuration language that’s a subset of Python.

My coworker John has promised that he’ll write a blog post about it at some point, and I’m hoping that’s coming soon :)

### no haunted forests

I mentioned a deploy system rewrite we did. John has a great blog post about when rewrites are a good idea and how he approached that rewrite called no haunted forests.

### ignore most kubernetes ecosystem software

One small point that I made in the talk was that on my team we ignore almost all software in the Kubernetes ecosystem so that we can focus on a few core pieces (Kubernetes & Envoy, plus some small things like kiam). I wanted to mention this because I think often in Kubernetes land it can seem like everyone is using Cool New Things (helm! istio! knative! eep!). I’m sure those projects are great but I find it much simpler to stay focused on the basics and I wanted people to know that it’s okay to do that if that’s what works for your company.

I think the reality is that actually a lot of folks are still trying to work out how to use this new software in a reliable and secure way.

### other talks

I haven’t watched other Kubecon talks yet, but here are 2 links:

I heard good things about this keynote from melanie cebula about kubernetes at airbnb, and I’m excited to see this talk about kubernetes security. The slides from that security talk look useful.

Also I’m very excited to see Kelsey Hightower’s keynote as always, but that recording isn’t up yet. If you have other Kubecon talks to recommend I’d love to know what they are.

### my first work talk I’m happy with

I usually give talks about debugging tools, or side projects, or how I approach my job at a high level – not on the actual work that I do at my job. What I talked about in this talk is basically what I’ve been learning how to do at work for the last ~2 years. Figuring out how to make big infrastructure changes safely took me a long time (and I’m not done!), and so I hope this talk helps other folks do the same thing.

### How do you document a tech project with comics?

Every so often I get email from people saying basically “hey julia! we have an open source project! we’d like to use comics / zines / art to document our project! Can we hire you?”.

spoiler: the answer is “no, you can’t hire me” – I don’t do commissions. But I do think this is a cool idea and I’ve often wished I had something more useful to say to people than “no”, so if you’re interested in this, here are some ideas about how to accomplish it!

### zine != drawing

First, a terminology distinction. One weird thing I’ve noticed is that people frequently refer to individual tech drawings as “zines”. I think this is due to me communicating poorly somehow, but – drawings are not zines! A zine is a printed booklet, like a small magazine. You wouldn’t call a photo of a model in Vogue a magazine! The magazine has like a million pages! An individual drawing is a drawing/comic/graphic/whatever. Just clarifying this because I think it causes a bit of unnecessary confusion.

### comics without good information are useless

Usually when folks ask me “hey, could we make a comic explaining X”, it doesn’t seem like they have a clear idea of what information exactly they want to get across, they just have a vague idea that maybe it would be cool to draw some comics. This makes sense – figuring out what information would be useful to tell people is very hard!! It’s 80% of what I spend my time on when making comics.

You should think about comics the same way as any kind of documentation – start with the information you want to convey, who your target audience is, and how you want to distribute it (twitter? on your website? in person?), and figure out how to illustrate it after :). The information is the main thing, not the art!

Once you have a clear story about what you want to get across, you can start trying to think about how to represent it using illustrations!

### focus on concepts that don’t change

Drawing comics is a much bigger investment than writing documentation (it takes me like 5x longer to convey the same information in a comic than in writing). So use it wisely! Because it’s not that easy to edit, if you’re going to make something a comic you want to focus on concepts that are very unlikely to change. So talk about the core ideas in your project instead of the exact command line arguments it takes!

Here are a couple of options for how you could use comics/illustrations to document your project!

### option 1: a single graphic

One format you might want to try is a single, small graphic explaining what your project is about and why folks might be interested in it. For example: this zulip comic

This is a short thing, you could post it on Twitter or print it as a pamphlet to give out. The information content here would probably be basically what’s on your project homepage, but presented in a more fun/exciting way :)

You can put a pretty small amount of information in a single comic. With that Zulip comic, the things I picked out were:

• zulip is sort of like slack, but it has threads
• it’s easy to keep track of threads even if the conversation takes place over several days
• you can much more easily selectively catch up with Zulip
• zulip is open source
• there’s an open zulip server you can try out

That’s not a lot of information! It’s 50 words :). So to do this effectively you need to distill your project down to 50 words in a way that’s still useful. It’s not easy!

### option 2: many comics

Another approach you can take is to make a more in depth comic / illustration, like google’s guide to kubernetes or the children’s illustrated guide to kubernetes.

To do this, you need a much stronger concept than “uh, I want to explain our project” – you want to have a clear target audience in mind! For example, if I were drawing a set of Docker comics, I’d probably focus on folks who want to use Docker in production. So I’d want to discuss:

• publishing your containers to a public/private registry
• some best practices for tagging your containers
• how to use layers to save on disk space / download less stuff
• whether it’s reasonable to run the same containers in production & in dev

That’s totally different from the set of comics I’d write for folks who just want to use Docker to develop locally!

### option 3: a printed zine

The main thing that differentiates this from “many comics” is that zines are printed! Because of that, for this to make sense you need to have a place to give out the printed copies! Maybe you’re going to present your project at a major conference? Maybe you give workshops about your project and want to give out the zine to folks in the workshop as notes? Maybe you want to mail it to people?

There are basically 3 ways to get the drawings made:

1. Hire someone who both understands (or can quickly learn) the technology you want to document and can illustrate well. These folks are tricky to find and probably expensive (I certainly wouldn’t do a project like this for less than 10,000 even if I did do commissions), just because programmers can usually charge a pretty high consulting rate. I’d guess that the main failure mode here is that it might be impossible/very hard to find someone, and it might be expensive.
2. Collaborate with an illustrator to draw it for you. The main failure mode here is that if you don’t give the illustrator clear explanations of your tech to work with, you.. won’t end up with a clear and useful explanation. From what I’ve seen, most folks underinvest in writing clear explanations for their illustrators – I’ve seen a few really adorable tech comics that I don’t find useful or clear at all. I’d love to see more people do a better job of this. What’s the point of having an adorable illustration if it doesn’t teach anyone anything? :)
3. Draw it yourself :). This is what I do, obviously. stick figures are okay!

Most people seem to use method #2 – I’m not actually aware of any tech folks who have done commissioned comics (though I’m sure it’s happened!). I think method #2 is a great option and I’d love to see more folks do it. Paying illustrators is really fun!

### An example of how C++ destructors are useful in Envoy

For a while now I’ve been working with a C++ project (Envoy), and sometimes I need to contribute to it, so my C++ skills have gone from “nonexistent” to “really minimal”. I’ve learned what an initializer list is and that a method starting with ~ is a destructor. I almost know what an lvalue and an rvalue are, but not quite. But the other day when writing some C++ code I figured out something exciting about how to use destructors that I hadn’t realized!
(the tl;dr of this post for people who know C++ is “julia finally understands what RAII is and that it is useful” :))

### what’s a destructor?

C++ has objects. When a C++ object goes out of scope, the compiler inserts a call to its destructor. So if you have some code like

    int do_thing() {
        Thing x{}; // this calls the Thing constructor
        return 2;
    }

there will be a call to x’s destructor at the end of the do_thing function. So the code C++ generates looks something like:

• make a new Thing
• call the new Thing’s destructor
• return 2

Obviously destructors are way more complicated than this. They need to get called when there are exceptions! And sometimes they get called manually. And for lots of other reasons too. But there are 10 million things to know about C++ and that is not what we’re doing today, we are just talking about one thing.

### what happens in a destructor?

A lot of the time memory gets freed, which is how you avoid having memory leaks. But that’s not what we’re talking about in this post! We are talking about something more interesting.

### the thing we’re interested in: Envoy circuit breakers

So I’ve been working with Envoy a lot. 3 second Envoy refresher: it’s a HTTP proxy, your application makes requests to Envoy, which then proxies the request to the servers the application wants to talk to.

One very useful feature Envoy has is this thing called “circuit breakers”. Basically the idea is that if your application makes 50 billion connections to a service, that will probably overwhelm the service. So Envoy keeps track of how many TCP connections you’ve made to a service, and will stop you from making new requests if you hit the limit (there’s a default max_connection limit).

### how do you track connection count?

To maintain a circuit breaker on the number of TCP connections, you need to keep an accurate count of how many TCP connections are currently open! How do you do that?
Well, the way it works is to maintain a connections counter and:

• every time a connection is opened, increment the counter
• every time a connection is destroyed (because of a reset / timeout / whatever), decrement the counter
• when creating a new connection, check that the connections counter is not over the limit

that’s all! And incrementing the counter when creating a new connection is pretty easy. But how do you make sure that the counter gets decremented when the connection is destroyed? Connections can be destroyed in a lot of ways (they can time out! they can be closed by Envoy! they can be closed by the server! maybe something else I haven’t thought of could happen!) and it seems very easy to accidentally miss a way of closing them.

### destructors to the rescue

The way Envoy solves this problem is to create a connection object (called ActiveClient in the HTTP connection pool) for every connection. Then it:

• increments the counter in the constructor (code)
• decrements the counter in the destructor (code)
• checks the counter when a new connection is created (code)

The beauty of this is that now you don’t need to make sure that the counter gets decremented in all the right places, you just need to organize your code so that the ActiveClient object’s destructor gets called when the connection has closed.

Where does the ActiveClient destructor get called in Envoy? Well, Envoy maintains 2 lists of clients (ready_clients and busy_clients), and when a connection gets closed, Envoy removes the client from those lists. And when it does that, it doesn’t need to do any extra cleanup!! In C++, anytime an object is removed from a list, its destructor is called. So client.removeFromList(ready_clients_); takes care of all the cleanup. And there’s no chance of forgetting to decrement the counter!!
It will definitely always happen unless you accidentally leave the object on one of these lists, which would be a bug anyway because the connection is closed :)

### RAII

This pattern Envoy is using here is an extremely common C++ programming pattern called “resource acquisition is initialization”. I find that name very confusing but that’s what it’s called. Basically the way it works is:

• identify a resource (like “connection”) where a lot of things need to happen when the connection is initialized / finished
• make a class for that connection
• put all the initialization / finishing code in the constructor / destructor
• make sure the object’s destructor method gets called when appropriate! (by removing it from a vector / having it go out of scope)

Previously I knew about using this pattern for kind of obvious things (make sure all the memory gets freed in the destructor, or make sure file descriptors get closed). But I didn’t realize it was also useful for cases that are slightly less obviously a resource, like “decrement a counter”.

The reason this pattern works is that the C++ compiler/standard library does a bunch of work to make sure that destructors get called when you’re done with an object – the compiler inserts destructor calls at the end of each block of code and after exceptions, and many standard library collections make sure destructors are called when you remove an object from a collection.

### RAII gives you prompt, deterministic, and hard-to-screw-up cleanup of resources

The exciting thing here is that this programming pattern gives you a way to schedule cleaning up resources that’s:

• easy to ensure always happens (when the object goes away, it always happens, even if there was an exception!)
• prompt & deterministic (it happens right away and it’s guaranteed to happen!)

### what languages have RAII?

C++ and Rust have RAII. Probably other languages too. Java, Python, Go, and garbage collected languages in general do not.
In a garbage collected language you can often set up destructors to be run when the object is GC’d. But often (like in this case, with the connection count) you want things to be cleaned up right away when the object is no longer in use, not some indeterminate period later whenever the GC happens to run.

Python context managers are a related idea; you could do something like:

    with conn_pool.connection() as conn:
        # do stuff
        ...

### that’s all for now!

Hopefully this explanation of RAII is interesting and mostly correct. Thanks to Kamal for clarifying some RAII things for me!

### Some notes on running new software in production

I’m working on a talk for kubecon in December! One of the points I want to get across is the amount of time/investment it takes to use new software in production without causing really serious incidents, and what that’s looked like for us in our use of Kubernetes.

To start out, this post isn’t blanket advice. There are lots of times when it’s totally fine to just use software and not worry about how it works exactly. So let’s start by talking about when it’s important to invest.

### when it matters: 99.99%

If you’re running a service with a low SLO like 99%, I don’t think it matters that much to understand the software you run in production. You can be down for like 2 hours a month! If something goes wrong, just fix it and it’s fine.

At 99.99%, it’s different. That’s 45 minutes / year of downtime, and if you find out about a serious issue for the first time in production it could easily take you 20 minutes or more to revert the change. That’s half your uptime budget for the year!

### when it matters: software that you’re using heavily

Also, even if you’re running a service with a 99.99% SLO, it’s impossible to develop a super deep understanding of every single piece of software you’re using. For example, a web service might use:

• 100 library dependencies
• the filesystem (so there’s linux filesystem code!)
• the network (linux networking code!)
• a database (like postgres)
• a proxy (like nginx/haproxy)

If you’re only reading like 2 files from disk, you don’t need to do a super deep dive into Linux filesystem internals, you can just read the file from disk.

What I try to do in practice is identify the components which we rely on the most (or have the most unusual use cases for!), and invest time into understanding those. These are usually pretty easy to identify because they’re the ones which will cause the most problems :)

### when it matters: new software

Understanding your software especially matters for newer/less mature software projects, because it’s more likely to have bugs or just not have matured enough to be used by most people without having to worry. I’ve spent a bunch of time recently with Kubernetes/Envoy, which are both relatively new projects, and neither of them is remotely in the category of “oh, it’ll just work, don’t worry about it”. I’ve spent many hours debugging weird surprising edge cases with both of them and learning how to configure them in the right way.

### a playbook for understanding your software

The playbook for understanding the software you run in production is pretty simple. Here it is:

1. Start using it in production in a non-critical capacity (by sending a small percentage of traffic to it, on a less critical service, etc)
2. Let that bake for a few weeks.
3. Run into problems.
4. Fix the problems. Go to step 3.

Repeat until you feel like you have a good handle on this software’s failure modes and are comfortable running it in a more critical capacity. Let’s talk about that in a little more detail, though:

### what running into bugs looks like

For example, I’ve been spending a lot of time with Envoy in the last year.
Some of the issues we’ve seen along the way are (in no particular order):

• One of the default settings resulted in retry & timeout headers not being respected
• Envoy (as a client) doesn’t support TLS session resumption, so servers with a large number of Envoy clients get DDOSed by TLS handshakes
• Envoy’s active healthchecking means that your services get healthchecked by every client. This is mostly okay but (again) services with many clients can get overwhelmed by it.
• Having every client independently healthcheck every server interacts somewhat poorly with services which are under heavy load, and can exacerbate performance issues by removing up-but-slow clients from the load balancer rotation.
• Envoy doesn’t retry failed connections by default
• it frequently segfaults when given incorrect configuration
• various issues with it segfaulting because of resource leaks / memory safety issues
• hosts running out of disk space because we didn’t rotate Envoy log files often enough

A lot of these aren’t bugs – they’re just cases where we expected the default configuration to do one thing, and it did another thing. This happens all the time, and it can result in really serious incidents. Figuring out how to configure a complicated piece of software appropriately takes a lot of time, and you just have to account for that.

And Envoy is great software! The maintainers are incredibly responsive, they fix bugs quickly, and its performance is good. It’s overall been quite stable and it’s done well in production. But just because something is great software doesn’t mean you won’t also run into 10 or 20 relatively serious issues along the way that need to be addressed in one way or another. And it’s helpful to understand those issues before putting the software in a really critical place.

### try to have each incident only once

My view is that running new software in production inevitably results in incidents. The trick:

1. Make sure the incidents aren’t too serious (by making ‘production’ a less critical system first)
2. Whenever there’s an incident (even if it’s not that serious!!!), spend the time necessary to understand exactly why it happened and how to make sure it doesn’t happen again

My experience so far has been that it’s actually relatively possible to pull off “have every incident only once”. When we investigate issues and implement remediations, usually that issue never comes back. The remediation can be:

• a configuration change
• reporting a bug upstream and either fixing it ourselves or waiting for a fix
• a workaround (“this software doesn’t work with 10,000 clients? ok, we just won’t use it in cases where there are that many clients for now!”, “oh, a memory leak? let’s just restart it every hour”)

Knowledge-sharing is really important here too – it’s always unfortunate when one person finds an incident in production and fixes it, but doesn’t explain the issue to the rest of the team, so somebody else ends up causing the same incident again later because they didn’t hear about the original incident.

### Understand what is ok to break and what isn’t

Another huge part of understanding the software I run in production is understanding which parts are OK to break (aka “if this breaks, it won’t result in a production incident”) and which aren’t. This lets me focus: I can put big boxes around some components and decide “ok, if this breaks it doesn’t matter, so I won’t pay super close attention to it”.

For example, with Kubernetes:

ok to break:

• any stateless control plane component can crash or be cycled out or go down for 5 minutes at any time. If we had 95% uptime for the kubernetes control plane that would probably be fine, it just needs to be working most of the time.
• kubernetes networking (the system where you give every pod an IP address) can break as much as it wants, because we decided not to use it to start with

not ok:

• for us, if etcd goes down for 10 minutes, that’s ok. If it goes down for 2 hours, it’s not
• containers not starting or crashing on startup (IAM issues, docker not starting containers, bugs in the scheduler, bugs in other controllers) is serious and needs to be looked at immediately
• containers not having access to the resources they need (because of permissions issues, etc)
• pods being terminated unexpectedly by Kubernetes (if you configure kubernetes wrong it can terminate your pods!)

With Envoy, the breakdown is pretty different:

ok to break:

• if the envoy control plane goes down for 5 minutes, that’s fine (it’ll keep working with stale data)
• segfaults on startup due to configuration errors are sort of okay, because they manifest so early and they’re unlikely to surprise us (if the segfault doesn’t happen the 1st time, it shouldn’t happen the 200th time)

not ok:

• Envoy crashes / segfaults are not good – if it crashes, network connections don’t happen
• if the control server serves incorrect or incomplete data, that’s extremely dangerous and can result in serious production incidents. (so downtime is fine, but serving incorrect data is not!)

Neither of these lists is complete at all, but they’re examples of what I mean by “understand your software”.

### sharing ok to break / not ok lists is useful

I think these “ok to break” / “not ok” lists are really useful to share, because even if they’re not 100% the same for every user, the lessons are pretty hard won. I’d be curious to hear about your breakdown of what kinds of failures are ok / not ok for software you’re using!

Figuring out all the failure modes of a new piece of software and how they apply to your situation can take months.
(this is why when you ask your database team “hey can we just use NEW DATABASE” they look at you in such a pained way). So anything we can do to help other people learn faster is amazing!

### Tailwind: style your site without writing any CSS!

Hello! Over the last couple of days I put together a new website for my zines (https://wizardzines.com). To make this website, I needed to write HTML and CSS. Eep!!

Web design really isn’t my strong suit. I’ve been writing mediocre HTML/CSS for probably like 12 years now, and since I don’t do it at all in my job and am making no efforts to improve, the chances of my mediocre CSS skills magically improving are… not good.

But! I want to make websites sometimes, and it’s 2018! All websites need to be responsive! So even if I make a pretty minimalist site, it does need to at least sort of work on phones and tablets and desktops with lots of different screen sizes. I know about CSS and flexboxes and media queries, but in practice putting all of those things together is usually a huge pain.

I ended up making this site with Tailwind CSS, and it helped me make a site I felt pretty happy with, with my minimal CSS skills and just 2 evenings of work! The Tailwind author wrote a blog post called CSS Utility Classes and “Separation of Concerns” which you should very possibly read instead of this :).

### CSS zen garden: change your CSS, not your HTML

Until yesterday, my beliefs about writing good CSS were living in about 2003, with the CSS zen garden. The CSS zen garden was (and is! it’s still up!) this site which was like “hey everyone!! you can use CSS to style your websites instead of HTML tables! Just write nice semantic HTML and then you can accomplish anything you need to do with CSS! This is amazing!” They show it off by providing lots of different designs for the site, which all use exactly the same HTML. It’s a really fun & creative thing and it obviously made an impression, because I remember it like 10 years later.
And it makes sense! The idea is that you should write semantic HTML, kind of like this:

<div class="zen-resources" id="zen-resources">
  <h3 class="resources">Resources:</h3>
  ...
</div>

and then style those classes.

### writing CSS is not actually working for me

Even though I believe in this CSS zen garden semantic HTML ideal, writing CSS is not actually really working for me personally. I know some CSS basics – I know font-size and align and min-height and can even sort of use flexboxes and CSS grid. I can mostly center things. I made https://rbspy.github.io/ responsive by writing CSS. But I only write CSS probably every 4 months or something, and only for tiny personal sites, and in practice I always end up with some media query problem, sadly googling “how do I center div” for the 500th time. And everything ends up kind of poorly aligned, and eventually I get something that sort of works and hide under the bed.

### CSS frameworks where you don’t write CSS

So! There’s this interesting thing that has happened where now there are CSS frameworks where you don’t actually write any CSS at all to use them! Instead, you just add lots of CSS classes to each element to style it. It’s basically the opposite of the CSS zen garden – you have a single CSS file that you don’t change, and then you use 10 billion classes in your HTML to style your site.

Here’s an example from https://wizardzines.com/zines/manager/. This snippet puts images of the cover and the table of contents side by side.

<div class="flex flex-row flex-wrap justify-center">
  <div class="md:w-1/2 md:pr-4">
    <img src='cover.png'>
  </div>
  <div class="md:w-1/2">
    <a class="outline-none" href='/zines/manager/toc.png'>
      <img src='toc.png'>
    </a>
  </div>
</div>

Basically the outside div is a flexbox – flex means display: flex, flex-row means flex-direction: row, etc. Most (all?) of the classes apply exactly 1 line of CSS.
Here’s the ‘Buy’ button:

<a class="text-xl rounded bg-orange pt-1 pb-1 pr-4 pl-4 text-white hover:text-white no-underline leading-loose" href="https://gum.co/oh-shit-git">Buy for $10</a>


The Buy button breaks down as:

• pt, pb, pr, pl are padding
• text-white, hover:text-white are the text color
• no-underline is text-decoration: none
• leading-loose sets line-height: 1.5
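Each of those classes maps to roughly one line of CSS in Tailwind’s stylesheet. A sketch of the expansions (the exact values come from Tailwind’s default config and may differ slightly):

```css
/* approximate expansions of the Buy button's utility classes */
.text-xl      { font-size: 1.25rem; }
.rounded      { border-radius: 0.25rem; }
.bg-orange    { background-color: #f6993f; } /* default orange, approximate */
.pt-1         { padding-top: 0.25rem; }
.pb-1         { padding-bottom: 0.25rem; }
.pr-4         { padding-right: 1rem; }
.pl-4         { padding-left: 1rem; }
.text-white   { color: #fff; }
.no-underline { text-decoration: none; }
```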

### why it’s fun: easy media queries

Tailwind does a really nice thing with media queries: if you add a class like lg:pl-4, it means “add this padding, but only on screens that are ‘large’ or bigger”.

I love this because it’s really easy to experiment and I don’t need to go hunt through my media queries to make something look better on a different screen size! For example, for that image example above, I wanted to make the images display side by side, but only on biggish screens. So I could just add the class md:w-1/2, which makes the width 50% on screens bigger than ‘medium’.

<div class="md:w-1/2 md:pr-4">
  <img src='cover.png'>
</div>


Basically Tailwind contains CSS something like this (note that the : and / in the class name have to be escaped in the actual CSS selector):

@media screen and (min-width: 800px) {
  .md\:w-1\/2 {
    width: 50%;
  }
}


I thought it was interesting that all of the Tailwind media queries seem to be expressed in terms of min-width instead of max-width. It seems to work out okay.
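Using min-width everywhere makes the styles “mobile-first”: the unprefixed class applies at every size, and the prefixed variant only overrides it on larger screens. A sketch (the 768px breakpoint is an assumption, not necessarily Tailwind’s exact default):

```css
/* unprefixed class: applies at every screen size */
.w-full { width: 100%; }

/* md: variant only kicks in on screens at least "medium" wide */
@media screen and (min-width: 768px) {
  .md\:w-1\/2 { width: 50%; }
}
```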

### why it’s fun: it’s fast to iterate!

Usually when I write CSS I try to add classes in a vaguely semantic way to my code, style them with CSS, realize I made the wrong classes, and eventually end up with weird divs with the id “WRAPPER-WRAPPER-THING” or something in a desperate attempt to make something centered.

It feels incredibly freeing to not have to give any of my divs styles or IDs at all and just focus on thinking about how they should look. I just have one kind of thing to edit! (the HTML). So if I want to add some padding on the left, I can just add a pl-2 class, and it’s done!

https://wizardzines.com/ has basically no CSS at all except for a single <link href="https://cdn.jsdelivr.net/npm/tailwindcss/dist/tailwind.min.css" rel="stylesheet">.

### why is this different from inline styles?

These CSS frameworks are a little weird because adding the no-underline class is literally the same as writing an inline text-decoration: none. So is this just basically equivalent to using inline CSS styles? It’s not! Here are a few extra features it has:

1. media queries. Being able to specify alternate attributes depending on the screen size (sm:text-orange md:text-white) is awesome, and it’s really quick to do
2. Limits & standards. With normal CSS, I can make any element any width I want. For me, this is not a good thing! With Tailwind, there are only 30ish options for width, and I found that these limits made it way easier for me to make reasonable CSS choices that made my site look the way I wanted. No more width: 300px; /* i hope this looks okay i don't know help */ Here’s the colour palette! It forces you to do everything in em instead of using pixels, which I understand is a Good Idea even though I never actually do it when writing CSS.
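For example, the width scale is a fixed menu of classes along these lines (values approximate, from Tailwind’s default scale):

```css
.w-8    { width: 2rem; }
.w-16   { width: 4rem; }
.w-1\/4 { width: 25%; }
.w-1\/2 { width: 50%; }
.w-full { width: 100%; }
```

If the width you want isn’t on the menu, that’s a nudge to pick the closest one instead of inventing width: 300px.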

### why does it make sense to use CSS this way?

It seems like there are some other trends in web development that make this approach to CSS make more sense than it might have in, say, 2003.

I wonder if the reason this approach makes more sense now is that we’re doing more generation of HTML than we were in 2003. In my tiny example, this approach to CSS actually doesn’t introduce that much duplication into my site, because all of the HTML is generated by Hugo templates, so most styles only end up being specified once anyway. So even though I need to write this absurd text-xl rounded bg-orange pt-1 pb-1 pr-4 pl-4 text-white hover:text-white no-underline leading-loose set of classes to make a button, I only really need to write it once.

I’m not sure!

### other similar CSS frameworks

• tachyons
• bulma
• tailwind
• to some extent the much older bootstrap, though when I’ve used that I ultimately felt like all my sites looked exactly the same (“oh, another bootstrap site”), which made me stop using it.

There are probably lots more. I haven’t tried Tachyons or Bulma at all. They look nice too.

### utility-first, not utility-only

One thing the Tailwind author says that I think is interesting is that the goal of Tailwind is not actually for you to never write CSS (even though obviously you can get away with that for small sites). There’s some more about that in these HN comments.

### should everyone use this? no idea

I have no position on the One True Way to write (or not write) CSS. I’m not a frontend developer and you definitely should not take advice from me. But I found this a lot easier than just about everything I’ve tried previously, so maybe it will help you too.

### When does teaching with comics work well?

I’m speaking at Let’s sketch tech! in San Francisco in December. I’ve been thinking about what to talk about (the mechanics of making zines? how comics skills are different from drawing skills? the business of self-publishing?). So here’s one interesting question: in what situations does using comics to teach help?

### comics are kind of magic

The place I’m starting with is – comics often feel magical to me. I’ll post a comic on, for instance, /proc, and dozens of people will tell me “wow, I didn’t know this existed, this is so useful!“. It seems clear that explaining things with comics often works well for a lot of people. But it’s less clear which situations comics are useful in! So this post is an attempt to explore that.

### what’s up with “learning styles?”

One possible way to answer the question “when does using comics to teach work well?” is “well, some people are visual learners, and for those people comics work well”. This is based on the idea that different people have different “learning styles” and learn more effectively when taught using their preferred learning style.

It’s clear that different people have different learning preferences (for instance I like reading text and dislike watching videos). From my very brief reading of Wikipedia, it seems less clear that folks actually learn more effectively when taught using their preferences. So, whether or not this is true, it’s not how I think about what I’m doing here.

### learning preferences still matter

You could conclude from this that learning preferences don’t matter at all, and you should just teach any given concept in the best way for that concept. But!! I think learning preferences still matter, at least for me. I don’t teach in a classroom, I teach whoever feels like reading what I’m writing on the internet! And if people don’t feel like learning the things I’m teaching because of the way they’re presented, they won’t!

For example – I don’t watch videos to learn (which is not to say that I’m incapable of learning from videos – I just don’t watch them). So if someone is teaching a lot of cool things I want to learn on YouTube, I won’t watch them!

So right now I’m reading statements like “I’m a visual learner” as a preference worth paying attention to :).

### when comics help: diagrams

A lot of the things I work on involve many interacting systems. For example, Kubernetes is a complicated system with many components. It took me months to understand how the components fit together. Eventually I understood that the answer is this diagram:

The point of this diagram is that all Kubernetes’ state lives in etcd, every other Kubernetes component decides what to do by making requests to the API server, and none of the components communicate with each other (or etcd) directly. Those are some of the most important things to know about Kubernetes’ architecture, which is why they’re in the diagram.

Not all diagrams are helpful though!! I’m going to pick on someone else’s kubernetes diagram (source), which is totally accurate but which I personally find less helpful.

I think the way this diagram (and a lot of diagrams!) is drawn is:

• identify the components of the system
• draw boxes for each component and arrows between components that communicate

This approach works well in a lot of contexts, but personally I find it often leaves me feeling confused about how the system works. Diagrams like this often don’t highlight the most important/unusual architectural decisions! The way I like to draw diagrams is, instead:

• figure out what the key architecture decision(s) are that folks need to understand to use it
• draw a diagram that illustrates those architecture decisions (possibly including boxes and arrows)
• leave out parts that aren’t key to understanding the architecture

So, for that kubernetes diagram, I left out pods and the role of the kubelet and where any of these components are running (on a master? on a worker?), because even though those are very important, they weren’t my teaching goals for the diagram.

### when comics help: explaining scenarios

Something I find really effective is to quickly explain a few important things about something that’s really complicated like “how to run kubernetes” or “how distributed systems work”.

Often when trying to explain a huge topic, people start with generalities (“let me explain what a linearizable system is!“). I have another approach that I prefer, which I think of as the “scenes from” approach, or “get specific!”. (which is the same as the best way to give a lightning talk – explain one specific interesting thing instead of trying to give an overview).

The idea is to zoom into a common specific scenario that you’ll run into in real life. For example, a really common situation when using a linearizable distributed system is that it’ll periodically become unavailable due to a leader election. I didn’t know that that was common when I started working with distributed systems!! So just saying “hey, here is a thing that happens in practice” can be useful.

Here are 2 example comics I’ve done in this style:

Comics are a really good fit for illustrating scenarios like this because often there’s some kind of interaction! (“can’t you see we’re having a leader election??”)

### when comics help: writing a short structured list

I’ve gotten really into using comics to explain command line tools recently (eg the bite size command line zine).

One of my favorite comics from that zine is the grep comic. The reason I love this comic is that it literally includes every grep command line argument I’ve ever used, as well as a few I haven’t but that I think seem useful. And I’ve been using grep for 15 years! I think it’s amazing that it’s possible to usefully summarize grep in such a small space.

I think it’s important in this case that the list be structured – all of the things in this list are the same type (“grep command line arguments”). I think comics work well here just because you can make the list colourful / fun / visually appealing.

### when comics help: explaining a simple idea

I spent most of bite size linux explaining various Linux ideas. Here’s a pipes comic that I was pretty happy with! I think this is a little bit like “draw a diagram” – there are a few fundamental concepts about pipes that I think are useful to understand, specifically that pipes have a buffer and that writes to a pipe block if the buffer is full.

I think comics work well for this just because you can mix text and small diagrams really easily, and with something like pipes the tiny diagrams help a lot.

### that’s all for now

I don’t think this is the ‘right’ categorization of “when comics work for teaching” yet. But I think this is a somewhat accurate description of how I’ve been using them so far. If you have other thoughts about when comics work (and when they don’t!) I’d love to hear them.

### New zine: Oh shit, git!

Hello! Last week Katie Sylor-Miller and I released a new zine called “Oh shit, Git!”. It has a bunch of common git mistakes and how to fix them! I learned a surprising number of things by working on it (like what HEAD@{2} means, and that you can do my-branch-name@{2} to see what a branch was previously pointing to, and more ways to use git diff)

You can get it for $10 at Oh shit, git! or a swear-free version at Dangit, git!. Here’s the cover and table of contents (you can click on the table of contents to make it bigger).

### why this zine?

I’ve thought for a couple of years that it might be fun to write a git zine, but I had NO IDEA how to do it. I was in this weird place with git where, even though I know that git is really confusing, I felt like I’d forgotten what it was like to be confused/scared by Git. And I write most things from a place of “I was super confused by this thing just recently, let me explain it!!”.

But then!! I saw that Katie Sylor-Miller had made this delightful website called oh shit, git! explaining how to get out of common git mishaps. I thought this was really brilliant, because a lot of the things on that site (“oh shit, i committed to the wrong branch!“) are things I remember being really scary when I was less comfortable with git! So I thought, maybe this could be useful for folks to have as a paper reference! Maybe we could make a zine out of it! So I emailed her and she agreed to work with me. And now here it is! :D. Very excited to have done a first collaboration.

### what’s new in the oh shit, git! zine?

The zine isn’t the same as the website – we decided we wanted to add some fundamental information about how Git works (what’s a commit?), because to really work with Git effectively you need to understand at least a little bit about how commits and branches work! And some of the explanations are improved. Probably about 50% of the material in the zine is from the website and 50% is new.

### a couple of example pages

Here are a couple of example pages, to give you an idea of what’s in the zine: and a page on git reflog:

### that might be it for zines in 2018!

I’m not sure, but I don’t think I’ll write any more zines for a couple of months. So far there have been 5 (!!!) this year – perf, bite size linux, bite size command line, help! I have a manager!, and this one!
I’m really happy with that number and very grateful to everyone who’s supported them. Ideas I have for zines right now include:

• kubernetes
• how to do statistics using programming
• ‘bite size networking’, on the 10 billion different command line tools used for different networking things
• ‘bite size linux v2’, about more core linux concepts that i didn’t get to in ‘bite size linux’

There’s a definite tradeoff between writing zines and blogging, and writing blog posts is really fun. Maybe I’ll try going back in that direction for a little while.

### Some Envoy basics

Envoy is a newish network proxy/webserver in the same universe as HAProxy and nginx. When I first learned about it around last fall, I was pretty confused by it. There are a few kinds of questions one might have about any piece of software:

• how do you use it?
• why is it useful?
• how does it work internally?

I’m going to spend most of my time in this post on “how do you use it?”, because I found a lot of the basics about how to configure Envoy very confusing when I started. I’ll explain some of the Envoy jargon that I was initially confused by (what’s an SDS? XDS? CDS? EDS? ADS? filter? cluster? listener? help!). There will also be a little bit of “why is it useful?” and nothing at all about the internals.

### What’s Envoy?

Envoy is a network proxy. You compile it, you put it on the server where you want to run it, you tell it which configuration file to use, and away you go!

Here’s probably the simplest possible example of using Envoy. The configuration file is a gist. This example starts a webserver on port 7777 that proxies to another HTTP server on port 8000. If you have Docker, you can try it now – just download the configuration, start the Envoy docker image, and away you go!
python -mSimpleHTTPServer &  # Start a HTTP server on port 8000
wget https://gist.githubusercontent.com/jvns/340e4d20c83b16576c02efc08487ed54/raw/1ddc3038ed11c31ddc70be038fd23dddfa13f5d3/envoy_config.json
docker run --rm --net host -v=$PWD:/config envoyproxy/envoy /usr/local/bin/envoy -c /config/envoy_config.json


This will start an Envoy HTTP server, and then you can make a request to Envoy! Just curl localhost:7777 and it’ll proxy the request to localhost:8000.

### Envoy basic concepts: clusters, listeners, routes, and filters

This tiny envoy_config.json we just ran contains all the basic Envoy concepts!

First, there’s a listener. This tells Envoy to bind to a port, in this case 7777:

"listeners": [{


Next up, the listener has filters. Filters tell the listener what to do with the requests it receives, and you give Envoy an array of filters. If you’re doing something complicated, typically you’ll apply several filters to every request coming in.

There are a few different kinds of filters (see list of TCP filters), but the most important filter is probably the envoy.http_connection_manager filter, which is used for proxying HTTP requests. The HTTP connection manager has a further list of HTTP filters that it applies (see list of HTTP filters). The most important of those is the envoy.router filter which routes requests to the right backend.

In our example, here’s how we’ve configured our filters. There’s one TCP filter (envoy.http_connection_manager) which uses 1 HTTP filter (envoy.router)

"filters": [
  {
    "name": "envoy.http_connection_manager",
    "config": {
      "stat_prefix": "ingress_http",
      "http_filters": [{ "name": "envoy.router", "config": {} }],
      ....


Next, let’s talk about routes. You’ll notice that so far we haven’t explained to the envoy.router filter what to do with the requests it receives. Where should it proxy them? What paths should it match? In our case, the answer to that question is going to be “proxy all requests to localhost:8000”.

The envoy.router filter is configured with an array of routes. Here’s how they’re configured in our test configuration. In our case there’s just one route.

"route_config": {
  "virtual_hosts": [
    {
      "name": "blah",
      "domains": "*",
      "routes": [
        {
          "match": { "prefix": "/" },
          "route": { "cluster": "banana" }


This gives a list of domains to match (these are matched against the requests Host header). If we changed "domains": "*" to "domains": "my.cool.service", then we’d need to pass the header Host: my.cool.service to get a response.

If you’re paying attention to the ongoing saga of this configuration, you’ll notice that the port 8000 hasn’t been mentioned anywhere. There’s just "cluster": "banana". What’s a cluster?

Well, a cluster is a collection of addresses (IP address / port pairs) that are the backend for a service. For example, if you have 8 machines running an HTTP service, then you might have 8 hosts in your cluster. Every service needs its own cluster. This example cluster is really simple: it’s just a single IP/port, running on localhost.

"clusters": [
  {
    "name": "banana",
    "type": "STRICT_DNS",
    "connect_timeout": "1s",
    "hosts": [
    ]
  }
]
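If the service had several backends, the hosts array would just list more of them – something like this (hypothetical IPs, same v1-style syntax as the example above):

```json
{
  "name": "banana",
  "type": "STRICT_DNS",
  "connect_timeout": "1s",
  "hosts": [
    { "url": "tcp://10.0.0.1:8000" },
    { "url": "tcp://10.0.0.2:8000" },
    { "url": "tcp://10.0.0.3:8000" }
  ]
}
```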


### tips for writing Envoy configuration by hand

I find writing Envoy configurations from scratch pretty time consuming – there are some examples in the Envoy repository (https://github.com/envoyproxy/envoy), but even after using Envoy for a year this basic configuration actually took me 45 minutes to get right. Here are a few tips:

• Envoy has 2 different APIs: the v1 and the v2 API. Many newer features are only available in the v2 API, and I find its documentation a little easier to navigate because it’s automatically generated from protocol buffers. (eg the Cluster docs are generated from cds.proto)
• A few good starting points in the Envoy API docs: Listener, Cluster, Filter, Virtual Host. To get all the information you need you need to click a lot (for example to see how to configure the cluster for a route you need to start at “Virtual Host” and click route_config -> virtual_hosts -> routes -> route -> cluster), but it works.
• The architecture overview docs are useful and give an overall explanation of how some Envoy things are configured.
• You can use either json or yaml to configure Envoy. Above I’ve used JSON.

### You can configure Envoy with a server

Even though we started with a configuration file on disk, one thing that makes Envoy really different from HAProxy or nginx is that Envoy often isn’t configured with a configuration file. Instead, you can configure Envoy with one or several configuration servers which dynamically change your configuration.

To get an idea of why this might be useful: imagine that you’re using Envoy to load balance requests to 50ish backend servers, which are EC2 instances that you periodically rotate out. So http://your-website.com requests go to Envoy, and get routed to an Envoy cluster, which needs to be a list of the 50 IP addresses and ports of those servers.

But what if those servers change over time? Maybe you’re launching new ones or they’re getting terminated. You could handle this by periodically changing the Envoy configuration file and restarting Envoy. Or!! You could set up a “cluster discovery service” (or “CDS”), which for example could query the AWS API and return all the IPs of your backend servers to Envoy.

I’m not going to get into the details of how to configure a discovery service, but basically it looks like this (from this template). You tell it how often to refresh and what the address of the server is.

dynamic_resources:
  cds_config:
    api_config_source:
      cluster_names:
        - cds_cluster
      refresh_delay: 30s
...
- name: cds_cluster
  connect_timeout: 0.25s
  type: STRICT_DNS
  lb_policy: ROUND_ROBIN
  hosts:
    - socket_address:
        protocol: TCP
        port_value: 80


### 4 kinds of Envoy discovery services

There are 4 kinds of resources you can set up discovery services for in Envoy – routes (“which cluster should requests with this HTTP header go to?”), clusters (“what backends does this service have?”), listeners (the filters for a port), and endpoints (the individual hosts in a cluster). These are called RDS, CDS, LDS, and EDS respectively. XDS is the overall protocol.

The easiest way to write a discovery service from scratch is probably in Go using the go-control-plane library.

### some Envoy discovery services

It’s definitely possible to write Envoy configuration services from scratch, but there are some other open source projects that implement Envoy discovery services. Here are the ones I know about, though I’m sure there are more:

• There’s an open source Envoy discovery service called rotor which looks interesting. The company that built it just shut down a couple weeks ago.
• Istio (as far as I understand it) is basically an Envoy discovery service that uses information from the Kubernetes API (eg the services in your cluster) to configure Envoy clusters/routes. It has its own configuration language.
• consul might be adding support for Envoy (see this blog post), though I don’t fully understand the status there

### what’s a service mesh?

Another term that I hear a lot is “service mesh”. Basically a “service mesh” is where you install Envoy on the same machine as every one of your applications, and proxy all your network requests through Envoy.

This lets you more easily control how a bunch of different applications (maybe written in different programming languages) communicate with each other.

### why is Envoy interesting?

I think these discovery services are really the exciting thing about Envoy. If all of your network traffic is proxied through Envoy and you control all Envoy configuration from a central server, then you can potentially:

• use circuit breaking
• route requests only to nearby instances
• encrypt network traffic end-to-end
• run controlled code rollouts (want to send only 20% of traffic to the new server you spun up? okay!)

all without having to change any application code anywhere. Basically it’s a very powerful/flexible decentralized load balancer.

Obviously setting up a bunch of discovery services and operating them and using them to configure your internal network infrastructure in complicated ways is a lot more work than just “write an nginx configuration file and leave it alone”, and it’s probably more complexity than is appropriate for most people. I’m not going to venture into telling you who should or should not use Envoy, but my experience has been that, like Kubernetes, it’s both very powerful and very complicated.

One of the things I really like about Envoy is that you can pass it a HTTP header to tell it how to retry/timeout your requests!! This is amazing, because implementing timeout / retry logic correctly is different in every programming language, and people get it wrong ALL THE TIME. So being able to just pass a header is great.

The timeout & retry headers are documented here, and here are my favourites:

• x-envoy-max-retries: how many times to retry
• x-envoy-retry-on: which failures to retry (eg 5xx or connect-failure)
• x-envoy-upstream-rq-timeout-ms: total timeout
• x-envoy-upstream-rq-per-try-timeout-ms: timeout per retry
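So retry policy becomes just a set of headers your client attaches to the request. A small sketch of what that might look like (the header names are from the Envoy docs; the function name and default values here are just made-up examples):

```python
# Sketch: building Envoy retry/timeout headers for an outgoing request.
# Header names come from Envoy's router docs; values are illustrative.
def envoy_retry_headers(max_retries=3, retry_on="5xx,connect-failure",
                        timeout_ms=2000, per_try_timeout_ms=500):
    return {
        "x-envoy-max-retries": str(max_retries),
        "x-envoy-retry-on": retry_on,
        "x-envoy-upstream-rq-timeout-ms": str(timeout_ms),
        "x-envoy-upstream-rq-per-try-timeout-ms": str(per_try_timeout_ms),
    }

headers = envoy_retry_headers()
# then pass `headers` to whatever HTTP client you're using
```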

### that’s all for now

I have a lot of thoughts about Envoy (too many to write in one blog post!), so maybe I’ll say more later!

### What's a senior engineer's job?

There’s this great post by John Allspaw called “On being a senior engineer”. I originally read it 4ish years ago when I started my current job and it really influenced how I thought about the direction I wanted to go in.

Rereading it 4 years later, one thing that’s really interesting to me about that blog post is that it’s explaining that empathy / helping your team succeed is an important part of being a senior engineer. Which of course is true!

But from where I stand today, most (all?) of the senior engineers I know take on a significant amount of helping-other-people work in addition to their individual programming work. The challenge I see me/my coworkers struggling with today isn’t so much “what?? I have to TALK TO PEOPLE?? UNBELIEVABLE.” and more “wait, how do I balance all of this leadership work with my individual contributions / programming work in a way that’s sustainable for me? How much of what kind of work should I be doing?“. So instead of talking about the attributes that a senior engineer has from Allspaw’s post (which I totally agree with), I want to talk here about the work that a senior engineer does.

### what this post is describing

“what a senior engineer does” is a huge topic and this is a small post. things to keep in mind when reading:

• this is just one possible description of what a “senior engineer” could do. There are a lot of ways to work and this isn’t intended to be definitive.
• I have basically only worked at one company and this is just about my experiences so my perspective is obviously pretty limited
• There are obviously a lot of levels of “senior engineer” out there. This is aimed somewhere around P3/P4 in the Mozilla ladder (senior engineer / staff engineer), maybe a bit more on the “staff” side.

### What’s part of the job

These are things that I view as being mostly a senior engineer’s job and less a manager’s job. (though managers definitely do some of this too, especially creating new projects / relating projects to business priorities)

The thing that holds all this together is that almost all of this work is fundamentally technical: helping someone get unstuck on a tricky project is obviously a human interaction, but the issues we’ll be working on together will generally be computer issues! (“maybe if we simplify this design we can be done with this way sooner!“)

• Write code. (obviously)
• Do code reviews. (obviously)
• Write and review design docs. As with other review tasks, I think of “review design docs” as “get a second set of eyes on it, which will probably help improve the design”.
• Help team members when they’re stuck. Sometimes folks get stuck on a project, and it’s important to work to support them! I think of this less as “parachute from the sky and deliver your magical knowledge to people” and more as “work together to understand the problem they’re trying to solve and see if 2 brains are better than 1” :). This also means working with someone to solve the problem instead of solving the problem for them.
• Hold folks to a high quality standard. “Quality” will mean different things for different folks (for my team it means reliability/security/usability). Usually when someone makes a decision that seems off to me, it’s either because I know something that they don’t or they know something I don’t! So instead of telling someone “hey you did this wrong you should do X instead”, I try to just give them some extra information that they didn’t have and often that sorts it out. And pretty often it turns out that I was missing something and actually their decision was totally reasonable! In the past I’ve very occasionally seen senior engineers try to enforce quality standards by repeating their opinions more and more loudly because they think their opinions are Right and I haven’t personally found that helpful.
• Create new projects. A software engineering team isn’t a zero-sum place! The best engineers I know don’t hoard the most interesting work for themselves, they create new interesting/important work and create space for folks to do that work. For example, someone on my team spearheaded a rewrite of our deployment system which was super successful and now there’s a whole team working on new features that are way easier to build post-rewrite!
• Plan your projects’ work. This is about writing down / communicating the roadmap for projects you’re working on and making sure that folks understand the plan.
• Proactively communicate project risks. It’s really important to recognize when something you’re working on isn’t going well, communicate it to other engineers/managers, and figure out what to do.
• Communicate successes!
• Do side projects that benefit the team/company. I see a lot of senior engineers occasionally doing small high leverage projects (like building dev tooling / helping set policies) that end up helping a LOT of people get their work done a lot better.
• Be aware of how projects relate to business priorities.
• Decide when to stop doing a project. Figuring out when to stop / not start work on something is surprisingly hard :)

I put “write code” first because I find it surprisingly easy to accidentally let that take a back seat :)

One thing I left out is “make estimates”. Making estimates is something I’m still not very good at and that I don’t think I see very much of (?), but I think it could be worth spending more time on some day.

This list feels like a lot and like if you tried to do all those things all the time it would consume all available brain space. I think in general it probably makes sense to carve out a subset and decide “right now I’m going to focus on X Y Z, I think my brain will explode if I try to do A B C as well”.

### What’s not part of the job

This section is a bit tricky. I’m not saying that these aren’t a senior engineer’s job in the sense of “I won’t help create a good work environment on my team, how dare you suggest that’s part of my job!!“. Most senior engineers I know have spent a huge amount of time thinking about these issues and work on them quite a bit.

The reason I think it’s useful to create a boundary here is that everyone I work with has a really strong sense of ownership/responsibility to the team / company (“does it need to be done? well, sure, I can do that!!“) and I think it’s easy for that willingness to do whatever needs to happen to turn into folks getting overwhelmed/overworked/unable to make the kinds of technical contributions that are actually their core job. So if you can create some boundaries around your role it’s easier to decide what sorts of work to ask for help with when things are hectic. The actual boundary you draw of course depends on you / your team :)

Most of these are a manager’s job. Caveats: managers do a lot more than the things listed here (for instance “create new projects”), and at some companies some of these things might actually be the job of a senior engineer (eg sprint management).

• Make sure every team member’s work is recognized
• Make sure work is allocated in a fair way
• Make sure folks are working well together
• Build team cohesion
• Have 1:1s with everyone on the team
• Train new managers / help them understand what’s expected of them (though I think senior ICs often actually do end up picking some of this up?)
• Do project management for projects you’re not working on (where I work, that’s the job of whatever engineer is leading that project)
• Be a product manager
• Do sprint management / organize everyone’s work into milestones / run weekly team meetings

### Explicitly setting boundaries is useful

I ran into an interesting situation recently where I was talking to a manager about which things were and weren’t part of my job as an engineer, and we realized that we had very different expectations! We talked about it and I think it’s sorted out now, but it made me realize that it’s very important to agree about what the expectations are :)

When I started out as an engineer, my job was pretty straightforward – I wrote code, tried to come up with projects that made sense, and that was fine. My manager always had a clear sense of what my job was and it wasn’t too complicated. Now that’s less true! So now I view it as being more my responsibility to define a job that:

• I can do / is sustainable for me
• I want to do / that’s overall enjoyable & in line with my personal goals
• is valuable to the team/organization

And the exact shape of that job will be different for different people (not everyone has the same interests & strengths, for example I am actually not amazing at code review yet!), which I think makes it even more important to negotiate it / do expectation setting.

### Don’t agree to a job you can’t do / don’t want

I think pushing back if I’m asked to do work that I can’t do or that I think will make me unhappy long term is important! I find it kind of tempting to agree to take on a lot of work that I know I don’t really enjoy (“oh, it’s good for the team!”, “well someone needs to do it!“). But, while I obviously sometimes take on tasks just because they need to be done, I think it’s actually really important for team health for folks to be overall doing jobs that are sustainable for them and that they overall enjoy.

So I’ll take on small tasks that just need to get done, but I think it’s important for me not to say “oh sure, I’ll spend a large fraction of my time doing this thing that I’m bad at and that I dislike, no problem” :). And if “someone” needs to do it, maybe that just means we need to hire/train someone new to fill the gap :)

### I still have a lot to learn!

While I feel like I’m starting to understand what this “senior engineer” thing is all about (7 years into my career so far), I still feel like I have a LOT to learn about it and I’d be interested to hear how other people define the boundaries of their job!

### Some possible career goals

I was thinking about career goals a person could have (as a software developer) this morning, and it occurred to me that there are a lot of possible goals! So I asked folks on Twitter what some possible goals were and got a lot of answers.

This list intentionally has big goals and small goals, and goals in very different directions. It definitely does not attempt to tell you what sorts of goals you should have. I’m not sure yet whether it’s helpful or not but here it is just in case :)

I’ve separated them into some very rough categories. Also I feel like there’s a lot missing from this list still, and I’d be happy to hear what’s missing on twitter.

### technical goals

• become an expert in a domain/technology/language (databases, machine learning, Python)
• get to a point where you can drop into new situations or technologies and quickly start making a big impact
• do research-y work / something that’s never been done before
• get comfortable with really big codebases
• work on a system that has X scale/complexity (millions of requests per second, etc)
• scale a project way past its original design goals
• do work that saves the company a large amount of money
• be an incident commander for an incident and run the postmortem
• make a contribution to an open source project
• get better at some skill (testing / debugging / a programming language / machine learning)
• become a core maintainer for an important OSS project
• build an important system from scratch
• be involved with a product/project from start to end (over several years)
• understand how complex systems fail (and how to make them not fail)
• be able to build prototypes quickly for new ideas

### job goals

• pass a programming interview
• get your “dream job” (if you have one)
• work at a prestigious company
• work at a very small company
• work at a company for a really long time (to see how things play out over time)
• work at lots of different companies (to get lots of different perspectives)
• get a raise
• become a manager
• get to a specific title (“architect”, “senior engineer”, “CTO”, “developer evangelist”, “principal engineer”)
• work at a nonprofit / company where you believe in the mission
• work on a product that your family / friends would recognize
• work in many different fields
• work in a specific field you care about (transit, security, government)
• get paid to work on a specific project (eg the linux kernel)
• as an academic, have stable funding to work towards your research interests
• become a baker / work on something else entirely :)

### entrepreneurship goals

This category is obviously pretty big (there are lots of start-your-own-business related goals!) and I’m not going to try to be exhaustive.

• start freelancing
• start a consulting company
• make your first sale of software you wrote
• get VC funding / start a startup
• get to X milestone with a company you started

### product goals

I think the difference between “technical goals” and “product goals” is pretty interesting – this area is more about the impact that your programs have on the people who use them than what those programs consist of technically.

• do your work in a specific way that you care about (eg make websites that are accessible)
• build tools for people who you work with directly (this can be so fun!!)
• make a big difference to a system you care about (eg “internet security”)
• do work that helps solve an important problem (climate change, etc)
• work in a team/project whose product affects more than a million people
• work on a product that people love
• build developer tools

### mentorship goals

• help new people on your team get started
• help someone get a job/opportunity that they wouldn’t have had otherwise
• mentor someone and see them get better over time
• “be a blessing to others you wished someone else was to you”
• be a union organizer / promote fairness at work
• build a more inclusive team
• build a community that matters to people (via a meetup group or otherwise)

### communication / community goals

• write a technical book
• give a talk (meetup, conference talk, keynote)
• give a talk at a really prestigious conference / in front of people you respect
• give a workshop on something you know really well
• start a conference
• write a popular blog / an article that gets upvoted a lot
• teach a class (eg at a high school / college)
• change the way folks in the industry think about something (eg blameless postmortems, fairness in machine learning)

### work environment goals

A lot of people talked about the flexibility to choose their own work environment / hours (eg “work remotely”).

• get flexible hours
• work remotely
• work in a place where you feel accepted/included
• work with people who share your values (this involves knowing what your values are! :) )
• work with people who are very experienced / skilled
• have good health insurance / benefits
• make X amount of money

### other goals

• remain as curious and in love with programming as the first time I did it

### nobody can tell you what your goals are

This post came out of reading this blog post about how your company’s career ladder is probably not the same as your goals and chasing the next promotion may not be the best way to achieve them.

I’ve been lucky enough to have a lot of my basic goals met (“make money”, “learn a lot of things at work”, “work with kind and very competent people”), and after that I’ve found it hard to figure out which of all of these milestones here will actually feel meaningful to me! Sometimes I will achieve a new goal and find that it doesn’t feel very satisfying to have done it. And other times I will do something that I didn’t think was a huge deal to me, but feel really proud of it afterwards.

So it feels pretty useful to me to write down these things and think “do I really want to work at FANCY_COMPANY? would that feel good? do I care about working at a nonprofit? do I want to learn how to build software products that lots of people use? do I want to work on an application that serves a million requests per second? When I accomplished that goal in the past, did it actually feel meaningful, or did I not really care?”

Hello! As you may have noticed, I’ve been writing a few new zines (they’re all at https://jvns.ca/zines ), and while my zines used to be free (or pay-for-early-access-then-free after), the new ones are not free! They cost $10! In this post, I want to talk a little about why I made the switch and how it’s been going so far.

### selling your work is okay

I wanted to start out by saying something sort of obvious – if you decide to sell your work instead of giving it away for free, you don’t need to justify that (why would you?). Since I’ve started selling my zines, exactly 0 people have told me “julia, how dare you sell your work”, and a lot of people have said “your work is amazing and I’m happy to pay for it! This is great!”

But I still want to talk about this because it’s been a pretty confusing tradeoff for me to think through (what are my goals? does giving things away for free or selling them accomplish my goals better?)

### what are my goals?

I don’t have a super clear set of goals with my blog / zines, but here are a few:

• expose people to new important ideas that they might never have heard of otherwise. I think in systems a lot of knowledge can be hard to get if you don’t know the right people, I think that’s very silly, and I’d like to make a small dent in that.
• explain complicated ideas in the simplest possible way (but not simpler!!!). A lot of things that seem complicated at first actually aren’t really, and I want to show people that.

### free work is easier to distribute

The most obvious advantage is that if something is free, it’s way easier for more people to access it and learn from it. For me, this is the biggest thing – I care about the impact of my writing (writing just for myself is useful, but ideally I’d like for it to help lots of people!)

A really good example of this is this article Open Access for Impact: How Michael Nielsen Reached 3.5M Readers about Michael Nielsen’s book Neural Networks and Deep Learning.
3.5M readers is probably an overestimate, but he says: total time spent by readers is about 250,000 hours, or roughly 125 full time working years. That’s a lot!

This was the biggest reason I held off selling zines for a long time – I worried that if I sold my zines, not that many people would buy them relative to how many folks would download the free versions.

### selling zines makes it easier to spend money (and time) on it

A huge advantage of selling zines, though, is that it makes it way easier to invest in making something that’s high-quality. I’ve spent probably $5000 on tablets / printing / software / illustrators to make zines. Since I’ve made substantially more than $5000 at this point (!!!), investing in things like that is now a really easy decision! I can hire super talented illustrators and pay them a fair amount and not worry about it! I decided earlier this year to buy an iPad (which has made drawing zines SO MUCH EASIER for me, the apple pencil is amaaazing), and instead of thinking “oh no, this is kind of expensive, should I really spend money on it?” I could just reason “this is a tool that will more than pay for itself! I should just buy it!“.

Also, the fact that I’m making money from it makes it way easier to spend time on the project – any given zine takes me weeks of evenings/weekends to make, and carving that time out of my schedule isn’t always easy! If I’m getting paid for it it makes it way easier to stay motivated to make something awesome instead of producing something kinda half-baked.

### people take things they pay for more seriously

Another reason I’m excited about selling zines is that I feel like, since I’ve started doing it and investing a little more into the quality, people have taken the project a little more seriously!

• “bite size linux” is a required text in a university course! This is extremely delightful.
• a bunch of folks who work at various companies have bought zines to give to their coworkers/employees!
I think “this costs money” is a nice way to signal “I actually spent time on this, this is good”.

### people are actually willing to buy zines

At the beginning I said that I was worried that if I sold zines, nobody would buy them, and so nobody would learn from them, and that would be awful. Was that worry justified? Well, I actually have a little bit of data about this!! The only thing I use statistics for on this website is how many people download my zines (I run heap on https://jvns.ca/zines). Here are some stats:

• my most-downloaded zine is “so you want to be a wizard” with 5,000 clicks
• my most-bought zine is “bite size linux” with 3,000 sales (!!!)

3,000 sales is incredible (thank you everyone!!!!) and I’ve been totally blown away by how many people have bought these zines. It actually feels like selling zines results in more people reading them – to me, 3,000 sales is WAY BETTER than 5,000 clicks, because I think that someone who bought a zine is probably like 4x more likely to read it than someone who just clicked on a PDF. (4x being a Totally Unscientific Arbitrary Number.)

### how do you decide on pricing?

PRICING. EEP. GUYS. I find thinking about pricing SO CONFUSING. There’s this “charge more” narrative I see a lot on the internet which basically goes:

• tie whatever you’re selling to someone else’s business outcomes
• charge them relative to how much money the product can help them make, not relative to how hard it was to build

I think this is a reasonable model and it’s how things like this guide to rails performance are priced. This is not really how I’ve been thinking about it, though – my approach right now is just to charge what I think is a reasonable/fair price, which is $10/zine.

I had a super interesting conversation with Stephanie Hurlburt, though, where she argued that I should be charging more for different reasons! Her argument was:

• We want to build a world where artist/educators can get paid fairly for their work
• $10/zine is not actually a lot of money, it’s only sustainable for julia because julia has a big audience
• if I could figure out how to charge more, I could share that with other people and make a world where smaller creators could be more successful

I find that argument pretty compelling (I would like more people to be able to make money from selling zines!). But I don’t have any plans to charge more for individual zines than $10/zine because $10 just seems like a reasonable price to me and I know that it’s already too much for some folks, especially people in countries where their currency is a lot weaker than the US dollar.

### experimenting with corporate pricing

While I’m pretty reluctant to do experiments with the $10/zine price for individual people, experimenting with corporate pricing is a lot easier! Folks generally aren’t spending their own money, so if I raise the prices for a company to buy a zine, maybe they won’t buy it if they decide it’s too much, but it’s a lot less personal and doesn’t affect someone’s ability to read the zines in the same way.

Right now, companies buy zines from me for 2 reasons:

1. to give them to their employees to teach folks useful things (I charge somewhere between $100 and $600 for a site license right now)
2. to distribute them at conferences/other events (eg microsoft gave out zines/posters by me at a couple of conferences this year). I’ve only just started doing this but it seems like a super fun way to get more zines into the world!

I have been doing some corporate pricing experiments – for Help! I have a manager! I raised the minimum price to $150 because I think it’s pretty valuable to help folks work better with their managers. We’ll see what happens!

### why not patreon?

As a sidebar – a lot of folks have suggested that I use Patreon. Right now I definitely do not want to use Patreon/other donation-based models for various reasons (though I support creators on Patreon and I think it’s great!). I don’t want to get into it in this post but maybe I’ll talk about this another time! Basically to me the model of “pay $10 for a zine” is super simple, I like it, and I have no desire to switch to Patreon :)

What I’m doing right now is – I’ll post drafts of almost everything I write in my zines on Twitter. This works really well for a lot of reasons:

• I get really early feedback on whether something is working or not – folks will suggest a lot of great improvements in the Twitter replies!
• I get to see what’s resonating with folks – for example, this comic about 1:1s got 2.5K retweets, which is a lot! Knowing that folks found that page really useful helped me decide where to put it in the zine (near the beginning!)
• people who maybe can’t afford $10 for the zine can follow along on Twitter and get all the information anyway
• obviously it’s great advertising – if people like the comics I tweet, they might decide to buy the zine later! :) And if they want to just enjoy the tweets that’s awesome too ❤

As an example, most of the pages from Help! I have a manager! are in this twitter moment.

### a few things that haven’t gone well

Not everything has been 100% amazing with selling zines on the internet! A couple of things that haven’t gone well:

• some people don’t have credit cards / PayPal and so can’t get the zine! I would really really like a good solution to this.
• Gumroad doesn’t have great email deliverability – sometimes when someone buys a zine it’ll end up in their spam. This is pretty easy to resolve (people email me to say that they didn’t get it, and it’s always easy to fix right away), but I wish they were better at this. Otherwise Gumroad is a good platform.
• On my first zine, I didn’t put my email address on the Gumroad page, so some people didn’t know how to get in touch with me when there was a problem and one person opened a dispute. Now I put my email address on Gumroad, which I think has fixed that!
• I sent an update email on Gumroad to past zine buyers saying that I had a new zine out and one person replied to say that they didn’t like being emailed. I think there’s a little room to improve here – the fact that Gumroad autoenrolls everyone who buys a zine into an “updates” email list is IMO a bit weird and it feels like it would be better if it was opt-in.
• Someone posted my blog post announcing a new zine to lobste.rs and folks commented that they didn’t think it was appropriate to post non-free things on lobste.rs. I agree with that, but this seems hard to prevent since I can’t control what people post on tech news sites :). I think this isn’t a big deal but it didn’t feel great.
I’m sure I’ll make some more mistakes in the future and hopefully I’ll learn from them :). I wanted to post these because I worry a lot about making mistakes when selling things to folks, but once I write down the issues so far they all feel very resolvable. Mostly I just try to reply to email fast when folks have problems, which isn’t that often.

### let’s see how the experiment goes!

So far selling zines feels like:

• I end up with a comparable amount of readers (I think there’s not a huge difference?)
• I can make something that’s higher quality (and pay more artists to help me!). It’s way easier to justify spending time on it.
• People take the work more seriously
• Folks have been really positive and supportive about it
• It’s maybe helping a tiny bit to build a world where more folks can get paid to write really awesome educational materials

I’m excited to try out some new things in the future (hopefully printing???). I’ll try to keep writing about what I learn as I go, because how to do this really hasn’t been obvious to me. I’d love to hear what you think!

### New zine: Help! I have a manager!

I just released a new zine! It’s called “Help! I have a manager!”

This zine is everything I wish somebody had told me when I started out in my career and had no idea how I was supposed to work with my manager. Basically I’ve learned along the way that even when I have a great manager, there are still a lot of things I can do to make sure that we work well together, mostly around communicating clearly! So this zine is about how to do that.

You can get it for $10 at https://gum.co/manager-zine. Here’s the cover and table of contents:

The cover art is by Deise Lino. Tons of people helped me write this zine – thanks to Allison, Brett, Jay, Kamal, Maggie, Marc, Marco, Maya, Will, and many others.

### a couple of my favorite pages from the zine

I’ve been posting pages from the zine on twitter as I’ve been working on it. Here are a couple that I think are especially useful – some tips for what even to talk about in 1:1s, and how to do better at asking for feedback.

### Build impossible programs

Hello! My talk from Deconstruct this year (“Build impossible programs”) is up. It’s about my experience building a Ruby profiler. This is the second talk I’ve given about building a profiler – the first one (Building a Ruby profiler) was more of a tech deep dive. This one is a squishier talk about myths I believed about doing ambitious work and how a lot of those myths turn out not to be true.

There’s a transcript on Deconstruct’s site. They’re also gradually putting up the other talks from Deconstruct 2018, which were generally excellent.

### slides

As usual these days I drew the slides by hand. It’s way easier/faster, and it’s more fun.

### zine side note

One extremely awesome thing that happened at Deconstruct was that Gary agreed to print 2300 zines to give away to folks at the conference. They all got taken home which was really nice to see :)

### An awesome new Python profiler: py-spy!

The other day I learned that Ben Frederickson has written an awesome new Python profiler called py-spy!

It takes a similar approach to profiling as rbspy, the profiler I worked on earlier this year – it can profile any running Python program, it uses process_vm_readv to read memory, and it by default displays profiling information in a really easy-to-use way.
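To give a rough idea of the core trick (this is just an illustrative sketch, not py-spy's actual code – it assumes Linux with glibc, and for simplicity it "spies" on its own process instead of a separate Python interpreter):

```python
import ctypes
import os

# process_vm_readv lives in libc on Linux; assuming glibc here.
libc = ctypes.CDLL("libc.so.6", use_errno=True)
libc.process_vm_readv.restype = ctypes.c_ssize_t


class iovec(ctypes.Structure):
    _fields_ = [("iov_base", ctypes.c_void_p), ("iov_len", ctypes.c_size_t)]


def read_process_memory(pid, address, size):
    """Read `size` bytes at `address` in process `pid`'s memory."""
    buf = ctypes.create_string_buffer(size)
    local = iovec(ctypes.cast(buf, ctypes.c_void_p), size)
    remote = iovec(ctypes.c_void_p(address), size)
    nread = libc.process_vm_readv(
        pid, ctypes.byref(local), 1, ctypes.byref(remote), 1, 0
    )
    if nread < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return buf.raw[:nread]


# A real profiler would find the addresses of interpreter structs by
# parsing /proc/<pid>/maps; here we just read a buffer we own.
secret = ctypes.create_string_buffer(b"interpreter state")
data = read_process_memory(os.getpid(), ctypes.addressof(secret), 17)
```

The neat thing about this approach is that the target process doesn't have to cooperate at all – the profiler just reads the interpreter's structs out of its memory from the outside.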

Obviously, I think this is SO COOL. Here’s what it looks like profiling a Python program: (gif taken from the github README)

It has this great top-like output by default. The default UI is somewhat similar to rbspy’s, but feels better executed to me :)

### you can install it with pip!

Another thing he’s done that’s really nice is make it installable with pip – you can run pip install py-spy and have it download a binary immediately! This is cool because, even though py-spy is a Rust program, obviously Python programmers are used to installing software with pip and not cargo.

In the README he describes what he had to do to distribute a Rust executable with pip without requiring that users have a Rust compiler installed.

### py-spy is probably more stable than rbspy!

Another nice thing about py-spy is that I believe it only uses Python’s public bindings (eg Python.h). What I mean by “public bindings” is the header files you’d find in libpython-dev.

rbspy by contrast uses a bunch of header files from inside the Ruby interpreter. This is because Python for whatever reason includes a lot more struct definitions in its header files.

As a result, if you compare py-spy’s python bindings to rbspy’s ruby bindings, you’ll notice that

• there are way fewer Python binding files (6 vs 42 for Ruby)
• each file is much smaller (~30kb vs 200kb for Ruby)

Basically what I think this means is that py-spy is likely to be easier to maintain long-term than rbspy – since rbspy depends on unstable internal Ruby interfaces, even though it works relatively well today, future versions of Ruby could break it at any time.

### the start of an ecosystem of profilers in Rust?? :)

One thing that I think is super nice is that rbspy & py-spy share some code! There’s this proc-maps crate that Ben extracted from rbspy and improved substantially. I think this is awesome because if someone wants to make a py-spy/rbspy-like profiler in Rust for another language like Perl or Javascript or something, it’s even easier!

It turns out that phpspy is a sampling profiler for PHP, too!

I have this secret dream that we could eventually have a suite of open source profilers for lots of different programming languages that all have similar user interfaces. Today every single profiling tool is different and it’s a pain.

### also rbspy has windows support now!

Ben also contributed Windows support to rbspy, which was amazing, and py-spy has Windows support from the start.

So if you want to profile Ruby or Python programs on Windows, you can!

Page created: Fri, Mar 08, 2019 - 09:00 PM GMT