
# Julia Evans


### Networking tool comics!

Hello! I haven’t been blogging too much recently because I’m working on a new zine project: Linux networking tools!

I’m pretty excited about this one – I LOVE computer networking (it’s what I spent a big chunk of the last few years at work doing), but getting started with all the tools was originally a little tricky! For example – what if you have the IP address of a server and you want to make an HTTPS connection to it and check that it has a valid certificate? But you haven’t changed DNS to resolve to that server yet (because you don’t know if it works!), so you need to use the IP address. If you do curl https://1.2.3.4/, curl will tell you that the certificate isn’t valid (because it’s not valid for 1.2.3.4). So you need to know to do curl https://jvns.ca --resolve jvns.ca:443:104.198.14.52.
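The same trick – connect to a specific IP but validate the certificate against the real hostname – can also be sketched in a few lines of Python with the standard library’s ssl module. This is just my own illustration of the idea, not something from the zine; the function name is made up:

```python
import socket
import ssl

def check_cert(ip, hostname, port=443):
    """Connect to `ip`, but do the TLS handshake as if we'd asked for
    `hostname`: SNI and certificate validation both use `hostname`,
    like `curl https://hostname --resolve hostname:443:ip`."""
    ctx = ssl.create_default_context()  # verifies certificates by default
    with socket.create_connection((ip, port), timeout=10) as sock:
        # server_hostname sets SNI and the name the cert is checked against
        with ctx.wrap_socket(sock, server_hostname=hostname) as tls:
            # raises ssl.SSLCertVerificationError if the cert is invalid
            return tls.getpeercert()

# usage (with the hypothetical IP from the example above):
#   check_cert("104.198.14.52", "jvns.ca")
```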

I know how to use curl --resolve because my coworker told me how. And I learned that to find out when a cert expires you can do openssl x509 -in YOURCERT.pem -text -noout the same way. So the goal with this zine is basically to be “your very helpful coworker who gives you tips about how to use networking tools” in case you don’t have that person.
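As a rough Python equivalent of that openssl tip (again, my own sketch rather than anything from the zine – it checks a live server instead of a .pem file, since the stdlib can’t easily parse certificate fields out of a local PEM):

```python
import socket
import ssl
import time

def days_until_expiry(hostname, port=443):
    """Fetch a server's certificate and report how many days until it
    expires -- roughly the notAfter field you'd read out of
    `openssl x509 -in YOURCERT.pem -text -noout`."""
    ctx = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=hostname) as tls:
            # notAfter looks like 'Jun  1 12:00:00 2025 GMT'
            not_after = tls.getpeercert()["notAfter"]
    expires = ssl.cert_time_to_seconds(not_after)  # parse to a Unix timestamp
    return (expires - time.time()) / 86400

# usage:
#   days_until_expiry("jvns.ca")
```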

And as we know, a lot of these tools have VERY LONG man pages and you only usually need to know like 5 command line options to do 90% of what you want to do. For example I only ever do maybe 4 things with openssl even though the openssl man pages together have more than 60,000 words.

There are a few things I’m also adding (like ethtool and nmap and tc) which I don’t personally use super often but I think are super useful to people with different jobs than me. And I’m a big fan of mixing more advanced things (like tc) with basic things (like ssh) because then even if you’re learning the basic things for the first time, you can learn that the advanced thing exists!

Here’s some work in progress:

It’s been super fun to draw these: I didn’t know about ssh-copy-id or ~. before I made that ssh comic and I really wish I’d known about them earlier!

As usual I’ll announce the zine when it comes out here, or you can sign up for announcements at https://wizardzines.com/mailing-list/.

### A few early marketing thoughts

At some point last month I said I might write more about business, so here are some very early marketing thoughts for my zine business (https://wizardzines.com!). The question I’m trying to make some progress on in this post is: “how to do marketing in a way that feels good?”

### what’s the point of marketing?

Okay! What’s marketing? What’s the point? I think the ideal way marketing works is:

1. you somehow tell a person about a thing
2. you explain somehow why the thing will be useful to them / why it is good
3. they buy it and they like the thing because it’s what they expected

(or, when you explain it they see that they don’t want it and don’t buy it which is good too!!)

So basically as far as I can tell good marketing is just explaining what the thing is and why it is good in a clear way.

### what internet marketing techniques do people use?

I’ve been thinking a bit about internet marketing techniques I see people using on me recently. Here are a few examples of internet marketing techniques I’ve seen:

1. word of mouth (“have you seen this cool new thing?!”)
2. email marketing (“build a mailing list with a bajillion people on it and sell to them”)
3. email marketing (“tell your existing users about features that they already have that they might want to use”)
4. social proof marketing (“jane from georgia bought a sweater”), eg fomo.com
5. content marketing (which is fine but whenever people refer to my writing as ‘content’ I get grumpy :))

Something that is definitely true about marketing is that you need some way to tell new people about the thing you are doing. So for me when I’m thinking about running a business it’s less about “should i do marketing” and more like “well obviously i have to do marketing, how do i do it in a way that i feel good about?”

### what’s up with email marketing?

I feel like every single piece of internet marketing advice I read says “you need a mailing list”. This is advice that I haven’t really taken to heart – technically I have 2 mailing lists:

1. the RSS feed for this blog, which sends out new blog posts to a mailing list for folks who don’t use RSS (which 3000 of you get)
2. https://wizardzines.com's list, for comics / new zine announcements (780 people subscribe to that! thank you!)

but definitely neither of them is a Machine For Making Sales and I’ve put in almost no effort in that direction yet.

here are a few things I’ve noticed about marketing mailing lists:

• most marketing mailing lists are boring but some marketing mailing lists are actually interesting! For example I kind of like amy hoy’s emails.
• Someone told me recently that they have 200,000 people on their mailing list (?!!) which made the “a mailing list is a machine for making money” concept make a lot more sense to me. I wonder if people who make a lot of money from their mailing lists all have huge 10k+ person mailing lists like this?

### what works for me: twitter

Right now for my zines business I’d guess maybe 70% of my sales come from Twitter. The main thing I do is tweet pages from zines I’m working on (for example: yesterday’s comic about ss). The comics are usually good and fun so invariably they get tons of retweets, which means that I end up with lots of followers, which means that when I later put up the zine for sale lots of people will buy it.

And of course people don’t have to buy the zines, I post most of what ends up in my zines on twitter for free, so it feels like a nice way to do it. Everybody wins, I think.

(side note: when I started getting tons of new followers from my comics I was actually super worried that it would make my experience of Twitter way worse. That hasn’t happened! the new followers all seem totally reasonable and I still get a lot of really interesting twitter replies which is wonderful ❤)

I don’t try to hack/optimize this really: I just post comics when I make them and I try to make them good.

### a small Twitter innovation: putting my website on the comics

Here’s one small marketing change that I made that I think makes sense!

After a while, I realized people were asking me all the time “hey, can I buy a book/collection? where do these come from? how do I get more?”! I think a marketing secret is “people actually want to buy things that are good, it is useful to tell people where they can buy things that are good”.

So just recently I’ve started adding my website and a note about my current project on the comics I post on Twitter. It doesn’t say much: just “❤ these comics? buy a collection! wizardzines.com” and “page 11 of my upcoming bite size networking zine”. Here’s what it looks like:

I feel like this strikes a pretty good balance between “julia you need to tell people what you’re doing otherwise how are they supposed to buy things from you” and “omg too many sales pitches everywhere”? I’ve only started doing this recently so we’ll see how it goes.

### should I work on a mailing list?

It seems like the same thing that works on twitter would work by email if I wanted to put in the time (email people comics! when a zine comes out, email them about the zine and they can buy it if they want!).

One thing I LOVE about Twitter though is that people always reply to the comics I post with their own tips and tricks that they love and I often learn something new. I feel like email would be nowhere near as fun :)

But I still think this is a pretty good idea: keeping up with twitter can be time consuming and I bet a lot of people would like to get occasional email with programming drawings. (would you?)

One thing I’m not sure about is – a lot of marketing mailing lists seem to use somewhat aggressive techniques to get new emails (a lot of popups on a website, or adding everyone who signs up to their service / buys a thing to a marketing list) and while I’m basically fine with that (unsubscribing is easy!), I’m not sure that it’s what I’d want to do, and maybe less aggressive techniques will work just as well? We’ll see.

### should I track conversion rates?

A piece of marketing advice I assume people give a lot is “be data driven, figure out what things convert the best, etc”. I don’t do this almost at all – gumroad used to tell me that most of my sales came from Twitter which was good to know, but right now I have basically no idea how it works.

Doing a bunch of work to track conversion rates feels bad to me: it seems like it would be really easy to go down a dumb rabbit hole of “oh, let’s try to increase conversion by 5%” instead of just focusing on making really good and cool things.

My guess is that what will work best for me for a while is to have some data that tells me in broad strokes how the business works (like “about 70% of sales come from twitter”) and just leave it at that.

### what about ads?

• julia: “wait, are ads marketing?”
• kamal: “yes ads are marketing”

Some questions I have about ads:

• how do you choose what keywords to advertise on?
• are there actually cheap keywords, like is ‘file descriptors’ cheap?
• how much do you need to pay per click? (for some weird linux keywords, google estimated 20 cents a click?)
• can you use ads effectively for something that costs $10?

This seems nontrivial to learn about and I don’t think I’m going to try soon.

### other marketing things

a few other things I’ve thought about:

• I learned about “social proof marketing” sites like fomo.com yesterday which make popups on your site like “someone bought COOL THING 3 hours ago”. This seems like it has some utility (people are actually buying things from me all the time, maybe that’s useful to share somehow?) but those popups feel a bit cheap to me and I don’t really think it’s something I’d want to do right now.
• similarly a lot of sites like to inject these popups like “HELLO PLEASE SIGN UP FOR OUR MAILING LIST”. similar thoughts. I’ve been putting an email signup link in the footer which seems like a good balance between discoverable and annoying. As an example of a popup which isn’t too intrusive, though: nate berkopec has one on his site which feels really reasonable! (scroll to the bottom to see it)

Maybe marketing is all about “make your things discoverable without being annoying”? :)

### that’s all!

Hopefully some of this was interesting! Obviously the most important thing in all of this is to make cool things that are useful to people, but I think cool useful writing does not actually sell itself! If you have thoughts about what kinds of marketing have worked well for you / you’ve felt good about I would love to hear them!

### Some nonparametric statistics math

I’m trying to understand nonparametric statistics a little more formally. This post may not be that intelligible because I’m still pretty confused about nonparametric statistics, there is a lot of math, and I make no attempt to explain any of the math notation. I’m working towards being able to explain this stuff in a much more accessible way but first I would like to understand some of the math! There’s some MathJax in this post so the math may or may not render in an RSS reader.
Some questions I’m interested in:

• what is nonparametric statistics exactly?
• what guarantees can we make? are there formulas we can use?
• why do methods like the bootstrap method work?

since these notes are from reading a math book and math books are extremely dense this is basically going to be “I read 7 pages of this math book and here are some points I’m confused about”

### what’s nonparametric statistics?

Today I’m looking at “all of nonparametric statistics” by Larry Wasserman. He defines nonparametric inference as:

a set of modern statistical methods that aim to keep the number of underlying assumptions as weak as possible

Basically my interpretation of this is that – instead of assuming that your data comes from a specific family of distributions (like the normal distribution) and then trying to estimate the parameters of that distribution, you don’t make many assumptions about the distribution (“this is just some data!!”). Not having to make assumptions is nice!

There aren’t no assumptions though – he says we assume that the distribution $F$ lies in some set $\mathfrak{F}$ called a statistical model. For example, when estimating a density $f$, we might assume that

$$f \in \mathfrak{F} = \left\{ g : \int(g^{\prime\prime}(x))^2dx \leq c^2 \right\}$$

which is the set of densities that are not “too wiggly”. I have not too much intuition for the condition $\int(g^{\prime\prime}(x))^2dx \leq c^2$. I calculated that integral for the normal distribution on wolfram alpha and got 4, which is a good start. (4 is not infinity!)

some questions I still have about this definition:

• what’s an example of a probability density function that doesn’t satisfy the $\int(g^{\prime\prime}(x))^2dx \leq c^2$ condition? (probably something with an infinite number of tiny wiggles, and I don’t think any distribution I’m interested in in practice would have an infinite number of tiny wiggles?)
• why does the density function being “too wiggly” cause problems for nonparametric inference?
very unclear as yet.

### we still have to assume independence

One assumption we won’t get away from is that the samples in the data we’re dealing with are independent. Often data in the real world actually isn’t really independent, but I think what people do a lot of the time is to make a good effort at something approaching independence and then close your eyes and pretend it is?

### estimating the density function

Okay! Here’s a useful section! Let’s say that I have 100,000 data points from a distribution. I can draw a histogram like this of those data points:

If I have 100,000 data points, it’s pretty likely that that histogram is pretty close to the actual distribution. But this is math, so we should be able to make that statement precise, right?

For example suppose that 5% of the points in my sample are more than 100. Is the probability that a point is greater than 100 actually 0.05? The book gives a nice formula for this:

$$\mathbb{P}(|\widehat{P}_n(A) - P(A)| > \epsilon ) \leq 2e^{-2n\epsilon^2}$$

(by “Hoeffding’s inequality”, which I’d never heard of before). Fun aside about that inequality: here’s a nice jupyter notebook by henry wallace using it to identify the most common Boggle words.

here, in our example:

• n is 100,000 (the number of data points we have)
• $A$ is the set of points more than 100
• $\widehat{P}_n(A)$ is the empirical probability that a point is more than 100 (0.05)
• $P(A)$ is the actual probability
• $\epsilon$ is how close we want the empirical probability to be to the real one

So, what’s the probability that the real probability is between 0.04 and 0.06? $\epsilon = 0.01$, so it’s $2e^{-2 \times 100,000 \times (0.01)^2} = 4e^{-9}$ ish (according to wolfram alpha)

here is a table of how sure we can be:

• 100,000 data points: 4e-9 (TOTALLY CERTAIN that 4% - 6% of points are more than 100)
• 10,000 data points: 0.27 (27% probability that we’re wrong! that’s… not bad?)
• 1,000 data points: 1.6 (we know the probability we’re wrong is less than… 160%? that’s not good!)
• 100 data points: lol

so basically, in this case, using this formula: 100,000 data points is AMAZING, 10,000 data points is pretty good, and 1,000 is much less useful. If we have 1000 data points and we see that 5% of them are more than 100, we DEFINITELY CANNOT CONCLUDE that 4% to 6% of points are more than 100. But (using the same formula) we can use $\epsilon = 0.04$ and conclude that with 92% probability 1% to 9% of points are more than 100. So we can still learn some stuff from 1000 data points!

This intuitively feels pretty reasonable to me – like it makes sense to me that if you have NO IDEA what your distribution is, then with 100,000 points you’d be able to make quite strong inferences, and with 1000 you can do a lot less!

### more data points are exponentially better?

One thing that I think is really cool about this estimating-the-density-function formula is that how sure you can be of your inferences scales exponentially with the size of your dataset (this is the $e^{-n\epsilon^2}$). And also exponentially with the square of how sure you want to be (so wanting to be sure within 0.01 is VERY DIFFERENT than within 0.04). So 100,000 data points isn’t 10x better than 10,000 data points, it’s actually like 10000000000000x better.

Is that true in other places? If so that seems like a super useful intuition! I still feel pretty uncertain about this, but having some basic intuition about “how much more useful is 10,000 data points than 1,000 data points?” feels like a really good thing.

### some math about the bootstrap

The next chapter is about the bootstrap! Basically the way the bootstrap works is:

1. you want to estimate some statistic (like the median) of your distribution
2. the bootstrap lets you get an estimate and also the variance of that estimate
3. you do this by repeatedly sampling with replacement from your data and then calculating the statistic you want (like the median) on your samples

I’m not going to go too much into how to implement the bootstrap method because it’s explained in a lot of places on the internet. Let’s talk about the math!

I think in order to say anything meaningful about bootstrap estimates I need to learn a new term: a consistent estimator.

### What’s a consistent estimator?

Wikipedia says:

In statistics, a consistent estimator or asymptotically consistent estimator is an estimator — a rule for computing estimates of a parameter $\theta_0$ — having the property that as the number of data points used increases indefinitely, the resulting sequence of estimates converges in probability to $\theta_0$.

This includes some terms where I forget what they mean (what’s “converges in probability” again?). But this seems like a very good thing! If I’m estimating some parameter (like the median), I would DEFINITELY LIKE IT TO BE TRUE that if I do it with an infinite amount of data then my estimate works. An estimator that is not consistent does not sound very useful!

### why/when are bootstrap estimators consistent?

spoiler: I have no idea. The book says the following:

Consistency of the bootstrap can now be expressed as follows.

3.19 Theorem. Suppose that $\mathbb{E}(X_1^2) < \infty$. Let $T_n = g(\overline{X}_n)$ where $g$ is continuously differentiable at $\mu = \mathbb{E}(X_1)$ and that $g^\prime(\mu) \neq 0$. Then,

$$\sup_u \left| \mathbb{P}_{\widehat{F}_n} \left( \sqrt{n} \left( T( \widehat{F}_n^*) - T( \widehat{F}_n) \right) \leq u \right) - \mathbb{P}_{F} \left( \sqrt{n} \left( T( \widehat{F}_n) - T(F) \right) \leq u \right) \right| \rightarrow^\text{a.s.} 0$$

3.21 Theorem. Suppose that $T(F)$ is Hadamard differentiable with respect to $d(F,G)= \sup_x|F(x)-G(x)|$ and that $0 < \int L^2_F(x) dF(x) < \infty$. Then,

$$\sup_u \left| \mathbb{P}_{\widehat{F}_n} \left( \sqrt{n} \left( T( \widehat{F}_n^*) - T( \widehat{F}_n) \right) \leq u \right) - \mathbb{P}_{F} \left( \sqrt{n} \left( T( \widehat{F}_n) - T(F) \right) \leq u \right) \right| \rightarrow^\text{P} 0$$

things I understand about these theorems:

• the two formulas they’re concluding are the same, except I think one is about convergence “almost surely” and one about “convergence in probability”. I don’t remember what either of those mean.
• I think for our purposes of doing Regular Boring Things we can replace “Hadamard differentiable” with “differentiable”
• I think they don’t actually show the consistency of the bootstrap, they’re actually about consistency of the bootstrap confidence interval estimate (which is a different thing)

I don’t really understand how they’re related to consistency, and in particular the $\sup_u$ thing is weird, like if you’re looking at $\mathbb{P}(\text{something} < u)$, wouldn’t you want to minimize $u$ and not maximize it? Maybe it’s a typo and it should be $\inf_u$?

it concludes:

there is a tendency to treat the bootstrap as a panacea for all problems. But the bootstrap requires regularity conditions to yield valid answers. It should not be applied blindly.

### this book does not seem to explain why the bootstrap is consistent

In the appendix (3.7) it gives a sketch of a proof for showing that estimating the median using the bootstrap is consistent. I don’t think this book actually gives a proof anywhere that bootstrap estimates in general are consistent, which was pretty surprising to me. It gives a bunch of references to papers. Though I guess bootstrap confidence intervals are the most important thing?

### that’s all for now

This is all extremely stream of consciousness and I only spent 2 hours trying to work through this, but some things I think I learned in the last couple hours are:

1. maybe having more data is exponentially better? (is this true??)
2. “consistency” of an estimator is a thing, not all estimators are consistent
3. understanding when/why nonparametric bootstrap estimators are consistent in general might be very hard (the proof that the bootstrap median estimator is consistent already seems very complicated!)
4. bootstrap confidence intervals are not the same thing as bootstrap estimators. Maybe I’ll learn the difference next!

### 2018: Year in review

I wrote these in 2015 and 2016 and 2017 and it’s always interesting to look back at them, so here’s a summary of what went on in my side projects in 2018.

### ruby profiler!

At the beginning of this year I wrote rbspy (docs: https://rbspy.github.io/). It inspired a Python version called py-spy and a PHP profiler called phpspy, both of which are excellent. I think py-spy in particular is probably better than rbspy which makes me really happy.

Writing a program that does something innovative (top for your Ruby program’s functions!) and inspiring other people to make amazing new tools is something I’m really proud of.

### started a side business!

A very surprising thing that happened in 2018 is that I started a business! This is the website: https://wizardzines.com/, and I sell programming zines. It’s been astonishingly successful (it definitely made me enough money that I could have lived on just the revenue from the business this year), and I’m really grateful to everyone who’s supported that work. I hope the zines have helped you.

I always thought that it was impossible to make anywhere near as much money teaching people useful things as I can as a software developer, and now I think that’s not true. I don’t think that I’d want to make that switch (I like working as a programmer!), but now I actually think that if I was serious about it and was interested in working on my business skills, I could probably make it work.

I don’t really know what’s next, but I plan to write at least one zine next year.
I learned a few things about business this year, mainly from a few business books. I used to think that sales / marketing had to be gross, but reading some of these business books made me think that it’s actually possible to run a business by being honest & just building good things.

### work!

this is mostly about side projects, but a few things about work:

• I still have the same manager (jay). He’s been really great to work with. The help! i have a manager! zine is secretly largely things I learned from working with him.
• my team made some big networking infrastructure changes and it went pretty well. I learned a lot about proxies/TLS and a little bit about C++.
• I mentored another intern, and the intern I mentored last year joined us full time!

When I go back to work I’m going to switch to working on something COMPLETELY DIFFERENT (writing code that sends messages to banks!) for 3 months. It’s a lot closer to the company’s core business, and I think it’ll be neat to learn more about how financial infrastructure works.

I struggled a bit with understanding/defining my job this year. I wrote What’s a senior engineer’s job? about that, but I have not yet reached enlightenment.

### talks!

I gave 4 talks in 2018:

• So you want to be a wizard at StarCon
• Building a Ruby profiler at the Recurse Center’s localhost series
• Build Impossible Programs in May at Deconstruct.
• High Reliability Infrastructure Migrations at Kubecon. I’m pretty happy about this talk because I’ve wanted to give a good talk about what I do at work for a long time and I think I finally succeeded. Previously when I gave talks about my work I think I fell into the trap of just describing what we do (“we do X Y Z” … “okay, so what?”). With this one, I think I was able to actually say things that were useful to other people.

In past years I’ve mostly given talks which can mostly be summarized “here are some cool tools” and “here is how to learn hard things”. This year I changed focus to giving talks about the actual work I do – there were two talks about building a Ruby profiler, and one about what I do at work (I spend a lot of time on infrastructure migrations!)

I’m not sure whether I’ll give any talks in 2019. I travelled more than I wanted to in 2018, and to stay sane I ended up having to cancel on a talk I was planning to give with relatively short notice, which wasn’t good.

### podcasts!

I also experimented a bit with a new format: the podcast! These were basically all really fun! They don’t take that long (about 2 hours total?).

what I learned about doing podcasts:

• It’s really important to give the hosts a list of good questions to ask, and to be prepared to give good answers to those questions! I’m not a super polished podcast guest.
• you need a good microphone. At least one of these people told me I actually couldn’t be on their podcast unless I had a good enough microphone, so I bought a medium fancy microphone. It wasn’t too expensive and it’s nice to have a better quality microphone! Maybe I will use it more to record audio/video at some point!

### !!Con

I co-organized !!Con for the 4th time – I ran sponsorships. It’s always such a delight and the speakers are so great. !!Con is expanding to the west coast in 2019 – I’m not directly involved with that but it’s going to be amazing.

### blog posts!

I apparently wrote 54 blog posts in 2018. A couple of my favourites are What’s a senior engineer’s job?, How to teach yourself hard things, and batch editing files with ed.

There were basically 4 themes in blogging for 2018:

• progress on the rbspy project while I was working on it (this category)
• computer networking / infrastructure engineering (basically all I did at work this year was networking, though I didn’t write about it as much as I might have)
• musings about zines / business / developer education, for instance why sell zines? and who pays to educate developers?
• a few of the usual “how do you learn things” / “how do you succeed at your job” posts as I figure things out about that, for instance working remotely, 4 years in

### a tiny inclusion project: a guide to performance reviews

Last year in addition to my actual job, I did a couple of projects at work towards helping make sure the performance/promotion process works well for folks – I collaborated with the amazing karla on the idea of a “brag document”, and redid our engineering levels. This year, in the same vein, I wrote a document called the “Unofficial guide to the performance reviews”. A lot of folks said it helped them but probably it’s too early to celebrate.

I think explaining to folks how the performance review process actually works and how to approach it is really valuable and I might try to publish a more general version here at some point. I like that I work at a place where it’s possible/encouraged to do projects like this. I spend a relatively small amount of time on them (maybe I spent 15 hours on this one?) but it feels good to be able to make tiny steps towards building a better workplace from time to time. It’s really hard to judge the results though!

### conclusions?

some things that worked in 2018:

• setting boundaries around what my job is
• doing open source work while being paid for it
• starting a side business
• doing small inclusion projects at work
• writing zines is very time consuming but I feel happy about the time I spent on that
• blogging is always great

### New talk: High Reliability Infrastructure Migrations

On Tuesday I gave a talk at KubeCon called High Reliability Infrastructure Migrations. The abstract was:

For companies with high availability requirements (99.99% uptime or higher), running new software in production comes with a lot of risks. But it’s possible to make significant infrastructure changes while maintaining the availability your customers expect! I’ll give you a toolbox for derisking migrations and making infrastructure changes with confidence, with examples from our Kubernetes & Envoy experience at Stripe.

## video

### slides

Here are the slides:

since everyone always asks, I drew them in the Notability app on an iPad. I do this because it’s faster than trying to use regular slides software and I can make better slides.

## a few notes

Here are a few links & notes about things I mentioned in the talk

### skycfg: write functions, not YAML

I talked about how my team is working on non-YAML interfaces for configuring Kubernetes. The demo is at skycfg.fun, and it’s on GitHub here. It’s based on Starlark, a configuration language that’s a subset of Python.

My coworker John has promised that he’ll write a blog post about it at some point, and I’m hoping that’s coming soon :)

### no haunted forests

I mentioned a deploy system rewrite we did. John has a great blog post about when rewrites are a good idea and how he approached that rewrite called no haunted forests.

### ignore most kubernetes ecosystem software

One small point that I made in the talk was that on my team we ignore almost all software in the Kubernetes ecosystem so that we can focus on a few core pieces (Kubernetes & Envoy, plus some small things like kiam). I wanted to mention this because I think often in Kubernetes land it can seem like everyone is using Cool New Things (helm! istio! knative! eep!). I’m sure those projects are great but I find it much simpler to stay focused on the basics and I wanted people to know that it’s okay to do that if that’s what works for your company.

I think the reality is that actually a lot of folks are still trying to work out how to use this new software in a reliable and secure way.

### other talks

I haven’t watched other Kubecon talks yet, but here are 2 links: I heard good things about this keynote from melanie cebula about kubernetes at airbnb, and I’m excited to see this talk about kubernetes security. The slides from that security talk look useful.

Also I’m very excited to see Kelsey Hightower’s keynote as always, but that recording isn’t up yet. If you have other Kubecon talks to recommend I’d love to know what they are.

### my first work talk I’m happy with

I usually give talks about debugging tools, or side projects, or how I approach my job at a high level – not on the actual work that I do at my job. What I talked about in this talk is basically what I’ve been learning how to do at work for the last ~2 years. Figuring out how to make big infrastructure changes safely took me a long time (and I’m not done!), and so I hope this talk helps other folks do the same thing.

### How do you document a tech project with comics?

Every so often I get email from people saying basically “hey julia! we have an open source project! we’d like to use comics / zines / art to document our project! Can we hire you?”.

spoiler: the answer is “no, you can’t hire me” – I don’t do commissions. But I do think this is a cool idea and I’ve often wished I had something more useful to say to people than “no”, so if you’re interested in this, here are some ideas about how to accomplish it!

### zine != drawing

First, a terminology distinction. One weird thing I’ve noticed is that people frequently refer to individual tech drawings as “zines”. I think this is due to me communicating poorly somehow, but – drawings are not zines! A zine is a printed booklet, like a small magazine. You wouldn’t call a photo of a model in Vogue a magazine! The magazine has like a million pages! An individual drawing is a drawing/comic/graphic/whatever. Just clarifying this because I think it causes a bit of unnecessary confusion.
### comics without good information are useless

Usually when folks ask me “hey, could we make a comic explaining X?”, it doesn’t seem like they have a clear idea of exactly what information they want to get across – they just have a vague idea that maybe it would be cool to draw some comics. This makes sense – figuring out what information would be useful to tell people is very hard!! It’s 80% of what I spend my time on when making comics.

You should think about comics the same way as any kind of documentation – start with the information you want to convey, who your target audience is, and how you want to distribute it (twitter? on your website? in person?), and figure out how to illustrate it after :). The information is the main thing, not the art! Once you have a clear story about what you want to get across, you can start thinking about how to represent it using illustrations!

### focus on concepts that don’t change

Drawing comics is a much bigger investment than writing documentation (it takes me like 5x longer to convey the same information in a comic than in writing). So use it wisely! Because comics aren’t that easy to edit, if you’re going to make something a comic you want to focus on concepts that are very unlikely to change. So talk about the core ideas in your project instead of the exact command line arguments it takes!

Here are a couple of options for how you could use comics/illustrations to document your project!

### option 1: a single graphic

One format you might want to try is a single, small graphic explaining what your project is about and why folks might be interested in it. For example: this zulip comic. This is a short thing – you could post it on Twitter or print it as a pamphlet to give out. The information content here would probably be basically what’s on your project homepage, but presented in a more fun/exciting way :). You can put a pretty small amount of information in a single comic.
With that Zulip comic, the things I picked out were:

• zulip is sort of like slack, but it has threads
• it’s easy to keep track of threads even if the conversation takes place over several days
• you can much more easily selectively catch up with Zulip
• zulip is open source
• there’s an open zulip server you can try out

That’s not a lot of information! It’s 50 words :). So to do this effectively you need to distill your project down to 50 words in a way that’s still useful. It’s not easy!

### option 2: many comics

Another approach you can take is to make a more in-depth comic / illustration, like google’s guide to kubernetes or the children’s illustrated guide to kubernetes. To do this, you need a much stronger concept than “uh, I want to explain our project” – you want to have a clear target audience in mind! For example, if I were drawing a set of Docker comics, I’d probably focus on folks who want to use Docker in production. So I’d want to discuss:

• publishing your containers to a public/private registry
• some best practices for tagging your containers
• how to make sure your hosts don’t run out of disk space from downloading too many containers
• how to use layers to save on disk space / download less stuff
• whether it’s reasonable to run the same containers in production & in dev

That’s totally different from the set of comics I’d write for folks who just want to use Docker to develop locally!

### option 3: a printed zine

The main thing that differentiates this from “many comics” is that zines are printed! Because of that, for this to make sense you need to have a place to give out the printed copies! Maybe you’re going to present your project at a major conference? Maybe you give workshops about your project and want to give out the zine to folks in the workshop as notes? Maybe you want to mail it to people?

### how to hire someone to help you

There are basically 3 ways to hire someone:

1. Hire someone who both understands (or can quickly learn) the technology you want to document and can illustrate well. These folks are tricky to find and probably expensive (I certainly wouldn’t do a project like this for less than $10,000 even if I did do commissions), just because programmers can usually charge a pretty high consulting rate. I’d guess that the main failure mode here is that it might be impossible/very hard to find someone, and it might be expensive.
2. Collaborate with an illustrator to draw it for you. The main failure mode here is that if you don’t give the illustrator clear explanations of your tech to work with, you won’t end up with a clear and useful explanation. From what I’ve seen, most folks underinvest in writing clear explanations for their illustrators – I’ve seen a few really adorable tech comics that I don’t find useful or clear at all. I’d love to see more people do a better job of this. What’s the point of having an adorable illustration if it doesn’t teach anyone anything? :)
3. Draw it yourself :). This is what I do, obviously. stick figures are okay!

Most people seem to use method #2 – I’m not actually aware of any tech folks who have done commissioned comics (though I’m sure it’s happened!). I think method #2 is a great option and I’d love to see more folks do it. Paying illustrators is really fun!

### An example of how C++ destructors are useful in Envoy

For a while now I’ve been working with a C++ project (Envoy), and sometimes I need to contribute to it, so my C++ skills have gone from “nonexistent” to “really minimal”. I’ve learned what an initializer list is and that a method starting with ~ is a destructor. I almost know what an lvalue and an rvalue are but not quite.

But the other day when writing some C++ code I figured out something exciting about how to use destructors that I hadn’t realized! (the tl;dr of this post for people who know C++ is “julia finally understands what RAII is and that it is useful” :))

### what’s a destructor?

C++ has objects. When a C++ object goes out of scope, the compiler inserts a call to its destructor. So if you have some code like:

int do_thing() {
    Thing x{}; // this calls the Thing constructor
    return 2;
}


there will be a call to x’s destructor at the end of the do_thing function. So the code C++ generates looks something like:

• make a new Thing
• set up the return value (2)
• call the Thing’s destructor
• return

Obviously destructors are way more complicated than this. They need to get called when there are exceptions! And sometimes they get called manually. And in lots of other situations too. But there are 10 million things to know about C++ and that is not what we’re doing today, we are just talking about one thing.

### what happens in a destructor?

A lot of the time memory gets freed, which is how you avoid having memory leaks. But that’s not what we’re talking about in this post! We are talking about something more interesting.

### the thing we’re interested in: Envoy circuit breakers

So I’ve been working with Envoy a lot. 3 second Envoy refresher: it’s an HTTP proxy; your application makes requests to Envoy, which then proxies the request to the servers the application wants to talk to.

One very useful feature Envoy has is this thing called “circuit breakers”. Basically, the idea is that if your application makes 50 billion connections to a service, that will probably overwhelm the service. So Envoy keeps track of how many TCP connections you’ve made to a service, and will stop you from making new requests if you hit the limit. The default max_connections limit is 1024.

### how do you track connection count?

To enforce a circuit breaker on the number of TCP connections, you need to keep an accurate count of how many TCP connections are currently open! How do you do that? Well, the way it works is to maintain a connections counter and:

• every time a connection is opened, increment the counter
• every time a connection is destroyed (because of a reset / timeout / whatever), decrement the counter
• when creating a new connection, check that the connections counter is not over the limit

That’s all! And incrementing the counter when creating a new connection is pretty easy. But how do you make sure that the counter gets decremented when the connection is destroyed? Connections can be destroyed in a lot of ways (they can time out! they can be closed by Envoy! they can be closed by the server! maybe something else I haven’t thought of could happen!) and it seems very easy to accidentally miss a way of closing them.

### destructors to the rescue

The way Envoy solves this problem is to create a connection object (called ActiveClient in the HTTP connection pool) for every connection.

Then it:

• increments the counter in the constructor (code)
• decrements the counter in the destructor (code)
• checks the counter when a new connection is created (code)

The beauty of this is that now you don’t need to make sure that the counter gets decremented in all the right places, you now just need to organize your code so that the ActiveClient object’s destructor gets called when the connection has closed.

Where does the ActiveClient destructor get called in Envoy? Well, Envoy maintains 2 lists of clients (ready_clients and busy_clients), and when a connection gets closed, Envoy removes the client from those lists. And when it does that, it doesn’t need to do any extra cleanup!! In C++, anytime an object is removed from a list, its destructor is called. So client.removeFromList(ready_clients_); takes care of all the cleanup. And there’s no chance of forgetting to decrement the counter!! It will definitely always happen, unless you accidentally leave the object on one of those lists – which would be a bug anyway, because the connection is closed :)

### RAII

This pattern Envoy is using here is an extremely common C++ programming pattern called “resource acquisition is initialization” (RAII). I find that name very confusing, but that’s what it’s called. Basically, the way it works is:

• identify a resource (like “connection”) where a lot of things need to happen when the connection is initialized / finished
• make a class for that connection
• put all the initialization / finishing code in the constructor / destructor
• make sure the object’s destructor method gets called when appropriate! (by removing it from a vector / having it go out of scope)

Previously I knew about using this pattern for kind of obvious things (make sure all the memory gets freed in the destructor, or make sure file descriptors get closed). But I didn’t realize it was also useful for cases that are slightly less obviously a resource like “decrement a counter”.

The reason this pattern works is because the C++ compiler/standard library does a bunch of work to make sure that destructors get called when you’re done with an object – the compiler inserts destructor calls at the end of each block of code, after exceptions, and many standard library collections make sure destructors are called when you remove an object from a collection.

### RAII gives you prompt, deterministic, and hard-to-screw-up cleanup of resources

The exciting thing here is that this programming pattern gives you a way to schedule cleaning up resources that’s:

• easy to ensure always happens (when the object goes away, it always happens, even if there was an exception!)
• prompt & deterministic (it happens right away, and it’s guaranteed to happen!)

### what languages have RAII?

C++ and Rust have RAII. Probably other languages too. Java, Python, Go, and garbage collected languages in general do not. In a garbage collected language you can often set up finalizers to be run when the object is GC’d. But often (like in this case, with the connection count) you want things to be cleaned up right away when the object is no longer in use, not some indeterminate period later whenever GC happens to run.

Python context managers are a related idea, you could do something like:

with conn_pool.connection() as conn:
    do_stuff(conn)  # the connection is cleaned up when the with block exits


### that’s all for now!

Hopefully this explanation of RAII is interesting and mostly correct. Thanks to Kamal for clarifying some RAII things for me!

### Some notes on running new software in production

I’m working on a talk for kubecon in December! One of the points I want to get across is the amount of time/investment it takes to use new software in production without causing really serious incidents, and what that’s looked like for us in our use of Kubernetes.

To start out, this post isn’t blanket advice. There are lots of times when it’s totally fine to just use software and not worry about how it works exactly. So let’s start by talking about when it’s important to invest.

### when it matters: 99.99%

If you’re running a service with a low SLO like 99% I don’t think it matters that much to understand the software you run in production. You can be down for like 2 hours a month! If something goes wrong, just fix it and it’s fine.

At 99.99%, it’s different. That’s 45 minutes / year of downtime, and if you find out about a serious issue for the first time in production it could easily take you 20 minutes or more to revert the change. That’s half your uptime budget for the year!

### when it matters: software that you’re using heavily

Also, even if you’re running a service with a 99.99% SLO, it’s impossible to develop a super deep understanding of every single piece of software you’re using. For example, a web service might use:

• 100 library dependencies
• the filesystem (so there’s linux filesystem code!)
• the network (linux networking code!)
• a database (like postgres)
• a proxy (like nginx/haproxy)

If you’re only reading like 2 files from disk, you don’t need to do a super deep dive into Linux filesystem internals – you can just read the files from disk.

What I try to do in practice is identify the components we rely on the most (or have the most unusual use cases for!), and invest time into understanding those. These are usually pretty easy to identify, because they’re the ones which will cause the most problems :)

### when it matters: new software

Understanding your software especially matters for newer/less mature software projects, because they’re more likely to have bugs, or just not to have matured enough to be used by most people without worry. I’ve spent a bunch of time recently with Kubernetes/Envoy, which are both relatively new projects, and neither of them is remotely in the category of “oh, it’ll just work, don’t worry about it”. I’ve spent many hours debugging weird surprising edge cases with both of them and learning how to configure them in the right way.

### a playbook for understanding your software

The playbook for understanding the software you run in production is pretty simple. Here it is:

1. Start using it in production in a non-critical capacity (by sending a small percentage of traffic to it, on a less critical service, etc)
2. Let that bake for a few weeks.
3. Run into problems.
4. Fix the problems. Go to step 3.

Repeat until you feel like you have a good handle on this software’s failure modes and are comfortable running it in a more critical capacity. Let’s talk about that in a little more detail, though:

### what running into bugs looks like

For example, I’ve been spending a lot of time with Envoy in the last year. Some of the issues we’ve seen along the way are: (in no particular order)

• One of the default settings resulted in retry & timeout headers not being respected
• Envoy (as a client) doesn’t support TLS session resumption, so servers with a large amount of Envoy clients get DDOSed by TLS handshakes
• Envoy’s active healthchecking means that your services get healthchecked by every client. This is mostly okay, but (again) services with many clients can get overwhelmed by it.
• Having every client independently healthcheck every server interacts somewhat poorly with services which are under heavy load, and can exacerbate performance issues by removing up-but-slow servers from the load balancer rotation.
• Envoy doesn’t retry failed connections by default
• it frequently segfaults when given incorrect configuration
• various issues with it segfaulting because of resource leaks / memory safety issues
• hosts running out of disk space because we didn’t rotate Envoy log files often enough

A lot of these aren’t bugs – they’re just cases where we expected the default configuration to do one thing, and it did another. This happens all the time, and it can result in really serious incidents. Figuring out how to configure a complicated piece of software appropriately takes a lot of time, and you just have to account for that.

And Envoy is great software! The maintainers are incredibly responsive, they fix bugs quickly and its performance is good. It’s overall been quite stable and it’s done well in production. But just because something is great software doesn’t mean you won’t also run into 10 or 20 relatively serious issues along the way that need to be addressed in one way or another. And it’s helpful to understand those issues before putting the software in a really critical place.

### try to have each incident only once

My view is that running new software in production inevitably results in incidents. The trick:

1. Make sure the incidents aren’t too serious (by making ‘production’ a less critical system first)
2. Whenever there’s an incident (even if it’s not that serious!!!), spend the time necessary to understand exactly why it happened and how to make sure it doesn’t happen again

My experience so far has been that it’s actually relatively possible to pull off “have every incident only once”. When we investigate issues and implement remediations, usually that issue never comes back. The remediation can either be:

• a configuration change
• reporting a bug upstream and either fixing it ourselves or waiting for a fix
• a workaround (“this software doesn’t work with 10,000 clients? ok, we just won’t use it in cases where there are that many clients for now!”, “oh, a memory leak? let’s just restart it every hour”)

Knowledge-sharing is really important here too – it’s always unfortunate when one person finds an incident in production, fixes it, but doesn’t explain the issue to the rest of the team so somebody else ends up causing the same incident again later because they didn’t hear about the original incident.

### Understand what is ok to break and isn’t

Another huge part of understanding the software I run in production is understanding which parts are OK to break (aka “if this breaks, it won’t result in a production incident”) and which aren’t. This lets me focus: I can put big boxes around some components and decide “ok, if this breaks it doesn’t matter, so I won’t pay super close attention to it”.

For example, with Kubernetes:

ok to break:

• any stateless control plane component can crash or be cycled out or go down for 5 minutes at any time. If we had 95% uptime for the kubernetes control plane that would probably be fine, it just needs to be working most of the time.
• kubernetes networking (the system where you give every pod an IP address) can break as much as it wants, because we decided not to use it to start with

not ok:

• for us, if etcd goes down for 10 minutes, that’s ok. If it goes down for 2 hours, it’s not
• containers not starting or crashing on startup (iam issues, docker not starting containers, bugs in the scheduler, bugs in other controllers) is serious and needs to be looked at immediately
• containers not having access to the resources they need (because of permissions issues, etc)
• pods being terminated unexpectedly by Kubernetes (if you configure kubernetes wrong it can terminate your pods!)

with Envoy, the breakdown is pretty different:

ok to break:

• if the envoy control plane goes down for 5 minutes, that’s fine (it’ll keep working with stale data)
• segfaults on startup due to configuration errors are sort of okay because they manifest so early and they’re unlikely to surprise us (if the segfault doesn’t happen the 1st time, it shouldn’t happen the 200th time)

not ok:

• Envoy crashes / segfaults are not good – if it crashes, network connections don’t happen
• if the control server serves incorrect or incomplete data that’s extremely dangerous and can result in serious production incidents. (so downtime is fine, but serving incorrect data is not!)

Neither of these lists is complete at all, but they’re examples of what I mean by “understand your software”.

### sharing ok to break / not ok lists is useful

I think these “ok to break” / “not ok” lists are really useful to share, because even if they’re not 100% the same for every user, the lessons are pretty hard won. I’d be curious to hear about your breakdown of what kinds of failures are ok / not ok for software you’re using!

Figuring out all the failure modes of a new piece of software and how they apply to your situation can take months. (This is why when you ask your database team “hey can we just use NEW DATABASE” they look at you in such a pained way.) So anything we can do to help other people learn faster is amazing.

### Tailwind: style your site without writing any CSS!

Hello! Over the last couple of days I put together a new website for my zines (https://wizardzines.com). To make this website, I needed to write HTML and CSS. Eep!!

Web design really isn’t my strong suit. I’ve been writing mediocre HTML/CSS for probably like 12 years now, and since I don’t do it at all in my job and am making no efforts to improve, the chances of my mediocre CSS skills magically improving are… not good.

But! I want to make websites sometimes, and it’s 2018! All websites need to be responsive! So even if I make a pretty minimalist site, it does need to at least sort of work on phones and tablets and desktops with lots of different screen sizes. I know about CSS and flexboxes and media queries, but in practice putting all of those things together is usually a huge pain.

I ended up making this site with Tailwind CSS, and it helped me make a site I felt pretty happy with – with my minimal CSS skills and just 2 evenings of work!

The Tailwind author wrote a blog post called CSS Utility Classes and “Separation of Concerns” which you should very possibly read instead of this :).

Until yesterday, what I believed about writing good CSS was living in about 2003 with the CSS zen garden. The CSS zen garden was (and is! it’s still up!) this site which was like “hey everyone!! you can use CSS to style your websites instead of HTML tables! Just write nice semantic HTML and then you can accomplish anything you need to do with CSS! This is amazing!” They show it off by providing lots of different designs for the site, which all use exactly the same HTML. It’s a really fun & creative thing and it obviously made an impression because I remember it like 10 years later.

And it makes sense! The idea that you should write semantic HTML, kind of like this:

<div class="zen-resources" id="zen-resources">
  <h3 class="resources">Resources:</h3>
</div>


and then style those classes.

### writing CSS is not actually working for me

Even though I believe in this CSS zen garden semantic HTML ideal, I feel like writing CSS is not actually really working for me personally. I know some CSS basics – I know font-size and align and min-height and can even sort of use flexboxes and CSS grid. I can mostly center things. I made https://rbspy.github.io/ responsive by writing CSS.

But I only write CSS probably every 4 months or something, and only for tiny personal sites, and in practice I always end up with some media query problem, sadly googling “how do I center div” for the 500th time. And everything ends up kind of poorly aligned, and eventually I get something that sort of works and hide under the bed.

### CSS frameworks where you don’t write CSS

So! There’s this interesting thing that has happened where now there are CSS frameworks where you don’t actually write any CSS at all to use them! Instead, you just add lots of CSS classes to each element to style it. It’s basically the opposite of the CSS zen garden – you have a single CSS file that you don’t change, and then you use 10 billion classes in your HTML to style your site.

Here’s an example from https://wizardzines.com/zines/manager/. This snippet puts images of the cover and the table of contents side by side.

<div class="flex flex-row flex-wrap justify-center">
<div class="md:w-1/2 md:pr-4">
<img src='cover.png'>
</div>

<div class="md:w-1/2">
<a class="outline-none" href='/zines/manager/toc.png'>
<img src='toc.png'>
</a>
</div>
</div>


Basically the outside div is a flexbox – flex means display: flex, flex-row means flex-direction: row, etc. Most (all?) of the classes apply exactly 1 line of CSS.

### why this zine?

I’ve thought for a couple of years that it might be fun to write a git zine, but I had NO IDEA how to do it. I was in this weird place with git where, even though I know that git is really confusing, I felt like I’d forgotten what it was like to be confused/scared by Git. And I write most things from a place of “I was super confused by this thing just recently, let me explain it!!”.

But then!! I saw that Katie Sylor-Miller had made this delightful website called oh shit, git! explaining how to get out of common git mishaps. I thought this was really brilliant because a lot of the things on that site (“oh shit, i committed to the wrong branch!“) are things I remember being really scary when I was less comfortable with git!

So I thought, maybe this could be useful for folks to have as a paper reference! Maybe we could make a zine out of it! So I emailed her and she agreed to work with me. And now here it is! :D. Very excited to have done a first collaboration.

### what’s new in the oh shit, git! zine?

The zine isn’t the same as the website – we decided we wanted to add some fundamental information about how Git works (what’s a commit?), because to really work with Git effectively you need to understand at least a little bit about how commits and branches work! And some of the explanations are improved. Probably about 50% of the material in the zine is from the website and 50% is new.

### a couple of example pages

Here are a couple of example pages, to give you an idea of what’s in the zine:

and a page on git reflog:

### that might be it for zines in 2018!

I’m not sure, but I don’t think I’ll write any more zines for a couple of months. So far there have been 5 (!!!) this year – perf, bite size linux, bite size command line, help! I have a manager!, and this one! I’m really happy with that number and very grateful to everyone who’s supported them.

ideas I have for zines right now include:

• kubernetes
• how to do statistics using programming
• ‘bite size networking’, on the 10 billion different command line tools used for different networking things
• ‘bite size linux v2’, about more core linux concepts that i didn’t get to in ‘bite size linux’

There’s a definite tradeoff between writing zines and blogging, and writing blog posts is really fun. Maybe I’ll try going back in that direction for a little.

### Some Envoy basics

Envoy is a newish network proxy/webserver in the same universe as HAProxy and nginx. When I first learned about it around last fall, I was pretty confused by it.

There are a few kinds of questions one might have about any piece of software:

• how do you use it?
• why is it useful?
• how does it work internally?

I’m going to spend most of my time in this post on “how do you use it?”, because I found a lot of the basics about how to configure Envoy very confusing when I started. I’ll explain some of the Envoy jargon that I was initially confused by (what’s an SDS? XDS? CDS? EDS? ADS? filter? cluster? listener? help!)

There will also be a little bit of “why is it useful?” and nothing at all about the internals.

### What’s Envoy?

Envoy is a network proxy. You compile it, you put it on the server you want to use it on, you tell it which configuration file to use, and away you go!

Here’s probably the simplest possible example of using Envoy. The configuration file is a gist. This example starts a webserver on port 7777 that proxies to another HTTP server on port 8000.

If you have Docker, you can try it now – just download the configuration, start the Envoy docker image, and away you go!

python -mSimpleHTTPServer & # Start a HTTP server on port 8000
wget https://gist.githubusercontent.com/jvns/340e4d20c83b16576c02efc08487ed54/raw/1ddc3038ed11c31ddc70be038fd23dddfa13f5d3/envoy_config.json
docker run --rm --net host -v=$PWD:/config envoyproxy/envoy /usr/local/bin/envoy -c /config/envoy_config.json

This will start an Envoy HTTP server, and then you can make a request to Envoy! Just curl localhost:7777 and it’ll proxy the request to localhost:8000.

### Envoy basic concepts: clusters, listeners, routes, and filters

This tiny envoy_config.json we just ran contains all the basic Envoy concepts!

First, there’s a listener. This tells Envoy to bind to a port, in this case 7777:

"listeners": [{
  "address": {
    "socket_address": { "address": "127.0.0.1", "port_value": 7777 }
  }

Next up, the listener has filters. Filters tell the listener what to do with the requests it receives, and you give Envoy an array of filters. If you’re doing something complicated, typically you’ll apply several filters to every request coming in. There are a few different kinds of filters (see the list of TCP filters), but the most important filter is probably the envoy.http_connection_manager filter, which is used for proxying HTTP requests. The HTTP connection manager has a further list of HTTP filters that it applies (see the list of HTTP filters). The most important of those is the envoy.router filter, which routes requests to the right backend.

In our example, here’s how we’ve configured our filters. There’s one TCP filter (envoy.http_connection_manager) which uses 1 HTTP filter (envoy.router):

"filters": [
  {
    "name": "envoy.http_connection_manager",
    "config": {
      "stat_prefix": "ingress_http",
      "http_filters": [{
        "name": "envoy.router",
        "config": {}
      }],
      ....

Next, let’s talk about routes. You’ll notice that so far we haven’t explained to the envoy.router filter what to do with the requests it receives. Where should it proxy them? What paths should it match? In our case, the answer to that question is going to be “proxy all requests to localhost:8000”. The envoy.router filter is configured with an array of routes. Here’s how they’re configured in our test configuration.
In our case there’s just one route:

"route_config": {
  "virtual_hosts": [
    {
      "name": "blah",
      "domains": "*",
      "routes": [
        {
          "match": { "prefix": "/" },
          "route": { "cluster": "banana" }

This gives a list of domains to match (these are matched against the request’s Host header). If we changed "domains": "*" to "domains": "my.cool.service", then we’d need to pass the header Host: my.cool.service to get a response.

If you’re paying attention to the ongoing saga of this configuration, you’ll notice that port 8000 hasn’t been mentioned anywhere. There’s just "cluster": "banana". What’s a cluster? Well, a cluster is a collection of addresses (IP address / port pairs) that are the backends for a service. For example, if you have 8 machines running an HTTP service, then you might have 8 hosts in your cluster. Every service needs its own cluster. This example cluster is really simple: it’s just a single IP/port, running on localhost.

"clusters": [
  {
    "name": "banana",
    "type": "STRICT_DNS",
    "connect_timeout": "1s",
    "hosts": [
      { "socket_address": { "address": "127.0.0.1", "port_value": 8000 } }
    ]
  }
]

### tips for writing Envoy configuration by hand

I find writing Envoy configurations from scratch pretty time consuming – there are some examples in the Envoy repository (https://github.com/envoyproxy/envoy), but even after using Envoy for a year this basic configuration actually took me 45 minutes to get right. Here are a few tips:

• Envoy has 2 different APIs: the v1 and the v2 API. Many newer features are only available in the v2 API, and I find its documentation a little easier to navigate because it’s automatically generated from protocol buffers (eg the Cluster docs are generated from cds.proto).
• A few good starting points in the Envoy API docs: Listener, Cluster, Filter, Virtual Host.
• To get all the information you need, you need to click a lot (for example, to see how to configure the cluster for a route you need to start at “Virtual Host” and click route_config -> virtual_hosts -> routes -> route -> cluster), but it works.
• The architecture overview docs are useful and give an overall explanation of how some Envoy things are configured.
• You can use either JSON or YAML to configure Envoy. Above I’ve used JSON.

### You can configure Envoy with a server

Even though we started with a configuration file on disk, one thing that makes Envoy really different from HAProxy or nginx is that Envoy often isn’t configured with a configuration file. Instead, you can configure Envoy with one or several configuration servers which dynamically change your configuration.

To get an idea of why this might be useful: imagine that you’re using Envoy to load balance requests to 50ish backend servers, which are EC2 instances that you periodically rotate out. So http://your-website.com requests go to Envoy, and get routed to an Envoy cluster, which needs to be a list of the 50 IP addresses and ports of those servers.

But what if those servers change over time? Maybe you’re launching new ones or they’re getting terminated. You could handle this by periodically changing the Envoy configuration file and restarting Envoy. Or!! You could set up a “cluster discovery service” (or “CDS”), which for example could query the AWS API and return all the IPs of your backend servers to Envoy.

I’m not going to get into the details of how to configure a discovery service, but basically it looks like this (from this template). You tell it how often to refresh and what the address of the server is.

```
dynamic_resources:
  cds_config:
    api_config_source:
      cluster_names:
        - cds_cluster
      refresh_delay: 30s
...
```
```
- name: cds_cluster
  connect_timeout: 0.25s
  type: STRICT_DNS
  lb_policy: ROUND_ROBIN
  hosts:
    - socket_address:
        protocol: TCP
        address: cds.yourcompany.net
        port_value: 80
```

### 4 kinds of Envoy discovery services

There are 4 kinds of resources you can set up discovery services for in Envoy: routes (“what cluster should requests with this HTTP header go to?”), clusters (“what backends does this service have?”), listeners (the filters for a port), and endpoints. These are called RDS, CDS, LDS, and EDS respectively. xDS is the overall protocol. The easiest way to write a discovery service from scratch is probably in Go, using the go-control-plane library.

### some Envoy discovery services

It’s definitely possible to write Envoy configuration services from scratch, but there are some other open source projects that implement Envoy discovery services. Here are the ones I know about, though I’m sure there are more:

• There’s an open source Envoy discovery service called rotor which looks interesting. (The company that built it just shut down a couple weeks ago.)
• Istio (as far as I understand it) is basically an Envoy discovery service that uses information from the Kubernetes API (eg the services in your cluster) to configure Envoy clusters/routes. It has its own configuration language.
• consul might be adding support for Envoy (see this blog post), though I don’t fully understand the status there.

### what’s a service mesh?

Another term that I hear a lot is “service mesh”. Basically, a “service mesh” is where you install Envoy on the same machine as every one of your applications, and proxy all your network requests through Envoy. This lets you more easily control how a bunch of different applications (maybe written in different programming languages) communicate with each other.

### why is Envoy interesting?

I think these discovery services are really the exciting thing about Envoy.
If all of your network traffic is proxied through Envoy and you control all Envoy configuration from a central server, then you can potentially:

• use circuit breaking
• route requests only to nearby instances
• encrypt network traffic end-to-end
• run controlled code rollouts (want to send only 20% of traffic to the new server you spun up? okay!)

all without having to change any application code anywhere. Basically it’s a very powerful/flexible decentralized load balancer.

Obviously, setting up a bunch of discovery services, operating them, and using them to configure your internal network infrastructure in complicated ways is a lot more work than just “write an nginx configuration file and leave it alone”, and it’s probably more complexity than is appropriate for most people. I’m not going to venture into telling you who should or should not use Envoy, but my experience has been that, like Kubernetes, it’s both very powerful and very complicated.

### other exciting things about Envoy: timeout headers and metrics

One of the things I really like about Envoy is that you can pass it an HTTP header to tell it how to retry/time out your requests!! This is amazing because implementing timeout/retry logic correctly works differently in every programming language and people get it wrong ALL THE TIME. So being able to just pass a header is great. The timeout & retry headers are documented here, and here are my favourites:

• x-envoy-max-retries: how many times to retry
• x-envoy-retry-on: which failures to retry (eg 5xx or connect-failure)
• x-envoy-upstream-rq-timeout-ms: total timeout
• x-envoy-upstream-rq-per-try-timeout-ms: timeout per retry

### that’s all for now

I have a lot of thoughts about Envoy (too many to write in one blog post!), so maybe I’ll say more later!

### What's a senior engineer's job?

There’s this great post by John Allspaw called “On being a senior engineer”.
I originally read it 4ish years ago when I started my current job and it really influenced how I thought about the direction I wanted to go in.

Rereading it 4 years later, one thing that’s really interesting to me about that blog post is that it explains that empathy / helping your team succeed is an important part of being a senior engineer. Which of course is true! But from where I stand today, most (all?) of the senior engineers I know take on a significant amount of helping-other-people work in addition to their individual programming work. The challenge I see me/my coworkers struggling with today isn’t so much “what?? I have to TALK TO PEOPLE?? UNBELIEVABLE.” and more “wait, how do I balance all of this leadership work with my individual contributions / programming work in a way that’s sustainable for me? How much of what kind of work should I be doing?”.

So instead of talking about the attributes that a senior engineer has from Allspaw’s post (which I totally agree with), I want to talk here about the work that a senior engineer does.

### what this post is describing

“what a senior engineer does” is a huge topic and this is a small post. things to keep in mind when reading:

• this is just one possible description of what a “senior engineer” could do. There are a lot of ways to work and this isn’t intended to be definitive.
• I have basically only worked at one company and this is just about my experiences, so my perspective is obviously pretty limited
• There are obviously a lot of levels of “senior engineer” out there. This is aimed somewhere around P3/P4 in the Mozilla ladder (senior engineer / staff engineer), maybe a bit more on the “staff” side.

### What’s part of the job

These are things that I view as being mostly a senior engineer’s job and less a manager’s job.
(though managers definitely do some of this too, especially creating new projects / relating projects to business priorities)

The thing that holds all this together is that almost all of this work is fundamentally technical: helping someone get unstuck on a tricky project is obviously a human interaction, but the issues we’ll be working on together will generally be computer issues! (“maybe if we simplify this design we can be done with this way sooner!”)

• Write code. (obviously)
• Do code reviews. (obviously)
• Write and review design docs. As with other review tasks, I think of “review design docs” as “get a second set of eyes on it, which will probably help improve the design”.
• Help team members when they’re stuck. Sometimes folks get stuck on a project, and it’s important to work to support them! I think of this less as “parachute from the sky and deliver your magical knowledge to people” and more as “work together to understand the problem they’re trying to solve and see if 2 brains are better than 1” :). This also means working with someone to solve the problem instead of solving the problem for them.
• Hold folks to a high quality standard. “Quality” will mean different things for different folks (for my team it means reliability/security/usability). Usually when someone makes a decision that seems off to me, it’s either because I know something that they don’t or they know something I don’t! So instead of telling someone “hey you did this wrong you should do X instead”, I try to just give them some extra information that they didn’t have and often that sorts it out. And pretty often it turns out that I was missing something and actually their decision was totally reasonable! In the past I’ve very occasionally seen senior engineers try to enforce quality standards by repeating their opinions more and more loudly because they think their opinions are Right, and I haven’t personally found that helpful.
• Create new projects.
A software engineering team isn’t a zero-sum place! The best engineers I know don’t hoard the most interesting work for themselves; they create new interesting/important work and create space for folks to do that work. For example, someone on my team spearheaded a rewrite of our deployment system which was super successful, and now there’s a whole team working on new features that are way easier to build post-rewrite!
• Plan your projects’ work. This is about writing down / communicating the roadmap for projects you’re working on and making sure that folks understand the plan.
• Proactively communicate project risks. It’s really important to recognize when something you’re working on isn’t going well, communicate it to other engineers/managers, and figure out what to do.
• Communicate successes!
• Do side projects that benefit the team/company. I see a lot of senior engineers occasionally doing small high-leverage projects (like building dev tooling / helping set policies) that end up helping a LOT of people get their work done a lot better.
• Be aware of how projects relate to business priorities.
• Decide when to stop doing a project. Figuring out when to stop / not start work on something is surprisingly hard :)

I put “write code” first because I find it surprisingly easy to accidentally let that take a back seat :)

One thing I left out is “make estimates”. Making estimates is something I’m still not very good at and that I don’t think I see very much of (?), but I think it could be worth spending more time on some day.

This list feels like a lot – like if you tried to do all those things all the time it would consume all available brain space. I think in general it probably makes sense to carve out a subset and decide “right now I’m going to focus on X Y Z, I think my brain will explode if I try to do A B C as well”.

### What’s not part of the job

This section is a bit tricky.
I’m not saying that these aren’t a senior engineer’s job in the sense of “I won’t help create a good work environment on my team, how dare you suggest that’s part of my job!!”. Most senior engineers I know have spent a huge amount of time thinking about these issues and work on them quite a bit.

The reason I think it’s useful to create a boundary here is that everyone I work with has a really strong sense of ownership/responsibility to the team/company (“does it need to be done? well, sure, I can do that!!”), and I think it’s easy for that willingness to do whatever needs to happen to turn into folks getting overwhelmed/overworked/unable to make the kinds of technical contributions that are actually their core job. So if you can create some boundaries around your role, it’s easier to decide what sorts of work to ask for help with when things are hectic. The actual boundary you draw of course depends on you / your team :)

Most of these are a manager’s job. Caveats: managers do a lot more than the things listed here (for instance “create new projects”), and at some companies some of these things might actually be the job of a senior engineer (eg sprint management).

• Make sure every team member’s work is recognized
• Make sure work is allocated in a fair way
• Make sure folks are working well together
• Build team cohesion
• Have 1:1s with everyone on the team
• Train new managers / help them understand what’s expected of them (though I think senior ICs often actually do end up picking some of this up?)
• Do project management for projects you’re not working on (where I work, that’s the job of whichever engineer is leading that project)
• Be a product manager
• Do sprint management / organize everyone’s work into milestones / run weekly team meetings

### Explicitly setting boundaries is useful

I ran into an interesting situation recently where I was talking to a manager about which things were and weren’t part of my job as an engineer, and we realized that we had very different expectations! We talked about it and I think it’s sorted out now, but it made me realize that it’s very important to agree on what the expectations are :)

When I started out as an engineer, my job was pretty straightforward – I wrote code, tried to come up with projects that made sense, and that was fine. My manager always had a clear sense of what my job was and it wasn’t too complicated. Now that’s less true! So now I view it as being more my responsibility to define a job that:

• I can do / is sustainable for me
• I want to do / is overall enjoyable & in line with my personal goals
• is valuable to the team/organization

And the exact shape of that job will be different for different people (not everyone has the same interests & strengths – for example, I am actually not amazing at code review yet!), which I think makes it even more important to negotiate it / do expectation setting.

### Don’t agree to a job you can’t do / don’t want

I think pushing back if I’m asked to do work that I can’t do or that I think will make me unhappy long term is important! I find it kind of tempting to agree to take on a lot of work that I know I don’t really enjoy (“oh, it’s good for the team!”, “well someone needs to do it!”). But, while I obviously sometimes take on tasks just because they need to be done, I think it’s actually really important for team health for folks to be overall doing jobs that are sustainable for them and that they overall enjoy.
So I’ll take on small tasks that just need to get done, but I think it’s important for me not to say “oh sure, I’ll spend a large fraction of my time doing this thing that I’m bad at and that I dislike, no problem” :). And if “someone” needs to do it, maybe that just means we need to hire/train someone new to fill the gap :)

### I still have a lot to learn!

While I feel like I’m starting to understand what this “senior engineer” thing is all about (7 years into my career so far), I still feel like I have a LOT to learn about it and I’d be interested to hear how other people define the boundaries of their job!

### Some possible career goals

I was thinking about career goals a person could have (as a software developer) this morning, and it occurred to me that there are a lot of possible goals! So I asked folks on Twitter what some possible goals were and got a lot of answers.

This list intentionally has big goals and small goals, and goals in very different directions. It definitely does not attempt to tell you what sorts of goals you should have. I’m not sure yet whether it’s helpful or not but here it is just in case :)

I’ve separated them into some very rough categories. Also I feel like there’s a lot missing from this list still, and I’d be happy to hear what’s missing on twitter.
### technical goals

• become an expert in a domain/technology/language (databases, machine learning, Python)
• get to a point where you can drop into new situations or technologies and quickly start making a big impact
• do research-y work / something that’s never been done before
• satisfy your intellectual curiosity about something
• get comfortable with really big codebases
• work on a system that has X scale/complexity (millions of requests per second, etc)
• scale a project way past its original design goals
• do work that saves the company a large amount of money
• be an incident commander for an incident and run the postmortem
• make a contribution to an open source project
• get better at some skill (testing / debugging / a programming language / machine learning)
• become a core maintainer for an important OSS project
• build an important system from scratch
• be involved with a product/project from start to end (over several years)
• understand how complex systems fail (and how to make them not fail)
• be able to build prototypes quickly for new ideas

### job goals

• get your first job
• pass a programming interview
• get your “dream job” (if you have one)
• work at a prestigious company
• work at a very small company
• work at a company for a really long time (to see how things play out over time)
• work at lots of different companies (to get lots of different perspectives)
• get a raise
• become a manager
• get to a specific title (“architect”, “senior engineer”, “CTO”, “developer evangelist”, “principal engineer”)
• work at a nonprofit / company where you believe in the mission
• work on a product that your family / friends would recognize
• work in many different fields
• work in a specific field you care about (transit, security, government)
• get paid to work on a specific project (eg the linux kernel)
• as an academic, have stable funding to work towards your research interests
• become a baker / work on something else entirely :)

### entrepreneurship goals

This category is obviously pretty big (there are lots of start-your-own-business related goals!) and I’m not going to try to be exhaustive.

• start freelancing
• start a consulting company
• make your first sale of software you wrote
• get VC funding / start a startup
• get to X milestone with a company you started

### product goals

I think the difference between “technical goals” and “product goals” is pretty interesting – this area is more about the impact that your programs have on the people who use them than about what those programs consist of technically.

• do your work in a specific way that you care about (eg make websites that are accessible)
• build tools for people who you work with directly (this can be so fun!!)
• make a big difference to a system you care about (eg “internet security”)
• do work that helps solve an important problem (climate change, etc)
• work on a team/project whose product affects more than a million people
• work on a product that people love
• build developer tools

### people/leadership goals

• help new people on your team get started
• help someone get a job/opportunity that they wouldn’t have had otherwise
• mentor someone and see them get better over time
• “be a blessing to others you wished someone else was to you”
• be a union organizer / promote fairness at work
• build a more inclusive team
• build a community that matters to people (via a meetup group or otherwise)

### communication / community goals

• write a technical book
• give a talk (meetup, conference talk, keynote)
• give a talk at a really prestigious conference / in front of people you respect
• give a workshop on something you know really well
• start a conference
• write a popular blog / an article that gets upvoted a lot
• teach a class (eg at a high school / college)
• change the way folks in the industry think about something (eg blameless postmortems, fairness in machine learning)

### work environment goals

A lot of people talked about the flexibility to choose their own work environment / hours (eg “work remotely”).

• get flexible hours
• work remotely
• get your own office
• work in a place where you feel accepted/included
• work with people who share your values (this involves knowing what your values are! :) )
• work with people who are very experienced / skilled
• have good health insurance / benefits
• make X amount of money

### other goals

• remain as curious and in love with programming as the first time I did it

### nobody can tell you what your goals are

This post came out of reading this blog post about how your company’s career ladder is probably not the same as your goals, and how chasing the next promotion may not be the best way to achieve them.

I’ve been lucky enough to have a lot of my basic goals met (“make money”, “learn a lot of things at work”, “work with kind and very competent people”), and after that I’ve found it hard to figure out which of all of these milestones will actually feel meaningful to me! Sometimes I will achieve a new goal and find that it doesn’t feel very satisfying to have done it. And other times I will do something that I didn’t think was a huge deal to me, but feel really proud of it afterwards.

So it feels pretty useful to me to write down these things and think “do I really want to work at FANCY_COMPANY? would that feel good? do I care about working at a nonprofit? do I want to learn how to build software products that lots of people use? do I want to work on an application that serves a million requests per second? When I accomplished that goal in the past, did it actually feel meaningful, or did I not really care?”

### Why sell zines?

Hello! As you may have noticed, I’ve been writing a few new zines (they’re all at https://jvns.ca/zines ), and while my zines used to be free (or pay-for-early-access-then-free), the new ones are not free! They cost $10!

In this post, I want to talk a little about why I made the switch and how it’s been going so far.

### selling your work is okay

I wanted to start out by saying something sort of obvious – if you decide to sell your work instead of giving it away for free, you don’t need to justify that (why would you?). Since I’ve started selling my zines, exactly 0 people have told me “julia, how dare you sell your work”, and a lot of people have said “your work is amazing and I’m happy to pay for it! This is great!”

But I still want to talk about this because it’s been a pretty confusing tradeoff for me to think through (what are my goals? does giving things away for free or selling them accomplish my goals better?)

### what are my goals?

I don’t have a super clear set of goals with my blog / zines, but here are a few:

• expose people to new important ideas that they might never have heard of otherwise. I think in systems a lot of knowledge can be hard to get if you don’t know the right people, I think that’s very silly, and I’d like to make a small dent in that.
• explain complicated ideas in the simplest possible way (but not simpler!!!). A lot of things that seem complicated at first actually aren’t really, and I want to show people that.

### free work is easier to distribute

The most obvious advantage is that if something is free, it’s way easier for more people to access it and learn from it. For me, this is the biggest thing – I care about the impact of my writing (writing just for myself is useful, but ideally I’d like for it to help lots of people!)

A really good example of this is this article Open Access for Impact: How Michael Nielsen Reached 3.5M Readers about Michael Nielsen’s book Neural Networks and Deep Learning. 3.5M readers is probably an overestimate, but he says:

total time spent by readers is about 250,000 hours, or roughly 125 full time working years.

That’s a lot! This was the biggest reason I held off selling zines for a long time – I worried that if I sold my zines, not that many people would buy them relative to how many folks would download the free versions.

### selling zines makes it easier to spend money (and time) on it

A huge advantage of selling zines, though, is that it makes it way easier to invest in making something that’s high-quality. I’ve spent probably $5000 on tablets / printing / software / illustrators to make zines. Since I’ve made substantially more than$5000 at this point (!!!), investing in things like that is now a really easy decision! I can hire super talented illustrators and pay them a fair amount and not worry about it!

I decided earlier this year to buy an iPad (which has made drawing zines SO MUCH EASIER for me, the apple pencil is amaaazing), and instead of thinking “oh no, this is kind of expensive, should I really spend money on it?” I could just reason “this is a tool that will more than pay for itself! I should just buy it!“.

Also, the fact that I’m making money from it makes it way easier to spend time on the project – any given zine takes me weeks of evenings/weekends to make, and carving that time out of my schedule isn’t always easy! If I’m getting paid for it it makes it way easier to stay motivated to make something awesome instead of producing something kinda half-baked.

### people take things they pay for more seriously

Another reason I’m excited about selling zines is that I feel like, since I’ve started doing it and investing a little more into the quality, people have taken the project a little more seriously!

• “bite size linux” is a required text in a university course! This is extremely delightful.
• a bunch of folks who work at various companies have bought zines to give to their coworkers/employees!

I think “this costs money” is a nice way to signal “I actually spent time on this, this is good”.

### people are actually willing to buy zines

At the beginning I said that I was worried that if I sold zines, nobody would buy them, and so nobody would learn from them, and that would be awful. Was that worry justified? Well, I actually have a little bit of data about this!! The only thing I use statistics for on this website is how many people download my zines (I run heap on https://jvns.ca/zines). Here are some stats:

• my most-downloaded zine is “so you want to be a wizard” with 5,000 clicks
• my most-bought zine is “bite size linux” with 3,000 sales (!!!)

3,000 sales is incredible (thank you everyone!!!!) and I’ve been totally blown away by how many people have bought these zines.

So it actually seems like selling zines results in more people reading them – to me, 3,000 sales is WAY BETTER than 5,000 clicks, because I think that someone who bought a zine is probably about 4x more likely to read it than someone who just clicked on a PDF. (4x being a Totally Unscientific Arbitrary Number.)
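A back-of-the-envelope version of that comparison (the “read probability” framing is mine; the numbers and the 4x multiplier are from above):

```python
clicks = 5000   # downloads of the most-downloaded free zine
sales = 3000    # sales of the most-bought paid zine
multiplier = 4  # the Totally Unscientific Arbitrary Number

# If a free download gets read with some probability p, and a purchased
# zine with probability 4p, compare expected readers in units of p:
free_readers = clicks * 1         # 5000 "click-equivalents"
paid_readers = sales * multiplier # 12000 "click-equivalents"
assert paid_readers > free_readers
```

Under that (totally unscientific) assumption, the paid zine comes out well ahead even with fewer total downloads.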

### how do you decide on pricing?

PRICING. EEP. GUYS.

I find thinking about pricing SO CONFUSING. There’s this “charge more” narrative I see a lot on the internet which basically goes:

• tie whatever you’re selling to someone else’s business outcomes
• charge them relative to how much money the product can help them make, not relative to how hard it was to build

I think this is a reasonable model and it’s how things like this guide to rails performance are priced.

This is not really how I’ve been thinking about it, though – my approach right now is just to charge what I think is a reasonable/fair price, which is $10/zine.

I had a super interesting conversation with Stephanie Hurlburt, though, where she argued that I should be charging more, for different reasons! Her argument was:

• We want to build a world where artists/educators can get paid fairly for their work
• $10/zine is not actually a lot of money; it’s only sustainable for julia because julia has a big audience
• if I could figure out how to charge more, I could share that with other people and make a world where smaller creators could be more successful

I find that argument pretty compelling (I would like more people to be able to make money from selling zines!). But I don’t have any plans to charge more for individual zines than $10/zine because$10 just seems like a reasonable price to me and I know that it’s already too much for some folks, especially people in countries where their currency is a lot weaker than the US dollar.

### experimenting with corporate pricing

While I’m pretty reluctant to do experiments with the $10/zine price for individual people, experimenting with corporate pricing is a lot easier! Folks generally aren’t spending their own money, so if I raise the prices for a company to buy a zine, maybe they won’t buy it if they decide it’s too much, but it’s a lot less personal and doesn’t affect someone’s ability to read the zines in the same way.

Right now, companies buy zines from me for 2 reasons:

1. to give them to their employees to teach folks useful things (I charge somewhere between $100 and $600 for a site license right now)
2. to distribute them at conferences/other events (eg Microsoft gave out zines/posters by me at a couple of conferences this year). I’ve only just started doing this but it seems like a super fun way to get more zines into the world!

I have been doing some corporate pricing experiments – for Help! I have a manager! I raised the minimum price to $150 because I think it’s pretty valuable to help folks work better with their managers. We’ll see what happens!
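One hypothetical way to sanity-check that price range (the break-even framing is mine; the prices are the ones mentioned above): a site license beats individual copies once enough employees will actually read the zine.

```python
# Back-of-the-envelope, hypothetical: at what headcount does a site license
# cost less per reader than buying individual $10 copies?
individual_price = 10
site_license_min, site_license_max = 100, 600  # current site license range

break_even_min = site_license_min / individual_price  # readers needed at $100
break_even_max = site_license_max / individual_price  # readers needed at $600
```

So at the low end a license pays for itself with about 10 readers, and even at the top of the range with about 60 – a pretty easy sell for a mid-sized engineering org.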

### why not patreon?

As a sidebar – a lot of folks have suggested that I use Patreon. Right now I definitely do not want to use Patreon/other donation-based models for various reasons (though I support creators on Patreon and I think it’s great!). I don’t want to get into it in this post but maybe I’ll talk about this another time!

Basically, to me the model of “pay $10 for a zine” is super simple, I like it, and I have no desire to switch to Patreon :)

### a tradeoff between free & paid: post drafts on Twitter

What I’m doing right now is: I post drafts of almost everything I write in my zines on Twitter. This works really well for a lot of reasons:

• I get really early feedback on whether something is working or not – folks will suggest a lot of great improvements in the Twitter replies!
• I get to see what’s resonating with folks – for example, this comic about 1:1s got 2.5K retweets, which is a lot! Knowing that folks found that page really useful helped me decide where to put it in the zine (near the beginning!)
• people who maybe can’t afford $10 for the zine can follow along on Twitter and get all the information anyway
• obviously it’s great advertising – if people like the comics I tweet, they might decide to buy the zine later! :) And if they want to just enjoy the tweets that’s awesome too ❤

As an example, most of the pages from Help! I have a manager! are in this twitter moment.

### a few things that haven’t gone well

Not everything has been 100% amazing with selling zines on the internet! A couple of things that haven’t gone well:

• some people don’t have credit cards / PayPal and so can’t get the zine! I would really really like a good solution to this.
• Gumroad doesn’t have great email deliverability – sometimes when someone buys a zine it’ll end up in their spam. This is pretty easy to resolve (people email me to say that they didn’t get it, and it’s always easy to fix right away), but I wish they were better at this. Otherwise Gumroad is a good platform.
• On my first zine, I didn’t put my email address on the Gumroad page, so some people didn’t know how to get in touch with me when there was a problem and one person opened a dispute. Now I put my email address on Gumroad which I think has fixed that!
• I sent an update email on Gumroad to past zine buyers saying that I had a new zine out, and one person replied to say that they didn’t like being emailed. I think there’s a little room to improve here – the fact that Gumroad auto-enrolls everyone who buys a zine into an “updates” email list is IMO a bit weird, and it feels like it would be better if it were opt-in.
• Someone posted my blog post announcing a new zine to lobste.rs and folks commented that they didn’t think it was appropriate to post non-free things on lobste.rs. I agree with that, but it seems hard to prevent since I can’t control what people post on tech news sites :). I think this isn’t a big deal but it didn’t feel great.

I’m sure I’ll make some more mistakes in the future and hopefully I’ll learn from them :). I wanted to post these because I worry a lot about making mistakes when selling things to folks, but once I write down the issues so far they all feel very resolvable. Mostly I just try to reply to email fast when folks have problems, which isn’t that often.

### let’s see how the experiment goes!

So far selling zines feels like:

• I end up with a comparable number of readers (I think there’s not a huge difference?)
• I can make something that’s higher quality (and pay more artists to help me!). It’s way easier to justify spending time on it.
• People take the work more seriously
• Folks have been really positive and supportive about it
• It’s maybe helping a tiny bit to build a world where more folks can get paid to write really awesome educational materials

I’m excited to try out some new things in the future (hopefully printing???). I’ll try to keep writing about what I learn as I go, because how to do this really hasn’t been obvious to me. I’d love to hear what you think!

### New zine: Help! I have a manager!

I just released a new zine! It’s called “Help! I have a manager!”

This zine is everything I wish somebody had told me when I started out in my career and had no idea how I was supposed to work with my manager. Basically I’ve learned along the way that even when I have a great manager, there are still a lot of things I can do to make sure that we work well together, mostly around communicating clearly! So this zine is about how to do that.

You can get it for $10 at https://gum.co/manager-zine. Here’s the cover and table of contents:

The cover art is by Deise Lino. Tons of people helped me write this zine – thanks to Allison, Brett, Jay, Kamal, Maggie, Marc, Marco, Maya, Will, and many others.

### a couple of my favorite pages from the zine

I’ve been posting pages from the zine on twitter as I’ve been working on it. Here are a couple that I think are especially useful – some tips for what even to talk about in 1:1s, and how to do better at asking for feedback.

### Build impossible programs

Hello! My talk from Deconstruct this year (“Build impossible programs”) is up. It’s about my experience building a Ruby profiler. This is the second talk I’ve given about building a profiler – the first one (Building a Ruby profiler) was more of a tech deep dive. This one is a squishier talk about myths I believed about doing ambitious work and how a lot of those myths turn out not to be true.

There’s a transcript on Deconstruct’s site. They’re also gradually putting up the other talks from Deconstruct 2018, which were generally excellent.

### video

### slides

As usual these days I drew the slides by hand. It’s way easier/faster, and it’s more fun.

### zine side note

One extremely awesome thing that happened at Deconstruct was that Gary agreed to print 2300 zines to give away to folks at the conference. They all got taken home which was really nice to see :)

### An awesome new Python profiler: py-spy!

The other day I learned that Ben Frederickson has written an awesome new Python profiler called py-spy! It takes a similar approach to profiling as rbspy, the profiler I worked on earlier this year – it can profile any running Python program, it uses process_vm_readv to read memory, and by default it displays profiling information in a really easy-to-use way. Obviously, I think this is SO COOL.
Here’s what it looks like profiling a Python program: (gif taken from the github README) It has this great top-like output by default. The default UI is somewhat similar to rbspy’s, but feels better executed to me :)

### you can install it with pip!

Another thing he’s done that’s really nice is make it installable with pip – you can run pip install py-spy and have it download a binary immediately! This is cool because, even though py-spy is a Rust program, obviously Python programmers are used to installing software with pip and not cargo. In the README he describes what he had to do to distribute a Rust executable with pip without requiring that users have a Rust compiler installed.

### py-spy probably is more stable than rbspy!

Another nice thing about py-spy is that I believe it only uses Python’s public bindings (eg Python.h). What I mean by “public bindings” is the header files you’d find in libpython-dev. rbspy by contrast uses a bunch of header files from inside the Ruby interpreter. This is because Python for whatever reason includes a lot more struct definitions in its header files. As a result, if you compare py-spy’s python bindings to rbspy’s ruby bindings, you’ll notice that

• there are way fewer Python binding files (6 vs 42 for Ruby)
• each file is much smaller (~30kb vs 200kb for Ruby)

Basically what I think this means is that py-spy is likely to be easier to maintain longterm than rbspy – since rbspy depends on unstable internal Ruby interfaces, even though it works relatively well today, future versions of Ruby could break it at any time.

### the start of an ecosystem of profilers in Rust?? :)

One thing that I think is super nice is that rbspy & py-spy share some code! There’s this proc-maps crate that Ben extracted from rbspy and improved substantially. I think this is awesome because if someone wants to make a py-spy/rbspy-like profiler in Rust for another language like Perl or Javascript or something, it’s even easier!
It turns out that phpspy is a sampling profiler for PHP, too! I have this secret dream that we could eventually have a suite of open source profilers for lots of different programming languages that all have similar user interfaces. Today every single profiling tool is different and it’s a pain.

### also rbspy has windows support now!

Ben also contributed Windows support to rbspy, which was amazing, and py-spy has Windows support from the start. So if you want to profile Ruby or Python programs on Windows, you can!

### Editing my blog's HTTP headers with Cloudflare workers

Hello! For the last 6 months, I’ve had a problem on this blog where every so often a page would show up like this: Instead of rendering the HTML, it would just display the HTML. Not all the time, just… sometimes.

I’ve gotten a lot of messages from readers with screenshots of this, and it’s no fun! People do not want to read raw HTML. I would like my pages to render! I finally (I think) have a solution to this, so I wanted to write up what I did.

### The mystery of the missing Content-Type header

It was clear basically the first time this happened that the reason was a missing HTTP Content-Type header. The Content-Type for HTML pages is supposed to be set to Content-Type: text/html; charset=UTF-8. You can see this header with curl -I:

```
$ curl -I https://jvns.ca/
HTTP/1.1 200 OK
Date: Mon, 03 Sep 2018 13:59:16 GMT
Content-Type: text/html; charset=UTF-8 <========= this one
Content-Length: 0
Connection: keep-alive
CF-Cache-Status: HIT
Cache-Control: public, max-age=3600
CF-RAY: 4548bc69fc6c3fb9-YUL
Expires: Mon, 03 Sep 2018 14:59:16 GMT
Last-Modified: Sun, 02 Sep 2018 14:21:53 GMT
Strict-Transport-Security: max-age=2592000
Vary: Accept-Encoding
Via: e4s
X-Content-Type-Options: nosniff
Server: cloudflare
```

But sometimes, that Content-Type header would be missing. Weird!!! The most confusing thing about this was that it happened very infrequently, and usually only on one page at a time, which made it a lot harder to debug.

I haven’t had too much energy to debug this because while I think debugging weird computer networking bugs is super fun, what I’ve been doing at work for the last while has been debugging computer networking bugs and so I’m not that motivated to do it at home too. So that’s why this has lasted for 6 months :)

### why is the Content-Type header missing?

So, why is the Content-Type header sometimes missing? I actually don’t know! My site is served by nearlyfreespeech.net and cached by Cloudflare, so it’s something in there somewhere. Either:

• my webhost is not serving a Content-Type header sometimes (which doesn’t make much sense)
• the CDN is deleting the Content-Type header (which makes even less sense)

This isn’t the first time something like this happened – in 2017, the Content-Encoding: gzip header mysteriously disappeared and I never found out why that was either. But! Even though I don’t know why this is happening and I have no visibility into it, I can still try to fix it!

### things I tried

Before talking about my latest solution that I think will work, here are some things that I tried that didn’t work:

• clearing my Cloudflare cache lots of times (this would temporarily fix the problem, but it would just crop up again later)
• upgrading to a new ‘realm’ on my webhost, in the hopes that there was a bad Apache server or something that I could move away from
• making sure <!DOCTYPE html> was at the beginning of all my HTML in case that helped browsers figure out that it was HTML (it didn’t)
• switching away from nearlyfreespeech’s “free beta bandwidth” program
• making a lot of curl requests to my webhost directly to see if I could reproduce it (I couldn’t)

None of these things worked. The most annoying thing about this issue is that I couldn’t reliably reproduce it, so it seemed hard to report to my web host (“hey, I have this problem periodically, but you can’t observe it, you just have to take my word that it happens, can you do something?”)
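Since the bug wouldn’t show up on demand, one way to at least gather evidence is to poll the page on a schedule and log whenever the header disappears. Here’s a rough sketch of what that could look like – the helper names are mine, and it assumes Node 18+ (for the global fetch):

```javascript
// Sketch: poll a page and record whenever the Content-Type header goes
// missing. missingContentType/checkOnce are hypothetical helper names.
function missingContentType(headers) {
  // headers: any iterable of [name, value] pairs (like response.headers)
  return ![...headers].some(([name]) => name.toLowerCase() === "content-type")
}

async function checkOnce(url) {
  const response = await fetch(url, { method: "HEAD" })
  if (missingContentType(response.headers)) {
    console.log(new Date().toISOString(), "missing Content-Type:", url)
  }
}

// e.g. check every 5 minutes:
// setInterval(() => checkOnce("https://jvns.ca/"), 5 * 60 * 1000)
```

A log of timestamps at least gives you something concrete to hand to the web host, even if you still can’t trigger the bug yourself.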

The most obvious things to try that I haven’t tried are:

• changing web hosts to S3 or Github Pages or something (changing web hosts is time consuming & annoying!)
• not using a CDN so that bad HTTP responses don’t get cached (for various reasons I want to keep using a CDN :) )

### what worked: Cloudflare workers

At some point someone at Cloudflare very kindly offered to help me with my weird problem, and suggested I use a new Cloudflare feature: cloudflare workers.

Basically Cloudflare Workers let you run a custom bit of Javascript on their servers for every HTTP request, to modify the HTTP response. It costs $5/month to get started. This is useful because I know that I want there always to be a Content-Type header. So if I can write some Javascript that modifies the response header if there’s no Content-Type header, I can fix this problem!!!

When I woke up this morning a bunch of folks had tweeted at me saying that this problem had cropped up again. I’d made an attempt at using Cloudflare workers in the past and not quite gotten it to work, but since I was able to see the problem on my laptop (a very good thing, if I wanted to test a fix!!), I decided to give it a shot again.

### my Javascript code

And this morning I got the Cloudflare workers working to fix the Content-Type header!!! So many tiny little robots making things better. Here’s my Javascript code! It basically just checks to see if the Content-Type header is missing and if so, creates a new different Response object which includes a Content-Type header. The reason I didn’t just modify the headers is that it turns out that you can’t modify the headers on a Response object, so I needed to create a new one. I also added a x-julia-test header for debugging purposes, so that I know that any response with x-julia-test got its Content-Type header edited.
```javascript
addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

/**
 * @param {Request} request
 */
async function handleRequest(request) {
  const response = await fetch(request)
  const content_type = response.headers.get("Content-Type")
  if (!content_type) {
    // Response headers are immutable, so copy them into a new Headers object
    var headers = new Headers();
    for (var kv of response.headers.entries()) {
      headers.append(kv[0], kv[1]);
    }
    const url = request.url
    console.log("Missing content type for url ", url)
    headers.set("Content-Type", get_content_type(url))
    headers.set("x-julia-test", "edited headers!")
    return new Response(response.body, {
      status: response.status,
      statusText: response.statusText,
      headers: headers,
    })
  }
  return response
}

function get_content_type(url) {
  if (url.endsWith(".svg")) {
    return "image/svg+xml"
  } else if (url.endsWith(".png")) {
    return "image/png"
  } else if (url.endsWith(".jpg")) {
    return "image/jpeg"
  } else if (url.endsWith(".css")) {
    return "text/css"
  } else if (url.endsWith(".pdf")) {
    return "application/pdf"
  } else {
    return "text/html; charset=UTF-8"
  }
}
```

Writing this Javascript was a pretty pleasant experience – they have what looks just like a Chrome console that you can use to run & preview your code.

### the results: it works!

Right now, Cloudflare’s cached version of https://jvns.ca/blog/2018/09/01/learning-skills-you-can-practice/ is missing its Content-Type header (though this will likely have changed by the time this post goes up :)). After installing the new Content-Type header and my test x-julia-test header, here’s what it looks like when I curl the website!

```
$ curl -I https://jvns.ca/blog/2018/09/01/learning-skills-you-can-practice/
HTTP/1.1 200 OK
Date: Mon, 03 Sep 2018 14:20:11 GMT
Content-Type: text/html; charset=UTF-8 <==== I added this one!
Connection: keep-alive
CF-Cache-Status: HIT
Cache-Control: public, max-age=3600
cf-ray: 4548db09caf63f95-YUL
etag: W/"46d9-574e4257a46f6"
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
expires: Mon, 03 Sep 2018 15:20:11 GMT
strict-transport-security: max-age=2592000
vary: Accept-Encoding
via: e2s
x-content-type-options: nosniff
Server: cloudflare
```


And if I load that page in Firefox, I can see that the headers got edited by my Cloudflare worker (see the x-julia-test header at the bottom). Neat!

And, most importantly, the website displays properly instead of being a bunch of raw HTML, which was the point. Amazing!

### logging the HTTP requests & responses

I also tried adding some logging to the workers by just making a server somewhere else that logs all POST requests made to it, following the instructions here.

I’m now logging all the requests & responses when the workers see a 200 that’s missing a Content-Type header. (There are also some 304 responses missing a Content-Type header, but that’s normal!). It hasn’t turned up anything yet, but maybe something will appear eventually!
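The worker-side half of that logging could look something like this – a sketch only, with LOG_URL standing in for wherever the logging server actually lives:

```javascript
// Sketch: POST request/response details to a separate logging server
// whenever a 200 response is missing its Content-Type header.
// LOG_URL is a hypothetical placeholder, not a real endpoint.
const LOG_URL = "https://example.com/log"

// 304 responses legitimately omit Content-Type, so only flag 200s
function shouldLog(status, contentType) {
  return status === 200 && !contentType
}

async function logIfMissing(request, response) {
  if (shouldLog(response.status, response.headers.get("Content-Type"))) {
    await fetch(LOG_URL, {
      method: "POST",
      body: JSON.stringify({
        url: request.url,
        status: response.status,
        headers: [...response.headers.entries()],
      }),
    })
  }
}
```

The nice part of this design is that the filtering happens in the worker, so the logging server only ever sees the weird responses.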

### cloudflare workers are neat

I usually don’t talk about paid services on this blog and these workers definitely aren’t free (they charge $5/month for up to 10 million requests/month). But this was useful to me and I think it’s really cool to be able to write arbitrary Javascript code that modifies all of my blog’s HTTP responses! It’s definitely a hack – running custom javascript on every single HTTP request is an extremely silly way to fix what is probably some kind of server configuration issue somewhere. But it helps me fix my problem until I decide to spend the time to migrate web hosts or whatever, so I’m happy with that. Paying $60/year is definitely worth it to me to fix the problem & not have to spend the time to migrate to a different host right now :)

Looking at the CDN landscape in general, Fastly offers a seemingly similar feature called the Edge SDK that lets you write VCL (“varnish configuration language”). I haven’t used that though.

### Who pays to educate developers?

I’ve been thinking about developer education (and, specifically, education of professional developers who have been working for a few years already) for the last year or so. In my last post I talked about how to teach yourself hard things, which is how I’ve learned most things.

But! Even when you’re learning on your own, there are all kinds of resources you depend on! Some examples of places I’ve learned things are:

• a few really great programming books
• conference talks
• hundreds of blog posts (I subscribe to dozens of programming blogs)
• meetups
• Slack groups

All of these things (tweets, blog posts, conference talks, etc) take time to make, and a lot of it is given away for free. So who pays for all of this work? Here’s a rough taxonomy! If you have more to add to it (or examples of people you think are doing great education work that fits into these categories!), I’d love to hear them on twitter.

### companies with a product to sell

One common way to get paid money to teach people about programming is to become a “developer advocate”. I think this is a pretty cool thing! Basically, a lot of companies have realized that a good way to sell tech products is to explain the tech concepts behind their products to people in a way that they can actually understand. A great example of this is Google Cloud and Kubernetes – a lot of Google developer advocates will write blog posts / give great talks explaining Kubernetes. And those talks are often really helpful whether or not you end up using Google products!

But what Google gets out of it is – if more people understand Kubernetes, then as a side effect they also understand Google’s Kubernetes-as-a-service platform, and they’re likely to be more excited about the advantages of using it.

Personally I think this is great – developer advocates are often great programmers and great teachers, they get paid to do something that they care about, and they get a lot of free and high-quality information into the world about various complicated tech things. Awesome!

There are some downsides though, for example Google Cloud developer advocates obviously will focus on subjects that are somehow related to Google Cloud :)

### individual people who get paid in exposure

This is the category most personal blog posts / conference talks fall into. The economics of this are – you put together some great blog posts / talks, and maybe folks in your industry now recognize/respect you and are more likely to want to hire you!

My original motivation for starting this blog, 5 years ago, was that I wanted to get a better job than the job I’d had before. I posted a lot of my articles to hacker news to try to get readers. And I think it helped! In any case I do have a way better job now :)

Obviously those aren’t still my motivations – probably the main reason I keep writing here is that I find it rewarding when people tell me that my blog posts helped them learn something. But that’s not the only reason! Some side effects are:

• It’s easier for me to get answers to questions I have about tech
• I know it’ll be a little easier for me to get interviews for future jobs, which is reassuring
• giving talks at conferences helped me build a network of folks I can ask questions / learn about the industry from

and all that is pretty useful to my career! For example, just last week someone who had read my blog emailed me out of the blue about a super interesting job and we had an awesome conversation and I learned something new about the kinds of jobs that exist in computer networking! That would definitely not have happened if I didn’t blog about what I was learning :)

So blogging / speaking in tech is a long-term investment in your future job opportunities and it can pay off!

### companies who make money through recruiting

I talk about the Recurse Center all the time. They don’t produce educational materials directly, but they’re one of the most interesting places to level up as a developer that I know. It’s free to attend and they make money through recruiting. They’re how I got my current job, and the company that hired me paid them 25% of my first year’s salary. I didn’t pay them anything.

(as an aside, I recommend recruiting through RC if you want to hire people who are good at learning – you can find out more at https://www.recurse.com/hire)

### companies who sell education to developers

The next bucket is companies who sell educational materials to developers directly.

various examples of this that I think are kind of interesting:

• Linux Weekly News, which offers a $7/month subscription to get the latest articles. I really recommend subscribing. It’s great.
• Launch School has a $200/month class with the aim of getting you a way better software job.
• the School for Poetic Computation, which is a cool school in NYC at the intersection of art & tech. It costs $5000 or so for a 10-week class.
• egghead.io, a set of Javascript video tutorials, $40/month
• O’Reilly’s books & videos (like safari).
• Lynda, Udacity, Udemy, Coursera all have online courses
• all the various coding bootcamps

### individuals who sell education to developers

I’m breaking this out from “companies who sell education to developers” because it seems like these businesses are differently flavoured. O’Reilly/Lynda/Udacity/Udemy sell information about basically everything related to programming. Usually individual people have a much narrower focus, which is cool.

To me, self publishing definitely falls into this category! In 2018, it seems like a much more viable way to actually make money from teaching than traditional publishing.

### programming books

I’m going to mostly not talk about programming books for traditional publishers because even though they’re really important, they seem to live in a complicated place between “writing for free for exposure” and “making money” that I don’t fully understand. For instance, in the economics of writing a technical book, the author says he made about $23/hour for 500 hours of work. I don’t know if that’s typical. If people are actually making money at rates better than $20/hour-ish from publishing programming books with traditional publishers, I’d be curious to know about that! This is not something I know a lot about yet.

### sell training to companies

Selling training to companies is a really logical pattern – an individual might not be willing to pay $2000 for a class, but a company might very well be willing to pay $2000/person for a 10-person in-person class!

here are some examples I know about in that area:

There are probably a TON more here that I don’t know about.

### should I be paying for more learning materials?

So! Reflecting on this a bit, the categories we’ve seen are:

1. devs who want to learn pay (to invest in their knowledge)
2. devs who want to teach pay (to share knowledge / build their network / build a reputation for being an expert)
3. companies pay (to educate their employees)
4. companies selling a product pay (to educate their future customers)

Basically everything that’s free lives in either category #2 or #4, which is most of what I read. Is that really what I want to be doing, though? As much as I ADORE all the bloggers I read I feel like it’s kind of weird that I mostly learn from free sources, and the incentive structures there aren’t that well aligned with producing really excellent learning materials.

One of my favourite sources to learn from recently has been the book The Linux Programming Interface, which is not free (it’s $70 or so). And it’s a MUCH more reliable and useful and efficient source than reading Stack Overflow answers about Linux. But not all the books I’ve bought have been an excellent use of time to read, so I find this a bit tricky.