<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Streak Engineering]]></title><description><![CDATA[Technical writings from the Streak engineering team]]></description><link>https://engineering.streak.com</link><image><url>https://substackcdn.com/image/fetch/$s_!lqS3!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0de1365-e7b3-43d0-8c30-c90702f35b5a_512x512.png</url><title>Streak Engineering</title><link>https://engineering.streak.com</link></image><generator>Substack</generator><lastBuildDate>Tue, 28 Apr 2026 12:03:22 GMT</lastBuildDate><atom:link href="https://engineering.streak.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Rewardly, Inc.]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[streakengineering@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[streakengineering@substack.com]]></itunes:email><itunes:name><![CDATA[Blake Kadatz]]></itunes:name></itunes:owner><itunes:author><![CDATA[Blake Kadatz]]></itunes:author><googleplay:owner><![CDATA[streakengineering@substack.com]]></googleplay:owner><googleplay:email><![CDATA[streakengineering@substack.com]]></googleplay:email><googleplay:author><![CDATA[Blake Kadatz]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Preventing Flash of Incomplete Markdown when streaming AI responses]]></title><description><![CDATA[You&#8217;ve heard of Flash of Unstyled Content (FOUC) before, where you see unstyled HTML appear, then the CSS loads and the browser suddenly updates to display the correct style. 
A similar problem occurs when streaming AI responses, and there's a simple solution for it.]]></description><link>https://engineering.streak.com/p/preventing-unstyled-markdown-streaming-ai</link><guid isPermaLink="false">https://engineering.streak.com/p/preventing-unstyled-markdown-streaming-ai</guid><dc:creator><![CDATA[Blake Kadatz]]></dc:creator><pubDate>Wed, 04 Jun 2025 17:00:12 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/48150aa8-d6ca-440a-bf21-9a2691fade39_420x300.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>You&#8217;ve heard of <a href="https://en.wikipedia.org/wiki/Flash_of_unstyled_content">Flash of Unstyled Content (FOUC)</a> before, where you see unstyled HTML appear, then the CSS loads and the browser suddenly updates to display the correct style.</p><p>A similar problem, which I call &#8220;Flash of Incomplete Markdown&#8221; (FOIM), occurs when streaming AI-generated responses. I&#8217;ve reproduced this within OpenAI&#8217;s playground by throttling my connection to dial-up speeds:</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;06e9ed29-3b24-42b1-80f6-d8e047c4fde9&quot;,&quot;duration&quot;:null}"></div><p>While this is greatly exaggerated due to the slow speed, you can see the incomplete markdown appear in chunks. This occurs because OpenAI&#8217;s streaming API returns an event stream that builds up the response message piece by piece. 
These chunks are provided by what are called output text deltas:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Mzwt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6470623-7990-420e-b768-051d196dfb7e_1466x882.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Mzwt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6470623-7990-420e-b768-051d196dfb7e_1466x882.png 424w, https://substackcdn.com/image/fetch/$s_!Mzwt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6470623-7990-420e-b768-051d196dfb7e_1466x882.png 848w, https://substackcdn.com/image/fetch/$s_!Mzwt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6470623-7990-420e-b768-051d196dfb7e_1466x882.png 1272w, https://substackcdn.com/image/fetch/$s_!Mzwt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6470623-7990-420e-b768-051d196dfb7e_1466x882.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Mzwt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6470623-7990-420e-b768-051d196dfb7e_1466x882.png" width="1456" height="876" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d6470623-7990-420e-b768-051d196dfb7e_1466x882.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:876,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:496467,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://engineering.streak.com/i/164603668?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6470623-7990-420e-b768-051d196dfb7e_1466x882.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Mzwt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6470623-7990-420e-b768-051d196dfb7e_1466x882.png 424w, https://substackcdn.com/image/fetch/$s_!Mzwt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6470623-7990-420e-b768-051d196dfb7e_1466x882.png 848w, https://substackcdn.com/image/fetch/$s_!Mzwt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6470623-7990-420e-b768-051d196dfb7e_1466x882.png 1272w, https://substackcdn.com/image/fetch/$s_!Mzwt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6470623-7990-420e-b768-051d196dfb7e_1466x882.png 1456w" sizes="100vw" fetchpriority="high"></picture>
</div></a></figure></div><p>You can read more about output text deltas in their <a href="https://platform.openai.com/docs/api-reference/responses-streaming/response/output_text/delta">API docs</a>.</p><p>I&#8217;ve seen this same behavior occur in a number of commercially launched products, so this isn&#8217;t some obscure effect. We experienced this in our product as well. One of the AI features Streak offers is the ability to ask a question about one of your deals, where the answer may be in any related email threads, various comments, meeting notes, and so on. </p><p>You could ask &#8220;When did the customer sign the contract?&#8221; and getting back an answer of &#8220;Last Thursday&#8221; is useful, but even more useful is providing a link to the email thread from Thursday where they sent an email saying &#8220;Signed contract attached&#8221;. 
The Streak user can click on that and be confident that the answer is correct. Here is what that looked like while I was implementing the functionality in my dev environment:</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;1d238991-8108-470c-b9cb-96fd0757ca58&quot;,&quot;duration&quot;:null}"></div><p>What should have rendered as simple <code>[source]</code> links instead showed the incomplete markdown, including a lengthy URL in the form <code>https://streak.com/a/boxes/{KEY}/itemtype/{KEY}</code>, and only when the browser received the closing <code>)</code> of the markdown link did the content collapse to show just <code>[source]</code>.</p><h3>Hallucinations</h3><p>Additionally, when we launched this internally to Streak employees, someone reported that one of the links was incorrect. It turns out that OpenAI was hallucinating one of the URLs we provided as the citation by combining the first part of the box key with the last part of the comment key. The two keys had a common prefix and OpenAI returned a mangled URL. This resulted in a 404 error since the hallucinated link was invalid.</p><p>While the incomplete markdown issue is somewhat annoying, providing the user with incorrect links is unacceptable. So I set out to solve the hallucination issue.</p><h3>Really short links</h3><p>What if instead of <code>https://streak.com/a/boxes/{KEY}/itemtype/{KEY}</code> where the keys are lengthy (the URL can be over 200 characters long), the link was simply &#8220;#REF3&#8221;? Because that takes so few tokens, it&#8217;d be significantly less likely that OpenAI would hallucinate by combining unrelated parts of the output that happen to share a prefix. 
We decided to change the link text to Wikipedia-style numbers, so the links would be output as:</p><p><code>[1](#REF3)</code></p><p>and so on for each reference<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a>. This would work well, since we could simply keep track of which short references map to which original links and replace them on the fly. But where would we perform this substitution? The server streams the output text deltas from OpenAI&#8217;s API to the client, and the boundaries between deltas are arbitrary: one delta could be &#8220;#&#8221;, the next might be &#8220;REF&#8221;, and another just &#8220;3&#8221;. Would we need to provide the client with the complete mapping ahead of time so it could replace the assembled markdown as it&#8217;s received? This gets complicated, as we may have dozens of citations and the answer may only need to cite one of them.</p><h3>Bring on the state machine</h3><p>It turns out there&#8217;s a far simpler solution: dynamically detect the start of a markdown link and begin buffering the output on the server, not sending anything to the client until the link is complete. This is simple to do via a state machine. 
The four states are <code>TEXT</code>, <code>LINK_TEXT</code>, <code>EXIT_LINK_TEXT</code>, and <code>LINK_URL</code> with the following behaviors:</p><ol><li><p>Regardless of state, the <code>\</code> character acts as an escape character, so the character that follows it is passed along without affecting the state.</p></li><li><p>Start out in <code>TEXT</code> state, outputting normal text that&#8217;s not a link, streaming tokens to the client.</p></li><li><p>Beginning of markdown link via <code>[</code> character: transition to <code>LINK_TEXT</code> state and still stream tokens to the client.</p></li><li><p>If in <code>LINK_TEXT</code> state and there&#8217;s a matching end of markdown link via <code>]</code> character<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a>: transition to <code>EXIT_LINK_TEXT</code> state and still stream tokens to the client.</p></li><li><p>If in <code>EXIT_LINK_TEXT</code> state and the next character is <code>(</code>, transition to <code>LINK_URL</code> state and begin buffering the URL <em>without</em> streaming anything to the client. If the next character is anything else, transition back to the <code>TEXT</code> state and resume streaming.</p></li><li><p>If in <code>LINK_URL</code> state and there&#8217;s a matching end of the link via <code>)</code> character<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a>, see if the buffered URL exists as a key in our URL map and, if it does, replace the short URL with the full URL. Then output the URL to the client and transition back to the default <code>TEXT</code> state.</p></li></ol><p>Since we are asking OpenAI to provide citations using a specific format, this simple processing is sufficient for our needs. 
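</p><p>The six rules above can be sketched roughly as follows. This is a minimal illustration in Python, not the production implementation: the <code>LinkBuffer</code> class, its <code>feed</code> and <code>flush</code> methods, the <code>url_map</code> dictionary, and the example URL are all hypothetical names chosen for the sketch. It assumes the mapping from short references to full URLs is built before streaming starts, and it holds back the opening <code>(</code> together with the URL so the whole parenthesized part goes out as one chunk.</p><pre><code>
```python
from enum import Enum, auto


class State(Enum):
    TEXT = auto()
    LINK_TEXT = auto()
    EXIT_LINK_TEXT = auto()
    LINK_URL = auto()


class LinkBuffer:
    """Passes text through, but holds back a link URL until it is complete."""

    def __init__(self, url_map: dict[str, str]) -> None:
        self.url_map = url_map   # assumed shape: short reference -> full URL
        self.state = State.TEXT
        self.depth = 0           # nesting depth of [] or () inside the link
        self.escaped = False     # True when the previous character was a backslash
        self.url = ""            # URL characters buffered so far

    def feed(self, delta: str) -> str:
        """Consume one delta; return the text that is safe to send right now."""
        return "".join(self._step(ch) for ch in delta)

    def flush(self) -> str:
        """Call at end of stream to release a URL that never saw its ')'."""
        if self.state is State.LINK_URL:
            self.state = State.TEXT
            return "(" + self.url
        return ""

    def _emit(self, ch: str) -> str:
        # While buffering a URL, characters are stored instead of sent.
        if self.state is State.LINK_URL:
            self.url += ch
            return ""
        return ch

    def _step(self, ch: str) -> str:
        if self.escaped:                     # rule 1: escaped character
            self.escaped = False
            return self._emit(ch)
        if self.state is State.EXIT_LINK_TEXT:
            if ch == "(":                    # rule 5: start buffering the URL
                self.state, self.depth, self.url = State.LINK_URL, 0, ""
                return ""
            self.state = State.TEXT          # not a link; treat ch as plain text
        if ch == "\\":
            self.escaped = True
            return self._emit(ch)
        if self.state is State.TEXT:
            if ch == "[":                    # rule 3: link text begins
                self.state, self.depth = State.LINK_TEXT, 0
            return ch
        if self.state is State.LINK_TEXT:
            if ch == "[":
                self.depth += 1              # nested bracket inside link text
            elif ch == "]":
                if self.depth == 0:
                    self.state = State.EXIT_LINK_TEXT   # rule 4
                else:
                    self.depth -= 1
            return ch
        # Remaining case: State.LINK_URL
        if ch == "(":
            self.depth += 1
        elif ch == ")":
            if self.depth == 0:              # rule 6: link complete
                self.state = State.TEXT
                return "(" + self.url_map.get(self.url, self.url) + ")"
            self.depth -= 1
        self.url += ch
        return ""


buf = LinkBuffer({"#REF3": "https://example.com/full-citation-url"})
deltas = ["See ", "[1", "](", "#", "REF", "3", ") for", " details"]
sent = [chunk for chunk in map(buf.feed, deltas) if chunk]
print(sent)
# ['See ', '[1', ']', '(https://example.com/full-citation-url) for', ' details']
```
</code></pre><p>In this sketch a URL that is not in the map passes through unchanged, so ordinary links still benefit from the buffering, and <code>flush</code> releases a partially buffered URL if the stream ends before the closing <code>)</code> arrives.</p><p>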
The markdown spec is surprisingly flexible<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a>, allowing additional formats for links such as:</p><ul><li><p>[link](/uri "title")</p></li><li><p>[link](&lt;/my uri&gt;)</p></li><li><p>[a](&lt;b)c&gt;)</p></li></ul><p>I haven&#8217;t implemented support for the full spec since we have well-known formats for our link text and URLs, but should the need arise the state machine can easily be extended. Until then, YAGNI.</p><h3>Improved output</h3><p>You can see the result of implementing the above:</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;e3f2d38c-b36c-45bd-976c-90ef555a09f8&quot;,&quot;duration&quot;:null}"></div><p>No more raw link URLs being flashed to the user. Just nice, clean links that appear once the full URL has been processed. You can see in the event stream that the server streams each output text delta as it&#8217;s received from OpenAI, except when it reaches a link: there the server buffers the URL, replaces the short URL with the full URL, and sends the complete link to the client in a single chunk:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mkfs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F910485c2-c52e-4fa6-9217-5b48869b62e4_1135x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mkfs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F910485c2-c52e-4fa6-9217-5b48869b62e4_1135x768.png 424w, 
https://substackcdn.com/image/fetch/$s_!mkfs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F910485c2-c52e-4fa6-9217-5b48869b62e4_1135x768.png 848w, https://substackcdn.com/image/fetch/$s_!mkfs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F910485c2-c52e-4fa6-9217-5b48869b62e4_1135x768.png 1272w, https://substackcdn.com/image/fetch/$s_!mkfs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F910485c2-c52e-4fa6-9217-5b48869b62e4_1135x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mkfs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F910485c2-c52e-4fa6-9217-5b48869b62e4_1135x768.png" width="1135" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/910485c2-c52e-4fa6-9217-5b48869b62e4_1135x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1135,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:145792,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://engineering.streak.com/i/164603668?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F910485c2-c52e-4fa6-9217-5b48869b62e4_1135x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!mkfs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F910485c2-c52e-4fa6-9217-5b48869b62e4_1135x768.png 424w, https://substackcdn.com/image/fetch/$s_!mkfs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F910485c2-c52e-4fa6-9217-5b48869b62e4_1135x768.png 848w, https://substackcdn.com/image/fetch/$s_!mkfs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F910485c2-c52e-4fa6-9217-5b48869b62e4_1135x768.png 1272w, https://substackcdn.com/image/fetch/$s_!mkfs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F910485c2-c52e-4fa6-9217-5b48869b62e4_1135x768.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><h3>Just plain better</h3><p>There are multiple benefits to taking this approach:</p><ol><li><p>Our full links consumed approximately 50 tokens each, while our short links are 3 tokens. Fewer tokens means a smaller context for OpenAI and a lower usage bill.</p></li><li><p>Link hallucinations, which were the catalyst for making this change, appear to be a thing of the past.</p></li><li><p>The streaming response from OpenAI is now much faster, since each citation URL it generates is only 3 tokens instead of approximately 50, and the speed difference is noticeable to the user.</p></li><li><p>Since we buffer the URLs server-side and only provide the completed markdown link once we reach the end of the markdown URL, we are preventing flashes of incomplete markdown even if the link isn&#8217;t our short reference link. All links benefit from this buffering approach.</p></li><li><p>While our citation URLs don&#8217;t reveal sensitive information and are secured by the user&#8217;s Gmail account, there&#8217;s an additional privacy bonus in that, with this substitution, the full URL is never transmitted to OpenAI in the first place. For some organizations, this alone could be a win.</p></li></ol><p>Faster, better, and cheaper. Can&#8217;t go wrong with that!</p><h3>Engineering at Streak</h3><p>We work on many interesting challenges affecting millions of users and many terabytes of data. 
For more information, visit <a href="https://www.streak.com/careers">https://www.streak.com/careers</a></p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>The format of the link reference is completely arbitrary. I had initially wanted to go with just a link URL of &#8220;1&#8221;, &#8220;2&#8221;, and so on, but OpenAI started getting confused between the number for the URL and the number it was outputting as the citation text, since not every URL is used. For example, the first citation [1] could be the link URL of &#8220;3&#8221; or &#8220;84&#8221;, and OpenAI couldn&#8217;t keep the numbers straight. 
</p><p>Additionally, I prefixed the URLs with &#8220;#&#8221; in case there is ever a bug in replacing the links: a &#8220;#REF3&#8221; fragment link does nothing when clicked, whereas a bare &#8220;REF3&#8221; link would be interpreted as a relative URL based on the current path, resulting in a 404 error.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>Markdown is slightly complicated by the fact that link text can itself contain matching brackets. So <code>[example](https://example.com)</code> is a valid regular link, but so is <code>[example [a]](https://example.com)</code>, so matching balanced <code>[]</code> characters is important.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>Similar to matching <code>[]</code> in the link text, the link URL can also contain matching <code>()</code> characters.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p>You can see the CommonMark spec at <a href="https://spec.commonmark.org/0.31.2/#links">https://spec.commonmark.org/0.31.2/#links</a></p></div></div>]]></content:encoded></item><item><title><![CDATA[Cutting 95th percentile latency from 3 minutes to 500ms]]></title><description><![CDATA[or: Every Email, Everywhere, All At Once]]></description><link>https://engineering.streak.com/p/cutting-95th-percentile-latency-from</link><guid isPermaLink="false">https://engineering.streak.com/p/cutting-95th-percentile-latency-from</guid><dc:creator><![CDATA[Blake Kadatz]]></dc:creator><pubDate>Fri, 11 Apr 2025 19:30:54 GMT</pubDate><enclosure 
url="https://substack-post-media.s3.amazonaws.com/public/images/a20acfef-d154-419c-821f-33ab8501bb2d_420x300.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Imagine a haystack with 60 billion pieces of hay in which you need to find the strands that are related. In our case, the haystack is digital and contains the metadata for 60 billion emails, which we manage securely but also need to make performant. For a while, we had a few API endpoints where the latency spiked, causing some requests to time out after 3 minutes:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KIXd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0431ac3d-05f6-422d-8ec7-610de1046625_639x560.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KIXd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0431ac3d-05f6-422d-8ec7-610de1046625_639x560.png 424w, https://substackcdn.com/image/fetch/$s_!KIXd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0431ac3d-05f6-422d-8ec7-610de1046625_639x560.png 848w, https://substackcdn.com/image/fetch/$s_!KIXd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0431ac3d-05f6-422d-8ec7-610de1046625_639x560.png 1272w, https://substackcdn.com/image/fetch/$s_!KIXd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0431ac3d-05f6-422d-8ec7-610de1046625_639x560.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!KIXd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0431ac3d-05f6-422d-8ec7-610de1046625_639x560.png" width="639" height="560" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0431ac3d-05f6-422d-8ec7-610de1046625_639x560.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:560,&quot;width&quot;:639,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:158045,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://engineering.streak.com/i/147685966?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0431ac3d-05f6-422d-8ec7-610de1046625_639x560.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KIXd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0431ac3d-05f6-422d-8ec7-610de1046625_639x560.png 424w, https://substackcdn.com/image/fetch/$s_!KIXd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0431ac3d-05f6-422d-8ec7-610de1046625_639x560.png 848w, https://substackcdn.com/image/fetch/$s_!KIXd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0431ac3d-05f6-422d-8ec7-610de1046625_639x560.png 1272w, https://substackcdn.com/image/fetch/$s_!KIXd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0431ac3d-05f6-422d-8ec7-610de1046625_639x560.png 1456w" sizes="100vw" 
fetchpriority="high"></picture></div></a></figure></div><p>The average latency was a lot less but still spiked into the tens of seconds. 
After doing numerous optimizations and discovering a nice algorithm to avoid expensive read-time queries, the same API endpoints were vastly improved:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Tp28!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe500a3a3-9827-4421-9741-39d9a0b6533a_1089x562.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Tp28!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe500a3a3-9827-4421-9741-39d9a0b6533a_1089x562.png 424w, https://substackcdn.com/image/fetch/$s_!Tp28!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe500a3a3-9827-4421-9741-39d9a0b6533a_1089x562.png 848w, https://substackcdn.com/image/fetch/$s_!Tp28!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe500a3a3-9827-4421-9741-39d9a0b6533a_1089x562.png 1272w, https://substackcdn.com/image/fetch/$s_!Tp28!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe500a3a3-9827-4421-9741-39d9a0b6533a_1089x562.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Tp28!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe500a3a3-9827-4421-9741-39d9a0b6533a_1089x562.png" width="1089" height="562" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e500a3a3-9827-4421-9741-39d9a0b6533a_1089x562.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:562,&quot;width&quot;:1089,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Tp28!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe500a3a3-9827-4421-9741-39d9a0b6533a_1089x562.png 424w, https://substackcdn.com/image/fetch/$s_!Tp28!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe500a3a3-9827-4421-9741-39d9a0b6533a_1089x562.png 848w, https://substackcdn.com/image/fetch/$s_!Tp28!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe500a3a3-9827-4421-9741-39d9a0b6533a_1089x562.png 1272w, https://substackcdn.com/image/fetch/$s_!Tp28!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe500a3a3-9827-4421-9741-39d9a0b6533a_1089x562.png 1456w" sizes="100vw"></picture></div></a></figure></div><p>The graph almost looks like things stopped working halfway through the X axis, but zooming in on the data shows otherwise:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6AnD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3b19cf9-1ed9-4511-b567-e0a8c9b8b475_426x517.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6AnD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3b19cf9-1ed9-4511-b567-e0a8c9b8b475_426x517.png 424w, https://substackcdn.com/image/fetch/$s_!6AnD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3b19cf9-1ed9-4511-b567-e0a8c9b8b475_426x517.png 848w, 
https://substackcdn.com/image/fetch/$s_!6AnD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3b19cf9-1ed9-4511-b567-e0a8c9b8b475_426x517.png 1272w, https://substackcdn.com/image/fetch/$s_!6AnD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3b19cf9-1ed9-4511-b567-e0a8c9b8b475_426x517.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6AnD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3b19cf9-1ed9-4511-b567-e0a8c9b8b475_426x517.png" width="426" height="517" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f3b19cf9-1ed9-4511-b567-e0a8c9b8b475_426x517.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:517,&quot;width&quot;:426,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6AnD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3b19cf9-1ed9-4511-b567-e0a8c9b8b475_426x517.png 424w, https://substackcdn.com/image/fetch/$s_!6AnD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3b19cf9-1ed9-4511-b567-e0a8c9b8b475_426x517.png 848w, 
https://substackcdn.com/image/fetch/$s_!6AnD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3b19cf9-1ed9-4511-b567-e0a8c9b8b475_426x517.png 1272w, https://substackcdn.com/image/fetch/$s_!6AnD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3b19cf9-1ed9-4511-b567-e0a8c9b8b475_426x517.png 1456w" sizes="100vw"></picture></div></a></figure></div><p>You can see that p95 latency is around the 500ms mark, and average latency about half that. 
While there are still occasional spikes, they are on the order of seconds, not minutes, and the requests return the expected data rather than timing out.</p><p>In this post I&#8217;ll dive into the problem, the old way of doing things, various infrastructure changes, and how the &#8220;threadweaver&#8221; algorithm works in detail.</p><h3>Quick background</h3><p>Streak&#8217;s CRM keeps track of boxes in a pipeline. A box is something like a sales deal or a project. Users can keep track of emails inside these boxes, which we call a &#8220;boxed thread&#8221; or &#8220;boxing an email&#8221;. Boxed email threads allow users to share certain emails with all users on their team. It looks like this:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iz4v!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feedbec9e-41d6-4ead-9a1b-d63e7913fd1f_1567x657.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iz4v!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feedbec9e-41d6-4ead-9a1b-d63e7913fd1f_1567x657.png 424w, https://substackcdn.com/image/fetch/$s_!iz4v!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feedbec9e-41d6-4ead-9a1b-d63e7913fd1f_1567x657.png 848w, https://substackcdn.com/image/fetch/$s_!iz4v!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feedbec9e-41d6-4ead-9a1b-d63e7913fd1f_1567x657.png 1272w, 
https://substackcdn.com/image/fetch/$s_!iz4v!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feedbec9e-41d6-4ead-9a1b-d63e7913fd1f_1567x657.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iz4v!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feedbec9e-41d6-4ead-9a1b-d63e7913fd1f_1567x657.png" width="1456" height="610" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/eedbec9e-41d6-4ead-9a1b-d63e7913fd1f_1567x657.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9774de49-7239-42a0-8786-b7984675e571_1567x657.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:610,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:220360,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://engineering.streak.com/i/147685966?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9774de49-7239-42a0-8786-b7984675e571_1567x657.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iz4v!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feedbec9e-41d6-4ead-9a1b-d63e7913fd1f_1567x657.png 424w, https://substackcdn.com/image/fetch/$s_!iz4v!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feedbec9e-41d6-4ead-9a1b-d63e7913fd1f_1567x657.png 848w, 
https://substackcdn.com/image/fetch/$s_!iz4v!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feedbec9e-41d6-4ead-9a1b-d63e7913fd1f_1567x657.png 1272w, https://substackcdn.com/image/fetch/$s_!iz4v!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feedbec9e-41d6-4ead-9a1b-d63e7913fd1f_1567x657.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Streak sample demo data showing box with email threads</figcaption></figure></div><p>The timeline on the left shows 
changes to the box including two email threads consisting of an example conversation from our own demo account.</p><p>We also support &#8220;autoboxing&#8221; &#8212; in the above example, you can specify that you want Streak to include all emails from the customer&#8217;s domain in the box. When enabled, any email sent to or from the customer from any user on your team (subject to each user&#8217;s sharing permission) will also be included in the box, automatically.</p><h3>Unifying with TiDB and JGraph</h3><p>Let&#8217;s say an email thread has 2 messages sent to 10 members of your team. These messages get delivered into everyone&#8217;s inboxes:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2x7v!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e6d9d20-5d8e-4f0b-a90a-8568cf2960e0_1568x512.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2x7v!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e6d9d20-5d8e-4f0b-a90a-8568cf2960e0_1568x512.png 424w, https://substackcdn.com/image/fetch/$s_!2x7v!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e6d9d20-5d8e-4f0b-a90a-8568cf2960e0_1568x512.png 848w, https://substackcdn.com/image/fetch/$s_!2x7v!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e6d9d20-5d8e-4f0b-a90a-8568cf2960e0_1568x512.png 1272w, https://substackcdn.com/image/fetch/$s_!2x7v!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e6d9d20-5d8e-4f0b-a90a-8568cf2960e0_1568x512.png 1456w" 
sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2x7v!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e6d9d20-5d8e-4f0b-a90a-8568cf2960e0_1568x512.png" width="1456" height="475" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0e6d9d20-5d8e-4f0b-a90a-8568cf2960e0_1568x512.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4f707ab1-5163-4991-9e59-4a913d34212c_1568x512.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:475,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1453240,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://engineering.streak.com/i/147685966?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f707ab1-5163-4991-9e59-4a913d34212c_1568x512.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2x7v!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e6d9d20-5d8e-4f0b-a90a-8568cf2960e0_1568x512.png 424w, https://substackcdn.com/image/fetch/$s_!2x7v!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e6d9d20-5d8e-4f0b-a90a-8568cf2960e0_1568x512.png 848w, https://substackcdn.com/image/fetch/$s_!2x7v!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e6d9d20-5d8e-4f0b-a90a-8568cf2960e0_1568x512.png 1272w, 
https://substackcdn.com/image/fetch/$s_!2x7v!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e6d9d20-5d8e-4f0b-a90a-8568cf2960e0_1568x512.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Two messages in each of the ten users&#8217; mailboxes</figcaption></figure></div><p>When you&#8217;re viewing the box timeline, you just want to see the one email thread, not the 10 copies from each person&#8217;s inbox.</p><p>So we unify all these individual threads together and only show the unique messages across all members of your team who 
have shared their emails. Since each email has a Message-ID header uniquely<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a> identifying it, we can deduplicate the emails regardless of whose inbox they are in.</p><p>This was done via a breadth-first search, using a simple SQL query over the set of team members, each identified by gid (their Google-assigned ID):</p><pre><code><code>SELECT DISTINCT m2.gid, m2.thread_id FROM messages m1
JOIN messages m2 USING (md5_rfc_message_id) 
WHERE (m1.gid, m1.thread_id) IN (:initialSet) AND m2.gid IN (:gidList)</code></code></pre><p>The query ran on our TiDB cluster and the data was well indexed, so for most Streak users it was quite fast. Whenever the query returned threads that weren&#8217;t in the initial set, we ran it again with those threads as the new initial set, repeating until no new threads appeared.</p><p>You can <a href="https://www.db-fiddle.com/f/wG14Zk9SXn2eBVdYA3xfuT/2">try it here</a>. The query starts with an initial set consisting of gid 1 and thread 1, and running it also finds gid 2 and thread 2. Plugging (2, 2) back into the query then finds gid 3 and thread 3. This is a greatly simplified example, but it demonstrates how the search follows matching message IDs from one user&#8217;s thread to the next.</p><p>The results were fed into JGraph, where we constructed a graph representation of all the emails we found and the relationships between them. This is also where we filtered messages by each user&#8217;s custom permissions.</p><h3>Must be dynamic</h3><p>One of the reasons we used the above SQL query is that the unified result needs to be dynamic. It should reflect changes in team membership without lengthy recalculations to reunify the emails. 
Here&#8217;s a simple example where this comes into play:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!x8si!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa21d3711-cbeb-48ef-91bb-b7ed657cb353_328x191.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!x8si!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa21d3711-cbeb-48ef-91bb-b7ed657cb353_328x191.png 424w, https://substackcdn.com/image/fetch/$s_!x8si!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa21d3711-cbeb-48ef-91bb-b7ed657cb353_328x191.png 848w, https://substackcdn.com/image/fetch/$s_!x8si!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa21d3711-cbeb-48ef-91bb-b7ed657cb353_328x191.png 1272w, https://substackcdn.com/image/fetch/$s_!x8si!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa21d3711-cbeb-48ef-91bb-b7ed657cb353_328x191.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!x8si!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa21d3711-cbeb-48ef-91bb-b7ed657cb353_328x191.png" width="328" height="191" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a21d3711-cbeb-48ef-91bb-b7ed657cb353_328x191.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:191,&quot;width&quot;:328,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:6613,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://engineering.streak.com/i/147685966?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa21d3711-cbeb-48ef-91bb-b7ed657cb353_328x191.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!x8si!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa21d3711-cbeb-48ef-91bb-b7ed657cb353_328x191.png 424w, https://substackcdn.com/image/fetch/$s_!x8si!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa21d3711-cbeb-48ef-91bb-b7ed657cb353_328x191.png 848w, https://substackcdn.com/image/fetch/$s_!x8si!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa21d3711-cbeb-48ef-91bb-b7ed657cb353_328x191.png 1272w, https://substackcdn.com/image/fetch/$s_!x8si!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa21d3711-cbeb-48ef-91bb-b7ed657cb353_328x191.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">Alice and Charlie have no shared emails</figcaption></figure></div><p>In the image above, Alice has a thread of emails with message ids of 1, 2, 3 and Charlie has a different 
thread of emails with message ids 5, 6, 7. Because none match we can&#8217;t unify them into a single thread. However, let&#8217;s see what happens when Bob joins the team:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7osD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c9b4ab4-2a5d-487f-b3bc-7b6d6be605c1_325x188.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7osD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c9b4ab4-2a5d-487f-b3bc-7b6d6be605c1_325x188.png 424w, https://substackcdn.com/image/fetch/$s_!7osD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c9b4ab4-2a5d-487f-b3bc-7b6d6be605c1_325x188.png 848w, https://substackcdn.com/image/fetch/$s_!7osD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c9b4ab4-2a5d-487f-b3bc-7b6d6be605c1_325x188.png 1272w, https://substackcdn.com/image/fetch/$s_!7osD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c9b4ab4-2a5d-487f-b3bc-7b6d6be605c1_325x188.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7osD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c9b4ab4-2a5d-487f-b3bc-7b6d6be605c1_325x188.png" width="325" height="188" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2c9b4ab4-2a5d-487f-b3bc-7b6d6be605c1_325x188.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:188,&quot;width&quot;:325,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:7378,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://engineering.streak.com/i/147685966?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c9b4ab4-2a5d-487f-b3bc-7b6d6be605c1_325x188.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7osD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c9b4ab4-2a5d-487f-b3bc-7b6d6be605c1_325x188.png 424w, https://substackcdn.com/image/fetch/$s_!7osD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c9b4ab4-2a5d-487f-b3bc-7b6d6be605c1_325x188.png 848w, https://substackcdn.com/image/fetch/$s_!7osD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c9b4ab4-2a5d-487f-b3bc-7b6d6be605c1_325x188.png 1272w, https://substackcdn.com/image/fetch/$s_!7osD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c9b4ab4-2a5d-487f-b3bc-7b6d6be605c1_325x188.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">Bob joins with shared emails</figcaption></figure></div><p>In the image above, Bob has a thread where message id 3 matches an email that Alice has. 
Similarly, message id 5 matches an email that Charlie has. These previously separate threads can now be unified into a single thread.</p><p>You can <a href="https://www.db-fiddle.com/f/5Ud9NE8mEniceqaZY9mtSj/0">try it here</a>. Note that once we insert Bob&#8217;s message metadata, the search can connect Alice to Charlie.</p><h3>Gmail quirks</h3><p>Gmail has a few interesting quirks that make the volume of emails increase substantially in a few cases:</p><ul><li><p>Messages with the same subject sent within a few days of each other get threaded together, even if there is no relationship between the emails within the thread. We&#8217;ve seen this a lot ourselves for automated emails, and it&#8217;s great for cutting down the number of threads in a person&#8217;s inbox.</p></li><li><p>When a thread has 100 messages, Gmail automatically creates a new thread. For long-running discussions, this ends up creating a bunch of new threads. We informally refer to this as the &#8220;multi-centi-thread problem&#8221; since each person receiving the emails (potentially everyone in the company) will have numerous 100-email threads all tied together.</p></li></ul><h3>Performance impact</h3><p>Small teams saw results very quickly when they went to view a shared email thread. We could quickly identify and unify all threads across their team. 
But when there were a large number of emails across large teams, the Gmail quirks started causing huge performance issues as shown earlier:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Rqb9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb126073b-587f-4a62-86a2-699c49f62466_639x560.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Rqb9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb126073b-587f-4a62-86a2-699c49f62466_639x560.png 424w, https://substackcdn.com/image/fetch/$s_!Rqb9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb126073b-587f-4a62-86a2-699c49f62466_639x560.png 848w, https://substackcdn.com/image/fetch/$s_!Rqb9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb126073b-587f-4a62-86a2-699c49f62466_639x560.png 1272w, https://substackcdn.com/image/fetch/$s_!Rqb9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb126073b-587f-4a62-86a2-699c49f62466_639x560.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Rqb9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb126073b-587f-4a62-86a2-699c49f62466_639x560.png" width="639" height="560" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b126073b-587f-4a62-86a2-699c49f62466_639x560.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:560,&quot;width&quot;:639,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:158045,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://engineering.streak.com/i/147685966?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb126073b-587f-4a62-86a2-699c49f62466_639x560.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Rqb9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb126073b-587f-4a62-86a2-699c49f62466_639x560.png 424w, https://substackcdn.com/image/fetch/$s_!Rqb9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb126073b-587f-4a62-86a2-699c49f62466_639x560.png 848w, https://substackcdn.com/image/fetch/$s_!Rqb9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb126073b-587f-4a62-86a2-699c49f62466_639x560.png 1272w, https://substackcdn.com/image/fetch/$s_!Rqb9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb126073b-587f-4a62-86a2-699c49f62466_639x560.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Why is there a flat line at around the 180-second mark? That&#8217;s the limit we imposed on the length of a database connection; once it is reached, the database server gives up and returns an error.</p><p>Additionally, even when the database did return results for a large volume of email, loading that data into JGraph occasionally resulted in an Out of Memory error.</p><p>Practically speaking, users on larger teams with a huge volume of emails would click into a box that had a number of autoboxed threads, go and grab a coffee while they waited for the email data to load, then come back only to see that it returned nothing because the database timed out. Not a great experience!</p><h3>Results were slightly incorrect</h3><p>We also took a number of shortcuts in our implementation. 
In particular, the stats we track for these boxed threads could occasionally be inaccurate. By stats I mean things like the timestamp of the last sent or received email. Users rely on these to create views on their data like &#8220;show me all boxes where the last email was more than 3 days ago&#8221; or &#8220;show me boxes with a reply in the last week&#8221;.</p><p>Whenever we process a new email for a user, we check whether its thread is part of an existing boxed thread. Doing this with full unification using the breadth-first database query was a non-starter due to the potential for heavy load. So we cheated: when processing emails, we skipped any email unless it belonged to the one user who boxed the thread.</p><p>Let&#8217;s say Alice, Bob, and Charlie are all participating in one email thread that was boxed by Charlie. Alice sends a reply to everyone, so there&#8217;s now an email in Alice&#8217;s sent folder plus emails in Bob&#8217;s and Charlie&#8217;s inboxes, all with the same Message-ID. When we process Alice&#8217;s email, she didn&#8217;t box the thread so we ignore it. When we process Bob&#8217;s email, he didn&#8217;t box the thread either so we ignore that too. 
Finally we process Charlie&#8217;s email and since he was the one who boxed the thread, we update the stats and the last email timestamp accurately reflects the latest email:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cvgs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb72c1b25-d389-4254-ad23-321aceafab85_1072x512.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cvgs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb72c1b25-d389-4254-ad23-321aceafab85_1072x512.png 424w, https://substackcdn.com/image/fetch/$s_!cvgs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb72c1b25-d389-4254-ad23-321aceafab85_1072x512.png 848w, https://substackcdn.com/image/fetch/$s_!cvgs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb72c1b25-d389-4254-ad23-321aceafab85_1072x512.png 1272w, https://substackcdn.com/image/fetch/$s_!cvgs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb72c1b25-d389-4254-ad23-321aceafab85_1072x512.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cvgs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb72c1b25-d389-4254-ad23-321aceafab85_1072x512.png" width="1072" height="512" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b72c1b25-d389-4254-ad23-321aceafab85_1072x512.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/28498bf7-ce3f-4b0a-9655-0841de2c1870_1072x512.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:512,&quot;width&quot;:1072,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:998031,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://engineering.streak.com/i/147685966?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28498bf7-ce3f-4b0a-9655-0841de2c1870_1072x512.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cvgs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb72c1b25-d389-4254-ad23-321aceafab85_1072x512.png 424w, https://substackcdn.com/image/fetch/$s_!cvgs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb72c1b25-d389-4254-ad23-321aceafab85_1072x512.png 848w, https://substackcdn.com/image/fetch/$s_!cvgs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb72c1b25-d389-4254-ad23-321aceafab85_1072x512.png 1272w, https://substackcdn.com/image/fetch/$s_!cvgs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb72c1b25-d389-4254-ad23-321aceafab85_1072x512.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Processing Alice, Bob, and Charlie&#8217;s emails</figcaption></figure></div><p>This works for almost all emails, but where it goes wrong is if the person who boxed the thread no longer participates in the emails. 
Since we are ignoring everyone who didn&#8217;t box the thread, the stats never update and this impacted users&#8217; workflows:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!r8qT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9642d2d4-1b3b-4c5f-b8a5-c30babee390e_938x512.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!r8qT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9642d2d4-1b3b-4c5f-b8a5-c30babee390e_938x512.png 424w, https://substackcdn.com/image/fetch/$s_!r8qT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9642d2d4-1b3b-4c5f-b8a5-c30babee390e_938x512.png 848w, https://substackcdn.com/image/fetch/$s_!r8qT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9642d2d4-1b3b-4c5f-b8a5-c30babee390e_938x512.png 1272w, https://substackcdn.com/image/fetch/$s_!r8qT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9642d2d4-1b3b-4c5f-b8a5-c30babee390e_938x512.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!r8qT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9642d2d4-1b3b-4c5f-b8a5-c30babee390e_938x512.png" width="938" height="512" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9642d2d4-1b3b-4c5f-b8a5-c30babee390e_938x512.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/971220fd-9334-47fc-a9dd-907fca946ce3_938x512.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:512,&quot;width&quot;:938,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:695066,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://engineering.streak.com/i/147685966?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F971220fd-9334-47fc-a9dd-907fca946ce3_938x512.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!r8qT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9642d2d4-1b3b-4c5f-b8a5-c30babee390e_938x512.png 424w, https://substackcdn.com/image/fetch/$s_!r8qT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9642d2d4-1b3b-4c5f-b8a5-c30babee390e_938x512.png 848w, https://substackcdn.com/image/fetch/$s_!r8qT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9642d2d4-1b3b-4c5f-b8a5-c30babee390e_938x512.png 1272w, https://substackcdn.com/image/fetch/$s_!r8qT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9642d2d4-1b3b-4c5f-b8a5-c30babee390e_938x512.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" 
type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Incorrect stats when Charlie removed from conversation</figcaption></figure></div><p>Because  Charlie was removed as a recipient, and he was the one who boxed the thread, we never updated the stats since the remaining emails we process didn&#8217;t correspond with Charlie&#8217;s boxed thread and were ignored.</p><h3>Finding a better algorithm</h3><p>When you distill all this down from gids, thread ids, and message ids, there are a few simple theorems we can derive:</p><ol><li><p>since we are unifying entire threads, all messages in a single thread are part of the same unified thread</p></li><li><p>when two messages in two different threads have the same Message-ID, they are part of the same unified 
thread</p></li></ol><p>So there are really only two inputs we need to concern ourselves with: the thread id and the message id. If we can turn these into a unified thread id and know which individual user threads make up that unified thread, then it becomes a trivial lookup.</p><p>We can represent this as the following relationships, where &#8220;ut&#8221; is a unified thread:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;threadId \\rightarrow ut&quot;,&quot;id&quot;:&quot;GFDHFALLFQ&quot;}" data-component-name="LatexBlockToDOM"></div><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;messageId \\rightarrow ut&quot;,&quot;id&quot;:&quot;MZIZEVPDMN&quot;}" data-component-name="LatexBlockToDOM"></div><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;ut \\rightarrow \\{threadId\\}&quot;,&quot;id&quot;:&quot;AXGCIZEPVW&quot;}" data-component-name="LatexBlockToDOM"></div><p>Each threadId and messageId maps to a single unified thread, but a unified thread maps to the set of threadIds that reference it. 
Let&#8217;s model this using a simple scenario where <code>rfc</code><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a> means the Message-ID value:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nCow!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ebdb9b9-bd36-4d0a-9e22-744176cdd984_689x424.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nCow!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ebdb9b9-bd36-4d0a-9e22-744176cdd984_689x424.png 424w, https://substackcdn.com/image/fetch/$s_!nCow!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ebdb9b9-bd36-4d0a-9e22-744176cdd984_689x424.png 848w, https://substackcdn.com/image/fetch/$s_!nCow!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ebdb9b9-bd36-4d0a-9e22-744176cdd984_689x424.png 1272w, https://substackcdn.com/image/fetch/$s_!nCow!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ebdb9b9-bd36-4d0a-9e22-744176cdd984_689x424.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nCow!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ebdb9b9-bd36-4d0a-9e22-744176cdd984_689x424.png" width="689" height="424" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1ebdb9b9-bd36-4d0a-9e22-744176cdd984_689x424.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/96a23cad-736f-4ef1-baa1-8a962b91e8cb_689x424.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:424,&quot;width&quot;:689,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:42337,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://engineering.streak.com/i/147685966?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96a23cad-736f-4ef1-baa1-8a962b91e8cb_689x424.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nCow!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ebdb9b9-bd36-4d0a-9e22-744176cdd984_689x424.png 424w, https://substackcdn.com/image/fetch/$s_!nCow!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ebdb9b9-bd36-4d0a-9e22-744176cdd984_689x424.png 848w, https://substackcdn.com/image/fetch/$s_!nCow!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ebdb9b9-bd36-4d0a-9e22-744176cdd984_689x424.png 1272w, https://substackcdn.com/image/fetch/$s_!nCow!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ebdb9b9-bd36-4d0a-9e22-744176cdd984_689x424.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" 
type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Emails forwarded from user to user</figcaption></figure></div><p>This scenario shows that user A sends a message to B, who then forwards it to C, who then forwards it to D, who then forwards it to E. If we process this in timeline order, one message at a time, we have:</p><ol><li><p>Timeline 1</p><ol><li><p>Does thread &#8220;111&#8221; map to a unified thread? &#10060; <br>Does rfc &#8220;1&#8221; map to a unified thread? &#10060; <br>Assign each to a new unified thread value &#8220;ut_a&#8221;, and map &#8220;ut_a&#8221; to the set of <code>{&#8220;111&#8221;}</code></p></li><li><p>Does thread &#8220;222&#8221; map to a unified thread? &#10060; <br>Does rfc &#8220;1&#8221; map to a unified thread? 
&#9989; = &#8220;ut_a&#8221;<br>Assign each to the existing unified thread value &#8220;ut_a&#8221; and add &#8220;222&#8221; to the &#8220;ut_a&#8221; set, which now contains <code>{&#8220;111&#8221;, &#8220;222&#8221;}</code></p></li></ol></li><li><p>Timeline 2</p><ol><li><p>Does thread &#8220;222&#8221; map to a unified thread? &#9989; = &#8220;ut_a&#8221; <br>Does rfc &#8220;2&#8221; map to a unified thread? &#10060; <br>Assign rfc &#8220;2&#8221; to the existing unified thread value &#8220;ut_a&#8221;</p></li><li><p>Does thread &#8220;333&#8221; map to a unified thread? &#10060; <br>Does rfc &#8220;2&#8221; map to a unified thread? &#9989; = &#8220;ut_a&#8221;<br>Assign thread &#8220;333&#8221; to the existing unified thread value &#8220;ut_a&#8221; and add &#8220;333&#8221; to the &#8220;ut_a&#8221; set, which now contains <code>{&#8220;111&#8221;, &#8220;222&#8221;, &#8220;333&#8221;}</code></p></li></ol></li><li><p>Timeline 3</p><ol><li><p>Does thread &#8220;333&#8221; map to a unified thread? &#9989; = &#8220;ut_a&#8221; <br>Does rfc &#8220;3&#8221; map to a unified thread? &#10060; <br>Assign rfc &#8220;3&#8221; to the existing unified thread value &#8220;ut_a&#8221;</p></li><li><p>Does thread &#8220;444&#8221; map to a unified thread? &#10060; <br>Does rfc &#8220;3&#8221; map to a unified thread? &#9989; = &#8220;ut_a&#8221;<br>Assign thread &#8220;444&#8221; to the existing unified thread value &#8220;ut_a&#8221; and add &#8220;444&#8221; to the &#8220;ut_a&#8221; set, which now contains <code>{&#8220;111&#8221;, &#8220;222&#8221;, &#8220;333&#8221;, &#8220;444&#8221;}</code></p></li></ol></li><li><p>Timeline 4</p><ol><li><p>Does thread &#8220;444&#8221; map to a unified thread? &#9989; = &#8220;ut_a&#8221; <br>Does rfc &#8220;4&#8221; map to a unified thread? &#10060; <br>Assign rfc &#8220;4&#8221; to the existing unified thread value &#8220;ut_a&#8221;</p></li><li><p>Does thread &#8220;555&#8221; map to a unified thread? 
&#10060; <br>Does rfc &#8220;4&#8221; map to a unified thread? &#9989; = &#8220;ut_a&#8221;<br>Assign thread &#8220;555&#8221; to the existing unified thread value &#8220;ut_a&#8221; and add &#8220;555&#8221; to the &#8220;ut_a&#8221; set, which now contains <code>{&#8220;111&#8221;, &#8220;222&#8221;, &#8220;333&#8221;, &#8220;444&#8221;, &#8220;555&#8221;}</code></p></li></ol></li></ol><p>Let&#8217;s compare this to how things would&#8217;ve worked with the breadth-first SQL query. We would start with thread &#8220;111&#8221; and find thread &#8220;222&#8221;. We repeat with thread &#8220;222&#8221; and find &#8220;333&#8221;, and so on until we eventually find &#8220;555&#8221; and a subsequent query doesn&#8217;t return any new threads.</p><p>But once we&#8217;ve written the data in the new format, let&#8217;s consider how we look up the information. We start with thread &#8220;111&#8221; and look up the corresponding unified thread, finding &#8220;ut_a&#8221;. We then look up &#8220;ut_a&#8221; and find the set <code>{&#8220;111&#8221;, &#8220;222&#8221;, &#8220;333&#8221;, &#8220;444&#8221;, &#8220;555&#8221;}</code>. And that&#8217;s it. 
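</p><p>As a rough sketch of the write and read paths walked through above, the three mappings can be modelled in memory with plain Kotlin maps. The class name <code>UnifiedThreads</code>, its method names, and the <code>ut_a</code>-style id generator are assumptions made for illustration, not the production implementation, and the sketch deliberately ignores the case where a thread and a message already point at different unified threads:</p><pre><code>// Hypothetical in-memory model of the three mappings; all names are illustrative.
class UnifiedThreads {
    private val threadToUt = mutableMapOf&lt;String, String&gt;()               // threadId -&gt; ut
    private val rfcToUt = mutableMapOf&lt;String, String&gt;()                  // messageId -&gt; ut
    private val utToThreads = mutableMapOf&lt;String, MutableSet&lt;String&gt;&gt;()  // ut -&gt; {threadId}
    private var nextId = 0

    // Record that a message with the given Message-ID (rfc) was seen in a user's thread.
    fun process(threadId: String, rfc: String): String {
        // Reuse the unified thread known for either key, otherwise mint a new one.
        // If both keys are known but disagree, this sketch simply keeps the thread's value.
        val ut = threadToUt[threadId] ?: rfcToUt[rfc] ?: "ut_${'a' + nextId++}"
        threadToUt.putIfAbsent(threadId, ut)
        rfcToUt.putIfAbsent(rfc, ut)
        utToThreads.getOrPut(ut) { mutableSetOf() }.add(threadId)
        return ut
    }

    // Two lookups: threadId -&gt; ut, then ut -&gt; set of threadIds.
    fun threadsFor(threadId: String): Set&lt;String&gt; =
        threadToUt[threadId]?.let { utToThreads[it] } ?: emptySet()
}

fun main() {
    val model = UnifiedThreads()
    // (threadId, rfc) pairs in timeline order, following the forwarding scenario
    listOf(
        "111" to "1", "222" to "1", "222" to "2", "333" to "2",
        "333" to "3", "444" to "3", "444" to "4", "555" to "4",
    ).forEach { (thread, rfc) -&gt; model.process(thread, rfc) }

    println(model.threadsFor("111")) // prints [111, 222, 333, 444, 555]
}</code></pre><p>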
We&#8217;ve turned a recursive breadth-first SQL query into two trivial lookups.</p><h4>Handling all cases</h4><p>Remember the earlier example when we processed Alice&#8217;s email and Charlie&#8217;s email and each thread is distinct, but once Bob joins the team his emails unify the two?</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LPme!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a7fe9ad-386b-4338-bfb3-e55b1e546002_325x188.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LPme!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a7fe9ad-386b-4338-bfb3-e55b1e546002_325x188.png 424w, https://substackcdn.com/image/fetch/$s_!LPme!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a7fe9ad-386b-4338-bfb3-e55b1e546002_325x188.png 848w, https://substackcdn.com/image/fetch/$s_!LPme!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a7fe9ad-386b-4338-bfb3-e55b1e546002_325x188.png 1272w, https://substackcdn.com/image/fetch/$s_!LPme!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a7fe9ad-386b-4338-bfb3-e55b1e546002_325x188.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LPme!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a7fe9ad-386b-4338-bfb3-e55b1e546002_325x188.png" width="325" height="188" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6a7fe9ad-386b-4338-bfb3-e55b1e546002_325x188.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:188,&quot;width&quot;:325,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:7378,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://engineering.streak.com/i/147685966?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a7fe9ad-386b-4338-bfb3-e55b1e546002_325x188.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LPme!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a7fe9ad-386b-4338-bfb3-e55b1e546002_325x188.png 424w, https://substackcdn.com/image/fetch/$s_!LPme!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a7fe9ad-386b-4338-bfb3-e55b1e546002_325x188.png 848w, https://substackcdn.com/image/fetch/$s_!LPme!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a7fe9ad-386b-4338-bfb3-e55b1e546002_325x188.png 1272w, https://substackcdn.com/image/fetch/$s_!LPme!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a7fe9ad-386b-4338-bfb3-e55b1e546002_325x188.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>If we processed this using the above timeline example, Alice&#8217;s emails would be &#8220;ut_a&#8221; and Charlie&#8217;s emails would be &#8220;ut_b&#8221; since they have no emails in 
common. But when Bob joins, we now have a conflict: message 3 in Bob&#8217;s thread maps to &#8220;ut_a&#8221; and message 5 in Bob&#8217;s thread maps to &#8220;ut_b&#8221;. How do we handle this?</p><p>It&#8217;s simple: we store unified thread merges as well. This introduces a new relationship:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;merges = \\{ (ut1, ut2) \\}&quot;,&quot;id&quot;:&quot;LOFMZQMAJV&quot;}" data-component-name="LatexBlockToDOM"></div><p>When the unified thread for a user&#8217;s email thread and the unified thread for one of the messageIds in that thread are different, we record a merge with the unordered pair of unified threads being merged. In the Alice, Bob, and Charlie example that would result in merges containing <code>{ (&#8220;ut_a&#8221;, &#8220;ut_b&#8221;) }</code>.</p><p>This adds one more step to finding all threads given a starting thread:</p><ol><li><p>look up the starting threadId to find its unified thread</p></li><li><p>look up any merges for the unified thread; if there are any, recurse to find further merges</p></li><li><p>from the set of all unified threads identified, look up the threadIds</p></li></ol><p>While step 2 is recursive, in practice merging threads happens rarely and it&#8217;s even rarer that there are multiple merges. To put this in numbers, the storage needed to represent the mapping of threadId &#8594; ut, messageId &#8594; ut, and ut &#8594; {threadId} takes up 9.6 TB. The storage needed to represent all unified thread merges is only 1.4 GB, almost 4 orders of magnitude less data.</p><h3>Implementing the data storage layer</h3><p>One requirement for making this algorithm work is that when looking up and storing data, we need strong consistency guarantees. My first thought was to use Redis, since it&#8217;s single-threaded and implementing this with Lua scripting provides atomic changes. 
And, on a sufficiently large system, Redis can support 100K+ queries per second, comfortably above the peak rate of 30K/second we need to process. And should we need more capacity, <a href="https://www.dragonflydb.io/">Dragonfly</a> claims an order of magnitude improvement over that.</p><p>However, what makes Redis fast is that the entire dataset is held in memory, and some back-of-the-envelope math<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a> showed that the storage requirements would range from an estimated 9.2 TB in the typical case to over 37 TB in the worst case, which made this approach impractical.</p><h4>Scylla</h4><p>The next contender was <a href="https://www.scylladb.com/">Scylla</a>, a NoSQL system touted as being an improvement over Cassandra. While its storage model is eventually consistent, it also offers lightweight transactions (LWT), which provide the strong consistency we require.</p><p>Here&#8217;s what the schema looked like:</p><pre><code>-- 1:1 mapping of (gid, thread_id) -&gt; unified thread id
create table if not exists gid_thread_to_ut (
    gid             ascii,
    thread_id       ascii,
    ut_id           ascii,&#9;&#9;-- some "$gid:$thread_id"
    primary key ((gid, thread_id))&#9;-- composite partition key with both gid and thread_id
);

-- reverse lookup of (gid, thread_id) by ut_id
create index if not exists on gid_thread_to_ut (ut_id);

-- 1:1 mapping of rfc message id -&gt; unified thread id
create table if not exists rfc_to_ut (
    md5_rfc_message_id  ascii,
    ut_id               ascii,      -- some "$gid:$thread_id"
    primary key (md5_rfc_message_id)
);

-- reverse lookup of rfc message id by ut_id
create index if not exists on rfc_to_ut (ut_id);

-- n:n mapping, merging source unified thread -&gt; target
create table if not exists ut_merge (
    source_ut_id    ascii,      -- source unified thread merged from
    target_ut_id    ascii,      -- target unified thread merged to
    primary key (source_ut_id, target_ut_id)
);

-- reverse lookup of source unified thread by target unified thread
create index if not exists on ut_merge (target_ut_id);</code></pre><p>The data insertion from Kotlin was super simple:</p><pre><code>fun setUTForGidThread(gid: String, threadId: String, defaultUT: String): String {
        val query =
            SimpleStatement.builder("INSERT INTO gid_thread_to_ut (gid, thread_id, ut_id) VALUES (?, ?, ?) IF NOT EXISTS")
                .addPositionalValues(gid, threadId, defaultUT)
                .build()

        val resultSet = scyllaSession.execute(query)
        val row = resultSet.one()
        val result =
            if (row!!.getBoolean("[applied]")) {
                defaultUT
            } else {
                row.getString("ut_id")!!
            }

        return result
    }</code></pre><p>The nice thing about Scylla&#8217;s LWT inserts is that the &#8220;if not exists&#8221; will insert the value if and only if a value doesn&#8217;t already exist and, in any case, the result of the operation either indicates that the insert was applied, in which case we know that our <code>defaultUT</code> value was inserted, or it returns the existing record in which case we fetch the <code>ut_id</code> value.</p><p>Similar for messageId:</p><pre><code>fun setUTForMessageId(rfcMessageId: String, defaultUT: String): String {
        val md5RfcMessageId = Utils.md5Hash(rfcMessageId)

        val query =
            SimpleStatement.builder("INSERT INTO rfc_to_ut (md5_rfc_message_id, ut_id) VALUES (?, ?) IF NOT EXISTS")
                .addPositionalValues(md5RfcMessageId, defaultUT)
                .build()

        val resultSet = scyllaSession.execute(query)
        val row = resultSet.one()
        val result =
            if (row!!.getBoolean("[applied]")) {
                defaultUT
            } else {
                row.getString("ut_id")!!
            }

        return result
    }</code></pre><p>And finally to record a merge:</p><pre><code>fun setUTMerge(sourceUT: String, targetUT: String) {
        val querySource =
            SimpleStatement.builder("INSERT INTO ut_merge (source_ut_id, target_ut_id) VALUES (?, ?) IF NOT EXISTS")
                .addPositionalValues(sourceUT, targetUT)
                .build()
        scyllaSession.execute(querySource)
    }</code></pre><p>Whenever we process an email for a user, we know the user&#8217;s gid as well as the threadId. From this, we assume that if a unified thread doesn&#8217;t already exist, we&#8217;ll use the current thread, which we construct using a string template:</p><pre><code>val defaultUT = "$gid:$threadId"</code></pre><p>So the core algorithm, which I dubbed &#8220;threadweaver&#8221;, is super simple:</p><pre><code>val existingGidThreadUt = getUnifiedThreadForGidThreadId(gid, threadId)

when (existingGidThreadUt) {
    null -&gt; {
        // thread isn't yet associated with a unified thread
        val defaultUT = "$gid:$threadId"
        val utFromMessageId = setUTForMessageId(rfcMessageId, defaultUT)
        val utFromGidThread = setUTForGidThread(gid, threadId, utFromMessageId)
        if (utFromGidThread != utFromMessageId) {
            setUTMerge(utFromGidThread, utFromMessageId)
        }
    }
    else -&gt; {
        // thread already associated with a unified thread
        val utFromMessageId = setUTForMessageId(rfcMessageId, existingGidThreadUt)
        if (existingGidThreadUt != utFromMessageId) {
            setUTMerge(existingGidThreadUt, utFromMessageId)
        }        
    }
}</code></pre><p>This is trivial to perform on every incoming email we process. We either assign a unified thread, or retrieve the one that already exists for both threadId and messageId. If the two differ, record a merge. </p><p>This worked great and comprehensive tests showed that it was functioning correctly, but in practice we ran into a lot of problems with Scylla operationally. Since the only way to guarantee strong consistency is the use of LWT, this made maintenance problematic. While we never modified any data once inserted, part of maintaining a Scylla cluster involves compaction and repair. However, our exclusive use of LWT to the tune of thousands of writes per second appeared to hinder any attempts to perform maintenance. Compaction and repairs never appeared to complete successfully. What should have been simple and routine turned out to be impossible. There&#8217;s a comment<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a> on Hacker News which, unfortunately, echoes my experience.</p><p>Given these operational issues, we couldn&#8217;t trust our data to Scylla. A shame as the simplicity of &#8220;insert &#8230; if not exists&#8221; made for incredibly easy to understand writes that matched the algorithm beautifully.</p><h4>Bigtable</h4><p>We had used Google&#8217;s <a href="https://cloud.google.com/bigtable">Bigtable</a> extensively in the past. We moved off of it as the workload was more suited to a relational database, but when used for its strengths, it performed exceptionally well. So much so that it earned an unofficial nickname of &#8220;Honey Badger&#8221; because, like that famous YouTube video<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a>, it handled data effortlessly. Need to fetch thousands of rows? &#8220;Honey badger don&#8217;t care!&#8221; How about millions of rows? 
&#8220;Honey badger don&#8217;t care!&#8221; Do your rows have thousands of columns? You guessed it: &#8220;Honey badger don&#8217;t care!&#8221; Bigtable&#8217;s throughput is insanely fast.</p><p>Key to making this work is figuring out how to take advantage of strong consistency. Fortunately, by default a single Bigtable cluster provides strong consistency<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-6" href="#footnote-6" target="_self">6</a> on single row operations which is sufficient. To implement the equivalent of &#8220;insert &#8230; if not exists&#8221; for Bigtable, we can take advantage of <a href="https://cloud.google.com/bigtable/docs/samples/bigtable-writes-conditional">conditional writes</a> (or &#8220;mutations&#8221; in Bigtable&#8217;s parlance). </p><p>Here&#8217;s what the Kotlin code looks like:</p><pre><code>val filter =
    FILTERS.chain()
        .filter(FILTERS.family().exactMatch("gt"))
        .filter(FILTERS.qualifier().regex(".+"))
        .filter(FILTERS.value().regex(".+"))

// only perform the mutation if the filter doesn't match
// (ie: no matching qualifiers/values)
val mutation = Mutation.create()
    .setCell("gt", defaultUT, timestamp, "_")
val conditionalMutation = ConditionalRowMutation
    .create("gid-thread-to-ut", rowKey)
    .condition(filter)
    .otherwise(mutation)

client.checkAndMutateRow(conditionalMutation)</code></pre><p>I&#8217;ll translate what this is doing:</p><ul><li><p>a conditional mutation evaluates the filter provided by <code>.condition</code> which, if it matches, runs the mutation provided by <code>.then</code> (not used here) or, if it doesn&#8217;t, runs the mutation provided by <code>.otherwise</code></p></li><li><p>the filter used by the condition checks to see if the &#8220;gt&#8221; column family (a distinct group of columns) contains a cell whose qualifier (column name) and value each contain at least one character</p></li><li><p>the <code>.otherwise</code> mutation operates on the table &#8220;gid-thread-to-ut&#8221; for the provided <code>rowKey</code>, writing a cell in the &#8220;gt&#8221; family whose qualifier is the provided <code>defaultUT</code> value (the cell value itself is just the placeholder <code>"_"</code>)</p></li></ul><p>The result is that nothing happens if there&#8217;s any value already, since an existing value means that this thread is already mapped to a unified thread and, once mapped, we never change it. But if there&#8217;s no value, then the mutation executes and we&#8217;ve atomically set the mapping from that thread to the unified thread. Because single row operations have strong consistency, there&#8217;s no possibility of a race condition here.</p><p>The same type of operation is performed for conditionally setting the messageId to unified thread mapping as well as any unified thread merges.</p><p>The one difference from Scylla is that this operation doesn&#8217;t return whether the mutation took place or if there was already an existing value. So we simply query for the rowKey again to find the unified thread, whether it was the one we just inserted (mutated) or whether it was already there. But those are implementation details; the core threadweaver algorithm remains the same:</p><pre><code>val existingGidThreadUt = getUnifiedThreadForGidThreadId(gid, threadId)

when (existingGidThreadUt) {
    null -&gt; {
        // thread isn't yet associated with a unified thread
        val defaultUT = "$gid:$threadId"
        val utFromMessageId = setUTForMessageId(rfcMessageId, defaultUT)
        val utFromGidThread = setUTForGidThread(gid, threadId, utFromMessageId)
        if (utFromGidThread != utFromMessageId) {
            setUTMerge(utFromGidThread, utFromMessageId)
        }
    }
    else -&gt; {
        // thread already associated with a unified thread
        val utFromMessageId = setUTForMessageId(rfcMessageId, existingGidThreadUt)
        if (existingGidThreadUt != utFromMessageId) {
            setUTMerge(existingGidThreadUt, utFromMessageId)
        }        
    }
}</code></pre><h4>Performance</h4><p>The honey badger of databases doesn&#8217;t disappoint. We routinely read anywhere from 50,000 rows per second to almost 250,000 rows per second representing roughly 64 MB/s to 256 MB/s of data transmitted.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NYGv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa17dbaba-f4ed-4dbf-8178-39279eedd6b8_759x278.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NYGv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa17dbaba-f4ed-4dbf-8178-39279eedd6b8_759x278.png 424w, https://substackcdn.com/image/fetch/$s_!NYGv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa17dbaba-f4ed-4dbf-8178-39279eedd6b8_759x278.png 848w, https://substackcdn.com/image/fetch/$s_!NYGv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa17dbaba-f4ed-4dbf-8178-39279eedd6b8_759x278.png 1272w, https://substackcdn.com/image/fetch/$s_!NYGv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa17dbaba-f4ed-4dbf-8178-39279eedd6b8_759x278.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NYGv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa17dbaba-f4ed-4dbf-8178-39279eedd6b8_759x278.png" width="759" height="278" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a17dbaba-f4ed-4dbf-8178-39279eedd6b8_759x278.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c3ad8252-5a69-4758-99c4-a9ec47f10f2e_759x278.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:278,&quot;width&quot;:759,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:33060,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://engineering.streak.com/i/147685966?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3ad8252-5a69-4758-99c4-a9ec47f10f2e_759x278.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NYGv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa17dbaba-f4ed-4dbf-8178-39279eedd6b8_759x278.png 424w, https://substackcdn.com/image/fetch/$s_!NYGv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa17dbaba-f4ed-4dbf-8178-39279eedd6b8_759x278.png 848w, https://substackcdn.com/image/fetch/$s_!NYGv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa17dbaba-f4ed-4dbf-8178-39279eedd6b8_759x278.png 1272w, https://substackcdn.com/image/fetch/$s_!NYGv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa17dbaba-f4ed-4dbf-8178-39279eedd6b8_759x278.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Bigtable read performance</figcaption></figure></div><p>Similarly, Bigtable easily handles writes:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5JfF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4dd29fa-d91c-4c73-91cd-4d5dd5868442_762x275.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!5JfF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4dd29fa-d91c-4c73-91cd-4d5dd5868442_762x275.png 424w, https://substackcdn.com/image/fetch/$s_!5JfF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4dd29fa-d91c-4c73-91cd-4d5dd5868442_762x275.png 848w, https://substackcdn.com/image/fetch/$s_!5JfF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4dd29fa-d91c-4c73-91cd-4d5dd5868442_762x275.png 1272w, https://substackcdn.com/image/fetch/$s_!5JfF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4dd29fa-d91c-4c73-91cd-4d5dd5868442_762x275.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5JfF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4dd29fa-d91c-4c73-91cd-4d5dd5868442_762x275.png" width="762" height="275" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d4dd29fa-d91c-4c73-91cd-4d5dd5868442_762x275.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ea4d52e6-acc5-4563-a3df-e159d9b07026_762x275.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:275,&quot;width&quot;:762,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:35796,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://engineering.streak.com/i/147685966?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea4d52e6-acc5-4563-a3df-e159d9b07026_762x275.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5JfF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4dd29fa-d91c-4c73-91cd-4d5dd5868442_762x275.png 424w, https://substackcdn.com/image/fetch/$s_!5JfF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4dd29fa-d91c-4c73-91cd-4d5dd5868442_762x275.png 848w, https://substackcdn.com/image/fetch/$s_!5JfF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4dd29fa-d91c-4c73-91cd-4d5dd5868442_762x275.png 1272w, https://substackcdn.com/image/fetch/$s_!5JfF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4dd29fa-d91c-4c73-91cd-4d5dd5868442_762x275.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Bigtable write performance</figcaption></figure></div><p>Bigtable also simplifies management. While Scylla would&#8217;ve required continuous maintenance with compacting and repairing data plus working around things like VM outages, Bigtable is a managed service. We set the backup schedule, specify how much storage we want per node, set the target CPU% to trigger autoscaling, and Google handles everything. It&#8217;s a completely hands-off experience.</p><h3>Every email, everywhere, all at once</h3><p>Astute readers may have noticed that this unification algorithm doesn&#8217;t have any concept of what team the email we are processing is for. 
The only inputs under consideration are what thread a message is in and its Message-ID header value. It will happily unify threads for anyone who uses Streak if they were sent the same email having the same Message-ID header value. The result is that every email we process is unified across Streak&#8217;s entire userbase. And the simple write-time algorithm has constant O(1) performance.</p><p>We restrict emails by team trivially by filtering the results by the list of <code>gid</code> values for our team members. We resolve a starting thread to a set of one or more unified threads, each of which is a self-contained row. It&#8217;s then a simple matter to filter the cells<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-7" href="#footnote-7" target="_self">7</a> in the row to those that match our list of <code>gid</code> values.</p><p>Once retrieved, we then need to apply individual user permissions based on what their sharing settings are. To do this, we no longer use JGraph as that was overkill, but rather apply the permissions directly on top of the efficiently retrieved data.</p><h3>User reactions</h3><p>This log scale graph shows overall performance when fetching the timeline for a box along with all unified threads, showing the same performance improvement as the graphs at the top of the post. 
The very right side of the graph is from when the change was rolled out to all users:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!b3qC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dfbcb0d-1366-4b4b-8ac8-456c1d9fb8dc_1373x625.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!b3qC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dfbcb0d-1366-4b4b-8ac8-456c1d9fb8dc_1373x625.png 424w, https://substackcdn.com/image/fetch/$s_!b3qC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dfbcb0d-1366-4b4b-8ac8-456c1d9fb8dc_1373x625.png 848w, https://substackcdn.com/image/fetch/$s_!b3qC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dfbcb0d-1366-4b4b-8ac8-456c1d9fb8dc_1373x625.png 1272w, https://substackcdn.com/image/fetch/$s_!b3qC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dfbcb0d-1366-4b4b-8ac8-456c1d9fb8dc_1373x625.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!b3qC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dfbcb0d-1366-4b4b-8ac8-456c1d9fb8dc_1373x625.png" width="1373" height="625" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2dfbcb0d-1366-4b4b-8ac8-456c1d9fb8dc_1373x625.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/64b077bc-12b7-4736-bf8f-ae644265ac90_1373x625.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:625,&quot;width&quot;:1373,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:54993,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://engineering.streak.com/i/147685966?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64b077bc-12b7-4736-bf8f-ae644265ac90_1373x625.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!b3qC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dfbcb0d-1366-4b4b-8ac8-456c1d9fb8dc_1373x625.png 424w, https://substackcdn.com/image/fetch/$s_!b3qC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dfbcb0d-1366-4b4b-8ac8-456c1d9fb8dc_1373x625.png 848w, https://substackcdn.com/image/fetch/$s_!b3qC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dfbcb0d-1366-4b4b-8ac8-456c1d9fb8dc_1373x625.png 1272w, https://substackcdn.com/image/fetch/$s_!b3qC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dfbcb0d-1366-4b4b-8ac8-456c1d9fb8dc_1373x625.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Timeline fetching right after launch</figcaption></figure></div><p>As mentioned at the beginning, this went from &#8220;I&#8217;m gonna grab a coffee&#8221; levels of performance to &#8220;I blinked and my data loaded&#8221;. Some happy expletives were shared. 
In addition to instant timeline retrieval, multiple other areas of Streak saw performance gains:</p><ul><li><p>when viewing 50 or 100 threads in your Gmail inbox, identifying which of those are associated with boxed threads is much, much faster</p></li><li><p>processing autoboxed emails is not only faster but also fully correct, even for the edge cases where the user who boxed a thread no longer participates</p></li><li><p>quicker retrieval of email threads/envelopes your team has shared with you</p></li></ul><h3>Infrastructure changes</h3><p>Because of the performance gains from the threadweaver algorithm, we no longer needed as many VMs for our TiDB cluster, so we scaled that down as both our storage requirements and query volume dropped dramatically. Additionally, our API servers were no longer tied up waiting for the results of unification, so we ended up needing fewer pods for our API deployment in Kubernetes<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-8" href="#footnote-8" target="_self">8</a>. In fact, we used to maintain a separate &#8220;threads&#8221; API deployment because of the slow performance so that a backlog of thread-related requests wouldn&#8217;t impact our regular API traffic. But with threadweaver, performance is little different from any other endpoint, so those API endpoints are now handled by our main deployment.</p><h3>Engineering at Streak</h3><p>We work on many interesting challenges affecting billions of requests daily involving many terabytes of data. 
For more information, visit <a href="https://www.streak.com/careers">https://www.streak.com/careers</a></p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>In practice, Message-ID values aren&#8217;t always unique. There are exceptions from systems that aren&#8217;t RFC compliant where every email the system generates has some fixed <code>Message-ID</code> header, and even some systems which don&#8217;t generate a <code>Message-ID</code> header at all. Exceptions aside, the header value should look like <code>&lt;someuniqueid@domain&gt;</code>. 
You can read more about recommendations <a href="https://www.jwz.org/doc/mid.html">here</a>.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>We interchangeably refer to the Message-ID header value as &#8220;rfc&#8221; since this is from RFC 822: <a href="https://www.rfc-editor.org/rfc/rfc822.html#section-4.6.1">https://www.rfc-editor.org/rfc/rfc822.html#section-4.6.1</a> and Gmail also has its own internal &#8220;messageId&#8221; value, which is unused for the purpose of unification.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>Based on the Redis docs on memory footprint: <a href="https://redis.io/docs/latest/develop/get-started/faq/#whats-the-redis-memory-footprint">https://redis.io/docs/get-started/faq/</a></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p><a href="https://news.ycombinator.com/item?id=25523851">https://news.ycombinator.com/item?id=25523851</a></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-5" href="#footnote-anchor-5" class="footnote-number" contenteditable="false" target="_self">5</a><div class="footnote-content"><p><a href="https://youtu.be/4r7wHMg5Yjg">Honey Badger on YouTube</a></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-6" href="#footnote-anchor-6" class="footnote-number" contenteditable="false" target="_self">6</a><div class="footnote-content"><p>See <a 
href="https://cloud.google.com/bigtable/docs/overview#consistency">https://cloud.google.com/bigtable/docs/overview#consistency</a></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-7" href="#footnote-anchor-7" class="footnote-number" contenteditable="false" target="_self">7</a><div class="footnote-content"><p>I mentioned earlier how some Message-ID values aren&#8217;t unique and get reused for every email. We handle this by limiting the number of user threads fetched for a given unified thread. If it&#8217;s more than 10,000 then we simply ignore it as those emails are likely to be spam.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-8" href="#footnote-anchor-8" class="footnote-number" contenteditable="false" target="_self">8</a><div class="footnote-content"><p>These autoscale using a Horizontal Pod Autoscaler. See <strong><a href="https://engineering.streak.com/p/implementing-bluegreen-deployments">Implementing blue/green deployments using Kubernetes and Envoy</a></strong> for more info.</p></div></div>]]></content:encoded></item><item><title><![CDATA[Implementing blue/green deployments using Kubernetes and Envoy]]></title><description><![CDATA[Maximizing system uptime and reducing risk via advanced deployment strategies]]></description><link>https://engineering.streak.com/p/implementing-bluegreen-deployments</link><guid isPermaLink="false">https://engineering.streak.com/p/implementing-bluegreen-deployments</guid><dc:creator><![CDATA[Blake Kadatz]]></dc:creator><pubDate>Tue, 13 Aug 2024 20:19:49 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ECHz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb502390a-739e-4025-8dbf-3c7f0e3a37c1_1000x334.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Streak&#8217;s infrastructure started out on Google&#8217;s AppEngine and served us well for many 
years. Before I joined the company, Streak had outgrown what AppEngine could provide and moved to the full Google Cloud Platform (GCP), with our infrastructure hosted on GKE (Google Kubernetes Engine).</p><p>Deployment was fairly simple: we push our main branch to GitHub, CircleCI picks up the change, builds and tests everything, then pushes the Docker image to GCR (Google Container Registry). Our <code>deployment.yaml</code> file had some lines like this:</p><pre><code>      containers:
        - image: gcr.io/ourproject/imagename:abcd123</code></pre><p>Here, <code>abcd123</code> is the short commit hash for the new image.</p><h3>The old way</h3><p>To deploy to production, a developer would edit this file, change the commit hash to the latest, and run:</p><pre><code>kubectl apply -f deployment.yaml</code></pre><p>This triggered a process that was largely hands-off. Kubernetes picks up the change and starts transforming the deployment into the new desired state. At the time, the deployment ran at a fixed size of 150 pods. Because the commit hash has changed, Kubernetes applies what&#8217;s known as a &#8220;rolling update&#8221;, which is governed by a few strategy settings. At the time we had:</p><pre><code>  strategy:
    rollingUpdate:
      maxSurge: 10
      maxUnavailable: 10</code></pre><p>Since we&#8217;re not increasing the size of the deployment, but rather replacing pods in the fixed-size deployment, the <code>maxUnavailable</code> setting is most relevant. Ten pods at a time begin the termination process. This isn&#8217;t instant; our deployment both serves API requests, which complete as quickly as possible, and processes asynchronous tasks. The vast majority of those tasks are handled as quickly as API requests, but a small percentage run longer when they need to process large volumes of data. We want to give those time to complete, but still force a shutdown if it takes longer than necessary. That is handled by another setting:</p><pre><code>      terminationGracePeriodSeconds: 700</code></pre><p>This allows <a href="https://www.eclipse.org/jetty/">Jetty</a> to gracefully shut down if it&#8217;s able; otherwise, Kubernetes forcefully terminates the pod after 700 seconds (just under 12 minutes), which is more than enough for even the longest tasks<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a>. Our API sits behind Google Cloud Load Balancer, which performs SSL termination and routes traffic to the service associated with the deployment. The service keeps track of which pods are ready via the readiness probe, so live traffic is only routed to pods which are online and serving.</p><p>The end result was 150 pods shutting down 10 at a time. In the absolute worst case, each batch of 10 takes the full 700 seconds before being replaced with 10 new pods, so the 15 batches add up to 10,500 seconds and a new deploy could take up to 3 hours to go out. In practice, the worst case never occurred and most deploys were in the 20 to 40 minute range.</p><p>Typically this wasn&#8217;t an issue. We could still deploy many times throughout the day. 
The downside is that if you had a critical bug that you needed to fix, it would take much longer for it to reach all pods unless you resorted to force terminating the pods, which should be a last resort as you are also terminating active user requests and background tasks.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ECHz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb502390a-739e-4025-8dbf-3c7f0e3a37c1_1000x334.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ECHz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb502390a-739e-4025-8dbf-3c7f0e3a37c1_1000x334.jpeg 424w, https://substackcdn.com/image/fetch/$s_!ECHz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb502390a-739e-4025-8dbf-3c7f0e3a37c1_1000x334.jpeg 848w, https://substackcdn.com/image/fetch/$s_!ECHz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb502390a-739e-4025-8dbf-3c7f0e3a37c1_1000x334.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!ECHz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb502390a-739e-4025-8dbf-3c7f0e3a37c1_1000x334.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ECHz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb502390a-739e-4025-8dbf-3c7f0e3a37c1_1000x334.jpeg" width="1000" height="334" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b502390a-739e-4025-8dbf-3c7f0e3a37c1_1000x334.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:334,&quot;width&quot;:1000,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:45691,&quot;alt&quot;:&quot;blue and green server racks&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="blue and green server racks" title="blue and green server racks" srcset="https://substackcdn.com/image/fetch/$s_!ECHz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb502390a-739e-4025-8dbf-3c7f0e3a37c1_1000x334.jpeg 424w, https://substackcdn.com/image/fetch/$s_!ECHz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb502390a-739e-4025-8dbf-3c7f0e3a37c1_1000x334.jpeg 848w, https://substackcdn.com/image/fetch/$s_!ECHz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb502390a-739e-4025-8dbf-3c7f0e3a37c1_1000x334.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!ECHz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb502390a-739e-4025-8dbf-3c7f0e3a37c1_1000x334.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h3>Blue-green deployments</h3><p>The concept behind the blue-green deployment strategy is that you have your existing deployment (the &#8220;blue&#8221; one) and you want to release a new deployment (the &#8220;green&#8221; one). Rather than doing a replacement of your running code, you can temporarily run each one in parallel. 
This allows you to:</p><ul><li><p>quickly scale up the new deployment without being dependent on simultaneously scaling down the old deployment</p></li><li><p>keep both around for a while in case there&#8217;s a critical issue in the new deployment so you can swap back</p></li><li><p>increase flexibility in what you deploy: once you have the infrastructure for splitting traffic between two deployments, you can choose to split traffic in additional ways</p></li></ul><p>The last point on splitting traffic is where you can do some interesting things.</p><h3>Design goals</h3><p>We decided to go with <a href="https://www.envoyproxy.io/">Envoy</a> sitting between Google Cloud Load Balancer and our service. Envoy is a proxy originally created by Lyft and counts among its users AWS, Google, Netflix, Stripe, and many others. It&#8217;s what allows us to control routing of traffic at a more granular level than what we had before. We had a number of goals in the new design:</p><ul><li><p>Speed up the deploy process from approximately 30 minutes to as fast as possible</p></li><li><p>Have the ability to quickly roll back a bad deployment</p></li><li><p>Support the use of canary deployments. Like the proverbial canary in the coal mine, a canary deployment allows you to test changes on a small subset of users before rolling out to everyone. There were a few types we wanted:</p><ul><li><p>Deploy to some percentage of our traffic, for example 1% or 10%. 
This lets us do things like make performance tweaks and measure how the new code performs in production.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a></p></li><li><p>Deploy to specific customers by their domain name.</p></li><li><p>Deploy to specific users by email address.</p></li><li><p>Deploy standalone canaries for independent testing.</p></li></ul></li><li><p>All the above must work correctly with no dropped requests</p></li></ul><p>Additionally, it was a good opportunity to explore autoscaling. Rather than a fixed number of nodes having a fixed cost, we&#8217;d like the deployment to scale automatically based on load. During peak hours, the deployment should scale to as many pods as is required to handle traffic. For off-peak hours, there&#8217;s no sense paying for pods which are sitting idle, so the deployment should scale down to a reasonable minimum.</p><p>The scaling targets are based on two metrics:</p><ol><li><p>CPU utilization. Most of our workload is I/O intensive, involving fetching from some cache, datastore, or remote API, then doing permission checks, performing minimal transformations of the data, then responding with the data. Our servers don&#8217;t deal with HTTPS encryption and decryption (that&#8217;s handled for us by Google Cloud Load Balancer), but for the occasional request that does consume significant CPU resources we want to ensure we have some headroom.</p></li><li><p>Thread utilization. We&#8217;ve configured each pod in our deployment with 40 threads for handling requests. When all 40 threads are busy, additional requests get added to a queue until a thread becomes free. Each request may spawn its own threads to do parallel I/O for example, but those are independent.</p></li></ol><p>We set a target of 50% for each. In practice, we almost never see high CPU utilization for the reasons mentioned above. 
The goal with thread utilization is that by running at 50% capacity, we are able to quickly handle spikes up to double our current request volume before the new capacity goes online.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a></p><p>For some quick back-of-the-envelope math, if we&#8217;re running a fixed 150 pods with each pod having 40 threads for responding to requests, that&#8217;s 6,000 simultaneous threads. If a typical API request takes 100ms to complete, each thread can serve 10 requests per second, so we can handle roughly 60,000 requests per second before we would need to scale up.</p><h3>Horizontal Pod Autoscaler</h3><p>As with non-cloud server architecture, you can scale any given service both horizontally and vertically. Vertical scaling means that if you&#8217;ve hit capacity while running on a fixed number of servers with 2 CPUs and 8GB of RAM, you can upgrade the specs of that same number of servers to 4 CPUs and 16GB of RAM. If that&#8217;s not enough, you move to 8 CPUs and 32GB, and so on.</p><p>Horizontal scaling is where you hit capacity and you just keep adding more servers with the same specs as before. Kubernetes does this via the Horizontal Pod Autoscaler (HPA). You configure it with a YAML file describing how you want Kubernetes to scale a given resource, such as a deployment. The important bits are:</p><ul><li><p><code>minReplicas</code> &#8212; for a system serving up production traffic, you never want to scale down to a single replica, so this is where you set your reasonable minimum that is capable of handling the transition from off-peak to regular traffic.</p></li><li><p><code>maxReplicas</code> &#8212; on the opposite end, you want to control your costs and be aware of any issues that are causing unexpectedly high utilization. 
Setting the maximum number ensures that you don&#8217;t scale beyond this number.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a></p></li><li><p><code>scaleTargetRef</code> &#8212; this is where you specify what thing the HPA needs to scale up or down. In our case, this is <code>kind: Deployment</code> and <code>name: &lt;deployment name&gt;</code></p></li><li><p><code>metrics</code> &#8212; here you specify your metric(s) that Kubernetes monitors as well as the target type and value.</p></li></ul><p>For our thread metric, we set the target at <code>500m</code>, which can be read as &#8220;500 milli&#8221; and is equivalent to a value of <code>0.5</code> or 50%. The official Kubernetes site <a href="https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/">has a great writeup</a> of how it works, the scaling algorithm, and more.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CsDR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52432ac-c6bb-4481-bdce-5629bec3c99a_2173x703.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CsDR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52432ac-c6bb-4481-bdce-5629bec3c99a_2173x703.png 424w, https://substackcdn.com/image/fetch/$s_!CsDR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52432ac-c6bb-4481-bdce-5629bec3c99a_2173x703.png 848w, 
https://substackcdn.com/image/fetch/$s_!CsDR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52432ac-c6bb-4481-bdce-5629bec3c99a_2173x703.png 1272w, https://substackcdn.com/image/fetch/$s_!CsDR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52432ac-c6bb-4481-bdce-5629bec3c99a_2173x703.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CsDR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52432ac-c6bb-4481-bdce-5629bec3c99a_2173x703.png" width="1456" height="471" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f52432ac-c6bb-4481-bdce-5629bec3c99a_2173x703.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:471,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:49908,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CsDR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52432ac-c6bb-4481-bdce-5629bec3c99a_2173x703.png 424w, https://substackcdn.com/image/fetch/$s_!CsDR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52432ac-c6bb-4481-bdce-5629bec3c99a_2173x703.png 848w, 
https://substackcdn.com/image/fetch/$s_!CsDR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52432ac-c6bb-4481-bdce-5629bec3c99a_2173x703.png 1272w, https://substackcdn.com/image/fetch/$s_!CsDR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52432ac-c6bb-4481-bdce-5629bec3c99a_2173x703.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h3>Setting up Envoy</h3><p>The simple description of how Envoy works is that it operates on clusters and routes. 
Think of a cluster as the destination you want to send requests to. A route then specifies which requests go to which cluster. In our case, the cluster identifies that we want to send API traffic to our Kubernetes deployment and the route identifies which paths on the API to send. The simplest route sends everything to a single cluster:</p><pre><code>    routes:
    - match:
        prefix: "/"
      route:
        cluster: api</code></pre><p>However, for blue/green deployments we have the concept of a primary API and a secondary API. The primary represents the current, active (blue) deployment and the secondary is then the new (green) deployment. Envoy has a concept of a weighted cluster which looks like this:</p><pre><code>    routes:
    - match:
        prefix: "/"
      route:
        weighted_clusters:
          clusters:
            - name: secondaryapi
              weight: 10
            - name: primaryapi
              weight: 90
</code></pre><p>This will direct 10% of traffic to the new deployment while directing 90% of traffic to the old deployment. As the new deployment comes online and more and more pods are ready, we can gradually increase the weight of the secondary cluster and reduce the weight of the primary cluster.</p><p>But how do we know when the new deployment is ready to serve traffic? And how do we correctly update Envoy&#8217;s configuration?</p><h4>Weighing traffic</h4><p>Kubernetes provides a nice way to get the current status of a deployment. Let&#8217;s say we just deployed our API with a target of 80 replicas. Running:</p><pre><code>kubectl get deployment api-deployment -o yaml</code></pre><p>Returns output which includes some status information:</p><pre><code>status:
  availableReplicas: 19</code></pre><p>Given that our example target is 80 replicas, we know that 19/80 or 23.75% are ready to serve traffic. We round this down to the nearest integer, 23, and use that as the weight for the secondary API. Since Envoy&#8217;s weighted clusters must add to 100, the weight of the primary API is then 77. We then update our Envoy configuration with these values and the route looks like:</p><pre><code>    routes:
    - match:
        prefix: "/"
      route:
        weighted_clusters:
          clusters:
            - name: secondaryapi
              weight: 23
            - name: primaryapi
              weight: 77
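</code></pre><p>The weight arithmetic above can be sketched in a few lines of shell. This is a minimal illustration rather than the actual deploy script: the variable names are made up for the example, and the replica counts are hard-coded where a real script would read <code>status.availableReplicas</code> from <code>kubectl</code>.</p><pre><code># Hypothetical sketch: derive Envoy weights from replica counts.
available=19   # status.availableReplicas of the new (green) deployment
target=80      # desired replica count of the new deployment

# Shell integer division truncates, which rounds the percentage down.
secondary=$(( available * 100 / target ))
primary=$(( 100 - secondary ))

echo "secondaryapi weight: ${secondary}"
echo "primaryapi weight: ${primary}"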
</code></pre><p>Envoy supports updating its runtime via a symbolic link swap, which lets a file be replaced atomically. To support this, we store all our Envoy configuration files within a single Kubernetes ConfigMap, and it&#8217;s trivial to overwrite the ConfigMap with the new contents. To convert this into a link swap in the Envoy deployment, we use a sidecar container running <a href="https://github.com/mumoshu/crossover">Crossover</a>, which watches the ConfigMap for changes. When it detects one, it reads the contents of the ConfigMap, writes the new file contents to the filesystem, then performs a link swap on each file. Envoy listens for link swaps and then applies the changes to its runtime environment.</p><p>By checking the status of the new deployment in a loop every few seconds, our deploy script keeps updating the deployment percentage as new pods become ready to serve traffic, and Kubernetes and Envoy handle the rest. Once we reach 100%, we promote the secondary to primary, and the deploy is done.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a></p><p>This lets us quickly roll out a full deployment, where the end goal is the new deployment receiving 100% of the traffic. It also lets us deploy to a developer-specified percentage of traffic, in which case we update Envoy only until we reach that target.</p><h4>Routing by email or domain</h4><p>In addition, we can route requests based on the user&#8217;s email address or domain name. This is accomplished by changing how routes match. For normal requests, our UI appends <code>?email=user@example.com</code> to API queries. Additionally, requests that end up creating asynchronous tasks add an <code>X-Streak-Request-Email</code> header containing the email address. We can easily match both of these with the following match definitions:</p><pre><code>    - match:
        prefix: "/"
        query_parameters:
        - name: "email"
          string_match:
            safe_regex:
              regex: {{DOMAIN_OR_EMAIL_REGEX}}
      route:
        # (route definition omitted)
    - match:
        prefix: "/"
        headers:
        - name: "X-Streak-Request-Email"
          safe_regex_match:
            regex: {{DOMAIN_OR_EMAIL_REGEX}}
      route:
        # (route definition omitted)</code></pre><p>This is a template on which we perform variable substitution, replacing <code>{{DOMAIN_OR_EMAIL_REGEX}}</code> with the appropriate email address or domain regex.</p><h3>Rolling back a bad deploy</h3><p>Another of our design goals was to quickly roll back a bad deployment. Our deploy script supports a <code>swap</code> command for this, and implementing it is easy. We first check that the old deployment hasn&#8217;t scaled down too far and, if it hasn&#8217;t, we simply update our Envoy cluster definitions and swap the endpoints of <code>primaryapi</code> and <code>secondaryapi</code>. Each cluster&#8217;s definition looks something like this simplified example:</p><pre><code>resources:
- "@type": type.googleapis.com/envoy.config.cluster.v3.Cluster
  name: primaryapi
  type: STRICT_DNS
  dns_lookup_family: V4_ONLY
  lb_policy: LEAST_REQUEST
  load_assignment:
    cluster_name: primaryapi
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: {{PRIMARY_CLUSTERIP_SERVICE}}.default.svc.cluster.local</code></pre><p>Again, we do templated variable substitution here and update the Kubernetes ConfigMap. Crossover picks up the change and performs a link swap, Envoy picks up the swap, and we&#8217;ve swapped back in seconds.</p><h3>Deploy script</h3><p>We can perform deploys using a simple command-line script. Here is the current suite of options the deploy command supports:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!z5jS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e6a2cd1-f63b-4623-8be7-1e48a980eff9_657x852.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!z5jS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e6a2cd1-f63b-4623-8be7-1e48a980eff9_657x852.png 424w, https://substackcdn.com/image/fetch/$s_!z5jS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e6a2cd1-f63b-4623-8be7-1e48a980eff9_657x852.png 848w, https://substackcdn.com/image/fetch/$s_!z5jS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e6a2cd1-f63b-4623-8be7-1e48a980eff9_657x852.png 1272w, https://substackcdn.com/image/fetch/$s_!z5jS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e6a2cd1-f63b-4623-8be7-1e48a980eff9_657x852.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!z5jS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e6a2cd1-f63b-4623-8be7-1e48a980eff9_657x852.png" width="657" height="852" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1e6a2cd1-f63b-4623-8be7-1e48a980eff9_657x852.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:852,&quot;width&quot;:657,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:143168,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!z5jS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e6a2cd1-f63b-4623-8be7-1e48a980eff9_657x852.png 424w, https://substackcdn.com/image/fetch/$s_!z5jS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e6a2cd1-f63b-4623-8be7-1e48a980eff9_657x852.png 848w, https://substackcdn.com/image/fetch/$s_!z5jS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e6a2cd1-f63b-4623-8be7-1e48a980eff9_657x852.png 1272w, https://substackcdn.com/image/fetch/$s_!z5jS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e6a2cd1-f63b-4623-8be7-1e48a980eff9_657x852.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>In addition, we can also cancel, check the status of a deployment, or swap back to the previous deploy via other commands.</p><h3>Testing</h3><p>While this all sounds good in theory, with all these moving parts how can we verify that Google Cloud Load Balancer&#8217;s interaction with Envoy and its routing of traffic to a Kubernetes service, which then hits our API deployment is functioning correctly given that we&#8217;re also introducing Horizontal Pod Autoscaling into the mix?</p><h4>Enter Tsung</h4><p><a href="http://tsung.erlang-projects.org/">Tsung</a> is Erlang-based software designed for load testing. To use it, you provide a configuration file describing the traffic you want to send. 
In my testing, I used a simple config:</p><pre><code>&lt;tsung loglevel="info" dumptraffic="protocol" &gt;
  &lt;clients&gt;
    &lt;client host="localhost" use_controller_vm="true" maxusers="1000000" /&gt;
  &lt;/clients&gt;
  &lt;servers&gt;
    &lt;server host="api.streak.com" port="443" type="ssl" weight="100" /&gt;
  &lt;/servers&gt;
  &lt;load&gt;
    &lt;arrivalphase phase="1" duration="1" unit="minute"&gt;
      &lt;users arrivalrate="5" unit="second" /&gt;
    &lt;/arrivalphase&gt;
    &lt;arrivalphase phase="2" duration="1" unit="minute"&gt;
      &lt;users arrivalrate="10" unit="second" /&gt;
    &lt;/arrivalphase&gt;
    &lt;arrivalphase phase="3" duration="1" unit="minute"&gt;
      &lt;users arrivalrate="15" unit="second" /&gt;
    &lt;/arrivalphase&gt;
    &lt;arrivalphase phase="4" duration="1" unit="minute"&gt;
      &lt;users arrivalrate="20" unit="second" /&gt;
    &lt;/arrivalphase&gt;
    &lt;arrivalphase phase="5" duration="1" unit="minute"&gt;
      &lt;users arrivalrate="25" unit="second" /&gt;
    &lt;/arrivalphase&gt;
    &lt;arrivalphase phase="6" duration="30" unit="minute"&gt;
      &lt;users arrivalrate="30" unit="second" /&gt;
    &lt;/arrivalphase&gt;
  &lt;/load&gt;
  &lt;sessions&gt;
    &lt;session name="consume-thread" probability="80" type="ts_http"&gt;
      &lt;request&gt;
        &lt;http url="/path/to/threadSleep" method="GET" version="1.1"&gt;
          &lt;www_authenticate userid="YOUR_API_KEY_HERE" passwd="" /&gt;
        &lt;/http&gt;
      &lt;/request&gt;
    &lt;/session&gt;
    &lt;session name="hello-world" probability="20" type="ts_http"&gt;
      &lt;request&gt;
        &lt;http url="/path/to/hello" method="GET" version="1.1"&gt;
          &lt;www_authenticate userid="YOUR_API_KEY_HERE" passwd="" /&gt;
        &lt;/http&gt;
      &lt;/request&gt;
    &lt;/session&gt;
  &lt;/sessions&gt;
&lt;/tsung&gt;
</code></pre><p>The &#8220;load&#8221; section describes a timeline of traffic to be sent:</p><ul><li><p>initial ramp-up phase sending 5 new users per second for 1 minute</p></li><li><p>increasing to 10 per second for 1 minute, then 15, 20, and 25 per second for 1 minute each</p></li><li><p>then sustaining at 30 new users per second for 30 minutes</p></li></ul><p>The &#8220;sessions&#8221; section defines what kind of traffic to send:</p><ul><li><p>80% of requests are to a &#8220;threadSleep&#8221; endpoint which ties up the request for 5 seconds</p></li><li><p>20% of requests are to a &#8220;hello&#8221; endpoint which returns a simple &#8220;hello world&#8221; response</p></li></ul><p>As the majority of requests consume the thread for 5 seconds and we are sending many new users per second, this nicely simulates a gradual increase in traffic which lets the HPA scale up the deployment so that, over time, we are serving hundreds of simultaneous users.</p><p>A successful test of Envoy, the new deployment system, and Horizontal Pod Autoscaling means that every request we send results in a successful response, with zero timeouts and zero dropped requests.</p><h4>Results</h4><p>I&#8217;ll quote from the report I produced after implementing this:</p><div class="pullquote"><p>The combination of redeployment plus horizontal pod autoscaling has been shown to be successful. This was accomplished by using the tool Tsung to direct traffic at the test Envoy instance to ramp up load prior to performing a new deployment, gradually cutting over traffic from the old to the new deployment as pods initialized and became available, then switching entirely to the new deployment.</p></div><p>Highlights:</p><ul><li><p>20,865 requests sent (80% taking 5 seconds, 20% returning &#8220;hello world&#8221; immediately)</p></li><li><p>100% of requests succeeded with a 200 OK HTTP status code.
Zero server failures scaling up, during cutover, or after cutover.</p></li><li><p>The only problems were errors connecting to the API, which were limited to 10 connections (out of 20K+, representing 0.05% of requests attempted) and were likely due to running the test over a wifi connection from my laptop.</p></li></ul><p>Some Tsung-generated graphs in the report show that as Tsung scaled up the number of user requests over time, the API responses kept pace, both while scaling up and during cutover:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ybh-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F210147bc-304e-428f-a6e0-bef6b59ce3b1_1238x842.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ybh-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F210147bc-304e-428f-a6e0-bef6b59ce3b1_1238x842.png 424w, https://substackcdn.com/image/fetch/$s_!ybh-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F210147bc-304e-428f-a6e0-bef6b59ce3b1_1238x842.png 848w, https://substackcdn.com/image/fetch/$s_!ybh-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F210147bc-304e-428f-a6e0-bef6b59ce3b1_1238x842.png 1272w, https://substackcdn.com/image/fetch/$s_!ybh-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F210147bc-304e-428f-a6e0-bef6b59ce3b1_1238x842.png 1456w" sizes="100vw"><img
src="https://substackcdn.com/image/fetch/$s_!ybh-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F210147bc-304e-428f-a6e0-bef6b59ce3b1_1238x842.png" width="1238" height="842" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/210147bc-304e-428f-a6e0-bef6b59ce3b1_1238x842.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:842,&quot;width&quot;:1238,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:199178,&quot;alt&quot;:&quot;Tsung traffic over time&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Tsung traffic over time" title="Tsung traffic over time" srcset="https://substackcdn.com/image/fetch/$s_!ybh-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F210147bc-304e-428f-a6e0-bef6b59ce3b1_1238x842.png 424w, https://substackcdn.com/image/fetch/$s_!ybh-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F210147bc-304e-428f-a6e0-bef6b59ce3b1_1238x842.png 848w, https://substackcdn.com/image/fetch/$s_!ybh-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F210147bc-304e-428f-a6e0-bef6b59ce3b1_1238x842.png 1272w, https://substackcdn.com/image/fetch/$s_!ybh-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F210147bc-304e-428f-a6e0-bef6b59ce3b1_1238x842.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft 
pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Conclusion</h3><p>The change in how we deploy our API has been very successful as demonstrated by comprehensive testing as well as the thousands of deployments we&#8217;ve done since this system has been introduced. Deployment times were reduced from an average of 30 minutes to now taking roughly 3 minutes. 
Additionally, we now support a number of new blue/green strategies that give us additional flexibility when we want to test out changes in production for a single user, entire team, or for some chosen percentage of all traffic.</p><p>Lastly, because our API now scales up and down dynamically via the use of Horizontal Pod Autoscaler, our GCP bill has benefited from the lower utilization during off-peak hours:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!knSK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ba7b0f-7125-4a62-94f7-7167b1c1a6db_669x193.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!knSK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ba7b0f-7125-4a62-94f7-7167b1c1a6db_669x193.png 424w, https://substackcdn.com/image/fetch/$s_!knSK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ba7b0f-7125-4a62-94f7-7167b1c1a6db_669x193.png 848w, https://substackcdn.com/image/fetch/$s_!knSK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ba7b0f-7125-4a62-94f7-7167b1c1a6db_669x193.png 1272w, https://substackcdn.com/image/fetch/$s_!knSK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ba7b0f-7125-4a62-94f7-7167b1c1a6db_669x193.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!knSK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ba7b0f-7125-4a62-94f7-7167b1c1a6db_669x193.png" width="669" 
height="193" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/83ba7b0f-7125-4a62-94f7-7167b1c1a6db_669x193.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:193,&quot;width&quot;:669,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:20067,&quot;alt&quot;:&quot;graph showing variable CPU core use&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="graph showing variable CPU core use" title="graph showing variable CPU core use" srcset="https://substackcdn.com/image/fetch/$s_!knSK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ba7b0f-7125-4a62-94f7-7167b1c1a6db_669x193.png 424w, https://substackcdn.com/image/fetch/$s_!knSK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ba7b0f-7125-4a62-94f7-7167b1c1a6db_669x193.png 848w, https://substackcdn.com/image/fetch/$s_!knSK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ba7b0f-7125-4a62-94f7-7167b1c1a6db_669x193.png 1272w, https://substackcdn.com/image/fetch/$s_!knSK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ba7b0f-7125-4a62-94f7-7167b1c1a6db_669x193.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://engineering.streak.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" 
data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Streak Engineering! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h3>Engineering at&nbsp;Streak</h3><p>We work on many interesting challenges affecting billions of requests daily involving many terabytes of data. For more information, visit <a href="https://www.streak.com/careers">https://www.streak.com/careers</a></p><p></p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>It&#8217;s possible that if you&#8217;re processing millions of items (such as a data migration), some task could take longer. However, where this is possible we chunk those into smaller batches and either run them in parallel or re-enqueue the task to handle the next chunk.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>We also have a comprehensive experiment system that allows us to run different code paths based on whether the user or entire account is on the experiment, which we can also set as percentage based. 
This is usually the preferred method as it outlives any one deployment, but sometimes you&#8217;re almost certain a change is an improvement so this is another tool in the toolbox for when that makes sense.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>In practice, scaling is fast but not instant. Kubernetes needs to bring new nodes online, then spin up any DaemonSets for each node, then bring the desired pods online, plus any initialization that occurs on the pod itself.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p>If you do hit the maximum, it&#8217;s important to have good monitoring and alerting in place so you can take action. This is a blog post of its own.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-5" href="#footnote-anchor-5" class="footnote-number" contenteditable="false" target="_self">5</a><div class="footnote-content"><p>There are a few other things we do to clean up deployments. We update the HPA on the old deployment to set its minimum autoscale to 1, and it gradually scales down over time. Additionally, we delete the previous secondary API so we have the current primary deployment N, the secondary N-1, but there&#8217;s no need to keep the N-2 deployment, its HPA, or the associated Kubernetes service.</p></div></div>]]></content:encoded></item><item><title><![CDATA[ChatGPT function calls for multiple inputs in a single request]]></title><description><![CDATA[OpenAI recently released support for function calls in their chat completions API. 
Here I explore its use to classify multiple emails in a single request.]]></description><link>https://engineering.streak.com/p/chatgpt-function-calls-for-multiple</link><guid isPermaLink="false">https://engineering.streak.com/p/chatgpt-function-calls-for-multiple</guid><dc:creator><![CDATA[Blake Kadatz]]></dc:creator><pubDate>Fri, 23 Jun 2023 22:04:42 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!_CXR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5825ff71-e9c2-4f89-a1e0-0f8f982fda60_5096x3398.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_CXR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5825ff71-e9c2-4f89-a1e0-0f8f982fda60_5096x3398.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_CXR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5825ff71-e9c2-4f89-a1e0-0f8f982fda60_5096x3398.jpeg 424w, https://substackcdn.com/image/fetch/$s_!_CXR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5825ff71-e9c2-4f89-a1e0-0f8f982fda60_5096x3398.jpeg 848w, https://substackcdn.com/image/fetch/$s_!_CXR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5825ff71-e9c2-4f89-a1e0-0f8f982fda60_5096x3398.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!_CXR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5825ff71-e9c2-4f89-a1e0-0f8f982fda60_5096x3398.jpeg 1456w" 
sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_CXR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5825ff71-e9c2-4f89-a1e0-0f8f982fda60_5096x3398.jpeg" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5825ff71-e9c2-4f89-a1e0-0f8f982fda60_5096x3398.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1118223,&quot;alt&quot;:&quot;ChatGPT web UI&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="ChatGPT web UI" title="ChatGPT web UI" srcset="https://substackcdn.com/image/fetch/$s_!_CXR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5825ff71-e9c2-4f89-a1e0-0f8f982fda60_5096x3398.jpeg 424w, https://substackcdn.com/image/fetch/$s_!_CXR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5825ff71-e9c2-4f89-a1e0-0f8f982fda60_5096x3398.jpeg 848w, https://substackcdn.com/image/fetch/$s_!_CXR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5825ff71-e9c2-4f89-a1e0-0f8f982fda60_5096x3398.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!_CXR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5825ff71-e9c2-4f89-a1e0-0f8f982fda60_5096x3398.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div 
class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>OpenAI recently released support for <a href="https://openai.com/blog/function-calling-and-other-api-updates">function calls</a> in their chat completions API. The example they give shows the ability to define a function such as <code>get_current_weather(location: string)</code> so that when a user asks &#8220;What&#8217;s the weather like in Boston right now?&#8221;, the chat completions API intelligently extracts parameters from the user&#8217;s input, responding with JSON to your server describing what function to call, eg: <code>name: get_current_weather, arguments: {"location": "Boston"}</code>. 
This allows you to extract the function name and argument values from the response, call an external weather API to get the temperature in Boston, then respond with data such as <code>{"temperature": "22", "unit": "celsius", "description": "Sunny"}</code> which ChatGPT intelligently formats back as a response such as &#8220;The weather in Boston is currently sunny with a temperature of 22 degrees Celsius.&#8221;</p><p>Streak has a number of projects on our <a href="https://www.streak.com/post/streak-ai-roadmap">AI roadmap</a> and this appeared useful for one in particular. But first, a bit of background.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://engineering.streak.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Streak Engineering! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h3>Solving the &#8220;blank slate&#8221;&nbsp;problem</h3><p>A challenge for most SaaS products is that when users sign up, they are left staring at a blank slate and are unsure where to start. A popular way to handle this is via your onboarding process by providing tutorial videos, helpful walkthroughs, context-sensitive help, and so on. These are often necessary but not sufficient as nothing beats having the user&#8217;s own data available to use within the SaaS product.
We tackled this a while ago in our onboarding flow: when the user creates their first pipeline, our system looks at the user&#8217;s Gmail sent items and displays a &#8220;Quick Add&#8221; screen, offering to create boxes for the contacts they&#8217;ve sent emails to.</p><div id="youtube2-NKX711y3e90" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;NKX711y3e90&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/NKX711y3e90?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>This has been incredibly useful to customers as the blank slate is now replaced with their own data where they can see a number of <a href="https://support.streak.com/en/articles/2563465-magic-columns">magic columns</a> including when they last emailed the contact, when the contact last emailed them, and so on. It&#8217;s then easy to see who they should follow up with, review a history of all emails from everyone on their team, and take appropriate action.</p><h3>Experimenting with AI and forcing JSON&nbsp;output</h3><p>We thought a good upgrade to this functionality would be to intelligently categorize <em>all</em> their emails. Rather than just looking at emails the user had recently sent, what if we looked at all recent email threads and used AI to categorize them based on our pipeline templates into Sales, Hiring, Job Search, Investor Relations, Fundraising, Real Estate, and so on? Someone in HR could use this to identify those who had applied as well as identify candidates the HR team had reached out to directly.</p><p>I tackled this as an R&amp;D project, using emails in my own account to see how well ChatGPT 3.5 could classify emails as either HIRING or NOT_HIRING.
I used the chat completions API and divided the input into a system message, which provided some initial prompt engineering and background, and a user message containing the emails to classify. Part of the system message primed ChatGPT with few-shot examples and showed the expected output as structured JSON:</p><pre><code><code>For example, given the inputs:

###

[0] Subject: Resume
Snippet: Hello, I've attached my resume and am applying for the content writer position

[1] Subject: Lunch on Tuesday
Snippet: Hi James, it's been a while! We should catch up this Tuesday as I wanted to go over the sales numbers

[2] Subject: Ad for Widgets
Snippet: Hi Doug, I'd like to be considered for the online ad posted on Facebook. Please find attached my job history.

###

You should output a JSON array with one object per email:
```
[ 
  {
    "index": 0, 
    "category": "HIRING", 
    "explanation": "Email mentions attaching a resume and that they are applying for a position" 
  }, 
  { 
    "index": 1, 
    "category": "NOT_HIRING", 
    "explanation": "Email appears unrelated to hiring"
  },
  {
    "index": 2,
    "category": "HIRING",
    "explanation": "Email mentions an online ad and they have included their job history"
  }
]</code></code></pre><p>Asking for an explanation is helpful to understand the rationale behind the category. The user message is then similar to the few-shot prompts but includes a trailing output hint:</p><pre><code>[0] Subject: Resume Snippet: Hello, I've attached my resume and am applying for the content writer position

[1] Subject: Lunch on Tuesday Snippet: Hi James, it's been a while! We should catch up this Tuesday as I wanted to go over the sales numbers

[2] Subject: Ad for Widgets Snippet: Hi Doug, I'd like to be considered for the online ad posted on Facebook. Please find attached my job history.

Output:
```</code></pre><p>The three backticks indicate that the JSON output is to follow, and ChatGPT reliably supplies the JSON array with objects as specified in the system message. This makes it easy to use a JSON parser to deserialize the text into our internal classes in Kotlin.</p><h3>Migrating to function&nbsp;calls</h3><p>To use function calls, you specify the function definitions using JSON Schema. For our example, I want ChatGPT to call a <code>classify_email</code> function for a given message and provide the <code>category</code> and an <code>explanation</code>. The request JSON looks like the following:</p><pre><code><code>{
  "model": "gpt-3.5-turbo-0613",
  "messages": [
    {
      "role": "system",
      "content": "You are an assistant designed to analyze information from emails and assign the category that best matches the content."
    },
    {
      "role": "user",
      "content": "Subject: Resume\nSnippet: Hello, I've attached my resume and am applying for the content writer position"
    }
  ],
  "functions": [
    {
      "name": "classify_email",
      "description": "Classify an email into a category",
      "parameters": {
        "type": "object",
        "properties": {
          "category": {
            "type": "string",
            "enum": [
              "SALES",
              "PROJECTS",
              "BUSINESS_DEV",
              "HIRING",
              "JOB_SEARCH",
              "INVESTOR",
              "FUNDRAISING",
              "ORDERS",
              "REAL_ESTATE",
              "SUPPORT",
              "UNKNOWN"
            ],
            "description": "The category to classify the email into"
          },
          "explanation": {
            "type": "string",
            "description": "An explanation of why the email belongs to the specified category"
          }
        },
        "required": ["category", "explanation"]
      }
    }
  ],
  "function_call": { "name": "classify_email" }
}
</code></code></pre><p>Here you can see there&#8217;s a system message which provides some background. <em>Note: I&#8217;ve omitted some additional information from the system message describing each category.</em> The user message supplies the subject and snippet for the email in question.</p><p>There are two important parts to making function calls work with ChatGPT. The first is supplying the <code>functions</code> parameter with the definition of your function. Because this uses JSON Schema, I can define the category property as an enum and specify the allowable values. The second is the <code>function_call</code> property which, when you specify the name of the function to call, tells ChatGPT that it <em>must</em> call that function.</p><p>The response from the chat completions API indicates a function call with the information we&#8217;re looking for:</p><pre><code><code>{
  "id": "chatcmpl-&lt;redacted&gt;",
  "object": "chat.completion",
  "created": 1687490210,
  "model": "gpt-3.5-turbo-0613",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "function_call": {
          "name": "classify_email",
          "arguments": "{\n  \"category\": \"HIRING\",  \"explanation\": \"Email mentions attaching a resume and that they are applying for a position.\"\n}"
        }
      },
      "finish_reason": "stop"
    }
  ]
}
</code></code></pre><p>Unlike OpenAI&#8217;s weather example, we don&#8217;t need to call the API again with a function result for it to turn into a reply for the user; the goal here is to get back structured data and avoid having to coerce ChatGPT into transforming the output into JSON. The <code>arguments</code> property is already a JSON string, so it&#8217;s trivial to parse and extract the category and explanation, though OpenAI notes the model does not always generate valid JSON, so parse failures are still worth handling.</p><p>However, this is only a single message. When processing hundreds or thousands of items, I&#8217;ve found there&#8217;s a sweet spot in optimizing your queries to minimize response time and maximize token utilization while also making parallel API requests. Too many parallel requests not only burn through tokens faster, but also get you rate limited with 429 errors. In my case, both the system message and function definition are identical across all emails, so I can cut down on token usage by supplying the system message, the function definition, and then multiple user messages (one for each email).</p><p>I had hoped ChatGPT would respond with one function call per user message, but it only responded with a single function call, classifying just one email. The trick turned out to be to have the function accept an array of messages, and to specify the object definition for each message separately.</p><p>Here&#8217;s what the complete request looks like:</p><pre><code><code>{
  "model": "gpt-3.5-turbo-0613",
  "messages": [
    {
      "role": "system",
      "content": "You are an assistant designed to analyze information from emails and assign the category that best matches the content."
    },
    {
      "role": "user",
      "content": "ThreadID: &lt;redacted&gt;\\nSubject: &lt;redacted&gt;\\nSnippet: &lt;redacted&gt;"
    },
    {
      "role": "user",
      "content": "ThreadID: &lt;redacted&gt;\\nSubject: &lt;redacted&gt;\\nSnippet: &lt;redacted&gt;"
    },
    {
      "role": "user",
      "content": "ThreadID: &lt;redacted&gt;\\nSubject: &lt;redacted&gt;\\nSnippet: &lt;redacted&gt;"
    },
    {
      "role": "user",
      "content": "ThreadID: &lt;redacted&gt;\\nSubject: &lt;redacted&gt;\\nSnippet: &lt;redacted&gt;"
    },
    {
      "role": "user",
      "content": "ThreadID: &lt;redacted&gt;\\nSubject: &lt;redacted&gt;\\nSnippet: &lt;redacted&gt;"
    },
    {
      "role": "user",
      "content": "ThreadID: &lt;redacted&gt;\\nSubject: &lt;redacted&gt;\\nSnippet: &lt;redacted&gt;"
    },
    {
      "role": "user",
      "content": "ThreadID: &lt;redacted&gt;\\nSubject: &lt;redacted&gt;\\nSnippet: &lt;redacted&gt;"
    },
    {
      "role": "user",
      "content": "ThreadID: &lt;redacted&gt;\\nSubject: &lt;redacted&gt;\\nSnippet: &lt;redacted&gt;"
    }
  ],
  "functions": [
    {
      "name": "classify_email",
      "description": "Called to classify emails",
      "parameters": {
        "type": "object",
        "properties": {
          "messages": {
            "type": "array",
            "items": { "$ref": "#/$defs/message" },
            "description": "Array of emails to classify"
          }
        },
        "$defs": {
          "message": {
            "type": "object",
            "properties": {
              "threadId": {
                "type": "string",
                "description": "The ThreadID of the email"
              },
              "category": {
                "type": "string",
                "enum": [
                  "SALES",
                  "PROJECTS",
                  "BUSINESS_DEV",
                  "HIRING",
                  "JOB_SEARCH",
                  "INVESTOR",
                  "FUNDRAISING",
                  "ORDERS",
                  "REAL_ESTATE",
                  "SUPPORT",
                  "UNKNOWN"
                ],
                "description": "The category to classify the email into"
              },
              "explanation": {
                "type": "string",
                "description": "An explanation of why the email belongs to the specified category"
              }
            },
            "required": ["threadId", "category", "explanation"],
            "description": "The message to classify"
          }
        },
        "required": ["messages"]
      }
    }
  ],
  "function_call": { "name": "classify_email" }
}
</code></code></pre><p>This turns the function from <code>classify_email(category: string, explanation: string)</code> into <code>classify_email(messages: array)</code> and specifies that the items in the array are of type <code>message</code>, defined at <code>#/$defs/message</code>. The <code>$defs</code> property then specifies the JSON schema for the <code>message</code> object with the same parameters as before, plus a <code>threadId</code> so each result can be matched back to its email.</p><p>The response now calls my <code>classify_email</code> function and passes an array of messages, where each item carries the category and explanation for one email:</p><pre><code><code>{
  "id": "chatcmpl-&lt;redacted&gt;",
  "object": "chat.completion",
  "created": 1687490210,
  "model": "gpt-3.5-turbo-0613",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "function_call": {
          "name": "classify_email",
          "arguments": "{\\n  \\"messages\\": [\\n    {\\n      \\"threadId\\": \\"&lt;redacted&gt;\\",\\n      \\"category\\": \\"UNKNOWN\\",\\n      \\"explanation\\": \\"The email does not fit into any specific category.\\"\\n    }\\n  ]\\n}"
        }
      },
      "finish_reason": "stop"
    }
  ]
}
</code></code></pre><p>I&#8217;ve truncated the <code>arguments</code> value above, but the JSON for this contains an array of <code>message</code> objects with argument values as defined in the schema:</p><pre><code><code>[
  {
    "threadId": "&lt;redacted&gt;",
    "category": "UNKNOWN",
    "explanation": "The email does not fit into any of the predefined categories."
  },
  {
    "threadId": "&lt;redacted&gt;",
    "category": "PROJECTS",
    "explanation": "The email is related to a project meeting invitation."
  },
  {
    "threadId": "&lt;redacted&gt;",
    "category": "PROJECTS",
    "explanation": "The email is related to a person going on call for a specific project."
  },
  {
    "threadId": "&lt;redacted&gt;",
    "category": "SUPPORT",
    "explanation": "The email is related to providing feedback and information about an Android app."
  },
  {
    "threadId": "&lt;redacted&gt;",
    "category": "UNKNOWN",
    "explanation": "The email does not fit into any of the predefined categories."
  },
  {
    "threadId": "&lt;redacted&gt;",
    "category": "UNKNOWN",
    "explanation": "The email does not fit into any of the predefined categories."
  },
  {
    "threadId": "&lt;redacted&gt;",
    "category": "UNKNOWN",
    "explanation": "The email does not fit into any of the predefined categories."
  }
]
</code></code></pre><p>And there it is! Beautifully structured JSON without having to burn through tokens coercing ChatGPT via few-shot prompting, and I can trust that the <code>category</code> property uses only the enum values I specified.</p><h3>Further improvements</h3><p>This is a good MVP for proving the concept, but there is still a way to go to improve the categorization accuracy. The system could provide more of the email content rather than just the Gmail snippet, and additional filtering up front could reduce the number of emails we need to classify by excluding things like automated emails.</p><h3>Conclusion</h3><p>ChatGPT still calls the function only once, but by using an array argument, ChatGPT transforms the result for each user message into an item in the array. This cuts down on the number of requests, which helps avoid 429 rate-limit responses, and reduces the number of tokens used by sharing the system message and function definition across multiple items.</p><h3>Engineering at&nbsp;Streak</h3><p>We work on many interesting challenges affecting millions of users and many terabytes of data. For more information, visit <a href="https://www.streak.com/careers">https://www.streak.com/careers</a></p>]]></content:encoded></item><item><title><![CDATA[Building link tracking infrastructure]]></title><description><![CDATA[Using Cloudflare Workers to handle link tracking asynchronously]]></description><link>https://engineering.streak.com/p/building-link-tracking-infrastructure-f6eabd80977c</link><guid isPermaLink="false">https://engineering.streak.com/p/building-link-tracking-infrastructure-f6eabd80977c</guid><dc:creator><![CDATA[Blake Kadatz]]></dc:creator><pubDate>Tue, 31 May 2022 15:59:15 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/7e9ab6d9-c6bd-4d1c-8722-b7435f24aec1_580x548.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Streak has millions of users sending many millions of emails a day through their Gmail account. Our users love the ability to track when someone has viewed emails they send, whether it&#8217;s a one-off email or when sending a mail merge. 
We&#8217;ve had a long-standing desire to also include click tracking of any links within the email.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yX5d!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35391e7b-ed49-4a37-81a0-04d600cd2ab1_580x548.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yX5d!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35391e7b-ed49-4a37-81a0-04d600cd2ab1_580x548.png 424w, https://substackcdn.com/image/fetch/$s_!yX5d!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35391e7b-ed49-4a37-81a0-04d600cd2ab1_580x548.png 848w, https://substackcdn.com/image/fetch/$s_!yX5d!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35391e7b-ed49-4a37-81a0-04d600cd2ab1_580x548.png 1272w, https://substackcdn.com/image/fetch/$s_!yX5d!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35391e7b-ed49-4a37-81a0-04d600cd2ab1_580x548.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yX5d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35391e7b-ed49-4a37-81a0-04d600cd2ab1_580x548.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/35391e7b-ed49-4a37-81a0-04d600cd2ab1_580x548.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Gmail email compose window showing Streak&#8217;s link tracking toggle UI&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Gmail email compose window showing Streak&#8217;s link tracking toggle UI" title="Gmail email compose window showing Streak&#8217;s link tracking toggle UI" srcset="https://substackcdn.com/image/fetch/$s_!yX5d!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35391e7b-ed49-4a37-81a0-04d600cd2ab1_580x548.png 424w, https://substackcdn.com/image/fetch/$s_!yX5d!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35391e7b-ed49-4a37-81a0-04d600cd2ab1_580x548.png 848w, https://substackcdn.com/image/fetch/$s_!yX5d!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35391e7b-ed49-4a37-81a0-04d600cd2ab1_580x548.png 1272w, https://substackcdn.com/image/fetch/$s_!yX5d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35391e7b-ed49-4a37-81a0-04d600cd2ab1_580x548.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a><figcaption class="image-caption">Streak link tracking toggle UI within&nbsp;Gmail</figcaption></figure></div><h3>Design goals</h3><p>In building this, we 
had a number of goals for the link tracking functionality:</p><ol><li><p><strong>Must work all the time.</strong> We need to ensure that any system we implement for link tracking is highly reliable. It&#8217;s one thing for our code to have to retry a request to the API if there&#8217;s a transient issue, but our users will be sending emails to their contacts where we don&#8217;t have such control, and errors would reflect poorly not only on Streak but also on our users.</p></li><li><p><strong>Must be fast.</strong> If you&#8217;ve ever received a marketing email that has links in it, you&#8217;ve likely clicked a link and waited forever to reach the destination because some slow CRM system was processing the click. We don&#8217;t like building slow functionality.</p></li><li><p><strong>Links must always work.</strong> Because our users are sending emails to their contacts, any tracked links in those emails should work. Forever. This means we shouldn&#8217;t tie the implementation to any one specific technology or require any database of valid links.</p></li><li><p><strong>Revocable.</strong> When a Streak user sends out any email, the email goes out through their Gmail account, which has the benefit of Google&#8217;s extensive anti-spam measures, quota enforcement, best-in-class deliverability, and keeping all their email in one place. However, should we need to revoke a link (for example, due to malicious use), we want to be able to prevent those specific links from working. Note that this is at odds with links always working.</p></li><li><p><strong>Spoof proof.</strong> Links must only work if they were generated by Streak. When the system generates a link, any modification of the link should render the link invalid. 
This prevents a malicious attacker from taking a valid link and modifying the destination to go somewhere else, or otherwise circumventing checks added during link generation.</p></li></ol><h3>Technology</h3><p>Since Streak runs directly within Gmail and we are a Google Technology Partner, our infrastructure runs within GCP (Google Cloud Platform).</p><p>We considered Google Cloud Run as it seemed like a cool technology where you deploy a containerized microservice and leave it up to Google to manage. Also under consideration was Google Cloud Functions which is essentially a containerless version of Cloud Run and similar to AWS Lambda.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uQkX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F924b676b-893f-4b4e-af3e-1476baf57481_256x37.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uQkX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F924b676b-893f-4b4e-af3e-1476baf57481_256x37.png 424w, https://substackcdn.com/image/fetch/$s_!uQkX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F924b676b-893f-4b4e-af3e-1476baf57481_256x37.png 848w, https://substackcdn.com/image/fetch/$s_!uQkX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F924b676b-893f-4b4e-af3e-1476baf57481_256x37.png 1272w, https://substackcdn.com/image/fetch/$s_!uQkX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F924b676b-893f-4b4e-af3e-1476baf57481_256x37.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!uQkX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F924b676b-893f-4b4e-af3e-1476baf57481_256x37.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/924b676b-893f-4b4e-af3e-1476baf57481_256x37.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Cloudflare logo&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Cloudflare logo" title="Cloudflare logo" srcset="https://substackcdn.com/image/fetch/$s_!uQkX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F924b676b-893f-4b4e-af3e-1476baf57481_256x37.png 424w, https://substackcdn.com/image/fetch/$s_!uQkX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F924b676b-893f-4b4e-af3e-1476baf57481_256x37.png 848w, https://substackcdn.com/image/fetch/$s_!uQkX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F924b676b-893f-4b4e-af3e-1476baf57481_256x37.png 1272w, https://substackcdn.com/image/fetch/$s_!uQkX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F924b676b-893f-4b4e-af3e-1476baf57481_256x37.png 1456w" sizes="100vw"></picture><div></div></div></a></figure></div><p>We eventually chose Cloudflare Workers to power our link tracking. 
Cloudflare boasts that their network reaches 95% of the world&#8217;s population within 50ms. Additionally, as they power some of the largest properties on the web we were comfortable that their infrastructure&#8217;s reliability meets or exceeds our own.</p><p>Code written for Workers runs directly in each of their edge locations, meaning we get fast distribution across more than 200 points of presence. Additionally, since the code is isolated from our existing infrastructure it&#8217;s not dependent on whichever technology stack we happen to use.</p><h3>Constructing a link specification</h3><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!orEm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c107806-ea85-488c-b809-4904e1d83113_800x467.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!orEm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c107806-ea85-488c-b809-4904e1d83113_800x467.jpeg 424w, https://substackcdn.com/image/fetch/$s_!orEm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c107806-ea85-488c-b809-4904e1d83113_800x467.jpeg 848w, https://substackcdn.com/image/fetch/$s_!orEm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c107806-ea85-488c-b809-4904e1d83113_800x467.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!orEm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c107806-ea85-488c-b809-4904e1d83113_800x467.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!orEm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c107806-ea85-488c-b809-4904e1d83113_800x467.jpeg" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4c107806-ea85-488c-b809-4904e1d83113_800x467.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Blueprint construction plans&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Blueprint construction plans" title="Blueprint construction plans" srcset="https://substackcdn.com/image/fetch/$s_!orEm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c107806-ea85-488c-b809-4904e1d83113_800x467.jpeg 424w, https://substackcdn.com/image/fetch/$s_!orEm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c107806-ea85-488c-b809-4904e1d83113_800x467.jpeg 848w, https://substackcdn.com/image/fetch/$s_!orEm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c107806-ea85-488c-b809-4904e1d83113_800x467.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!orEm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c107806-ea85-488c-b809-4904e1d83113_800x467.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>When users send emails out through Streak, our 
system records which links are within which emails. Then, when a recipient clicks on a link, our system associates the click with the correct email. And because Streak has millions of users sending, cumulatively, millions of emails, maintaining a copy of all that data on Cloudflare or anywhere else was a non-starter. So we needed a much simpler system.</p><p>The format of the URL that we settled on looks like this:</p><p><code>https://somestreakdomain.ext/uniqueidentifier/destinationurl</code></p><p>This consists of the following parts:</p><ol><li><p>The domain that we host the link on. For our links, we wanted to be completely transparent about the functionality and chose <code>streaklinks.com</code> for the domain.</p></li><li><p>The destination URL. Because we want the links to work regardless of platform or infrastructure, we chose a simple URL-encoded version of the destination. So if you are linking to <code>https://example.com</code> then the encoded version is <code>https%3A%2F%2Fexample.com</code>. While simple URL encoding does make a link easily decodable by a recipient should Streak later block the URL, the main goal is that Streak&#8217;s infrastructure isn&#8217;t directly used to link to an unwanted site.</p></li><li><p>This leaves the unique identifier. As mentioned, we use this to later identify which link in which email the recipient clicked. At the same time, we don&#8217;t want to perform an expensive database lookup, nor do we want a malicious attacker to be able to spoof arbitrary URLs in our system. 
The solution is that the identifier combines an identifier within Streak&#8217;s CRM with a cryptographic checksum of the data.</p></li></ol><p>Because the complete URL contains all the data necessary to both validate the URL and redirect the user to the destination, we don&#8217;t need to maintain any storage.</p><h3>Validating a&nbsp;link</h3><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NG0l!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f8cbc7d-5f33-4ca8-93e0-be7419d43e4c_640x427.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NG0l!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f8cbc7d-5f33-4ca8-93e0-be7419d43e4c_640x427.jpeg 424w, https://substackcdn.com/image/fetch/$s_!NG0l!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f8cbc7d-5f33-4ca8-93e0-be7419d43e4c_640x427.jpeg 848w, https://substackcdn.com/image/fetch/$s_!NG0l!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f8cbc7d-5f33-4ca8-93e0-be7419d43e4c_640x427.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!NG0l!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f8cbc7d-5f33-4ca8-93e0-be7419d43e4c_640x427.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NG0l!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f8cbc7d-5f33-4ca8-93e0-be7419d43e4c_640x427.jpeg" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1f8cbc7d-5f33-4ca8-93e0-be7419d43e4c_640x427.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Abstract image of display showing Matrix code&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Abstract image of display showing Matrix code" title="Abstract image of display showing Matrix code" srcset="https://substackcdn.com/image/fetch/$s_!NG0l!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f8cbc7d-5f33-4ca8-93e0-be7419d43e4c_640x427.jpeg 424w, https://substackcdn.com/image/fetch/$s_!NG0l!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f8cbc7d-5f33-4ca8-93e0-be7419d43e4c_640x427.jpeg 848w, https://substackcdn.com/image/fetch/$s_!NG0l!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f8cbc7d-5f33-4ca8-93e0-be7419d43e4c_640x427.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!NG0l!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f8cbc7d-5f33-4ca8-93e0-be7419d43e4c_640x427.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>When a recipient clicks on a link, we extract the unique identifier and destination URL from the request. 
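</p><p>As a rough sketch of that extraction step, assuming the path layout shown earlier (<code>/uniqueidentifier/destinationurl</code>) and a destination encoded with <code>encodeURIComponent</code>, a link can be built and split apart as follows. The helper names <code>buildTrackedLink</code> and <code>parseTrackedLink</code> and the identifier <code>abc123</code> are hypothetical placeholders, not Streak&#8217;s actual code, and the identifier is treated as an opaque string:</p>

```javascript
// Hypothetical helpers (not Streak's actual code) for links shaped like
// https://DOMAIN/IDENTIFIER/ENCODED_DESTINATION

// Build a tracked link from an opaque identifier and a destination URL.
function buildTrackedLink(domain, uniqueIdentifier, destinationUrl) {
  return 'https://' + domain + '/' + uniqueIdentifier + '/' +
    encodeURIComponent(destinationUrl);
}

// Split a tracked link back into its parts; returns null if malformed.
function parseTrackedLink(requestUrl) {
  // pathname keeps %2F escaped, so only the literal separators are "/".
  const path = new URL(requestUrl).pathname.slice(1); // drop the leading "/"
  const separator = path.indexOf('/');
  if (separator === -1) {
    return null; // no destination segment
  }
  const uniqueIdentifier = path.slice(0, separator);
  const encodedDestination = path.slice(separator + 1);
  if (uniqueIdentifier === '' || encodedDestination === '') {
    return null;
  }
  try {
    return {
      uniqueIdentifier: uniqueIdentifier,
      destinationUrl: decodeURIComponent(encodedDestination),
    };
  } catch (err) {
    return null; // invalid percent-encoding in the destination
  }
}

// 'abc123' is a placeholder identifier used only for illustration.
const link = buildTrackedLink('streaklinks.com', 'abc123', 'https://example.com/a?b=c');
const parsed = parseTrackedLink(link);
console.log(link);
console.log(parsed);
```

<p>Because the destination is percent-encoded, its own slashes never collide with the separator between the identifier and the destination, so splitting on the first <code>/</code> is enough in this sketch.</p><p>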
Validation is then a simple matter of performing an HMAC SHA-256 calculation of the unique identifier and the destination URL, which we use to ensure the integrity and authenticity of the link.</p><p>Cloudflare has some docs on Web Crypto at <a href="https://developers.cloudflare.com/workers/runtime-apis/web-crypto/">https://developers.cloudflare.com/workers/runtime-apis/web-crypto/</a> that outline the available signing and digest operations. In our case, the calculation looks something like this:</p><pre><code>const encoder = new TextEncoder();
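// Import the shared secret as a raw HMAC key that uses SHA-256.
const secretKeyData = encoder.encode(SECRET_KEY);
const key = await crypto.subtle.importKey(
  'raw',
  secretKeyData,
  {name: 'HMAC', hash: 'SHA-256'},
  false, // the key is not extractable
  ['sign'] // the key is only used to compute MACs
);
// Compute the MAC over the message bytes.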
const mac = await crypto.subtle.sign('HMAC', key, encoder.encode(message));</code></pre><p>In the above code, <code>SECRET_KEY</code> is a secret that has been set via Wrangler:</p><pre><code>wrangler secret put SECRET_KEY</code></pre><p>The value of this secret key is shared between Cloudflare and the API code that generated the link in the first place. The <code>mac</code> (Message Authentication Code) value is an <code>ArrayBuffer</code> of bytes, which we extract and use to verify that the link was generated by our API without having to contact the API or look up any data.</p><p>Additionally, we store a blocklist of undesirable links in Cloudflare Workers KV (a key/value store) and we verify that the link has not been blocklisted. Workers makes this quite simple. In our case, we created a new KV namespace like <code>blocklisted-links</code> and then bound it to the worker using the name <code>BLOCKLIST</code>. Checking whether a destination URL is on the blocklist is simply:</p><pre><code>const MAX_KEY_LENGTH = 512;
const blocklistKey = ('url:' + destinationUrl).substring(0, MAX_KEY_LENGTH);
const blocklistEntry = await BLOCKLIST.get(blocklistKey);</code></pre><p>If <code>blocklistEntry</code> is anything other than <code>null</code>, that means we&#8217;ve blocklisted the URL and we can reject the request. Note the use of the <code>url:</code> prefix here for checking the full URL. This allows us to use various prefixes to expand the blocklist to support any kind of blocking, whether by hostname, IP address, user-agent, and so on.</p><p>If everything checks out, the Cloudflare Worker code immediately redirects the user to the destination:</p><pre><code>return new Response('', {
  status: 302,
  headers: {
    location: destinationUrl
  }
});</code></pre><h3>Overall request&nbsp;flow</h3><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ltzN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bb11b52-6bb3-468f-a2d1-a23ec6437bf9_745x1029.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ltzN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bb11b52-6bb3-468f-a2d1-a23ec6437bf9_745x1029.png 424w, https://substackcdn.com/image/fetch/$s_!ltzN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bb11b52-6bb3-468f-a2d1-a23ec6437bf9_745x1029.png 848w, https://substackcdn.com/image/fetch/$s_!ltzN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bb11b52-6bb3-468f-a2d1-a23ec6437bf9_745x1029.png 1272w, https://substackcdn.com/image/fetch/$s_!ltzN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bb11b52-6bb3-468f-a2d1-a23ec6437bf9_745x1029.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ltzN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bb11b52-6bb3-468f-a2d1-a23ec6437bf9_745x1029.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2bb11b52-6bb3-468f-a2d1-a23ec6437bf9_745x1029.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Flowchart visualizing 
description of request flow&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Flowchart visualizing description of request flow" title="Flowchart visualizing description of request flow" srcset="https://substackcdn.com/image/fetch/$s_!ltzN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bb11b52-6bb3-468f-a2d1-a23ec6437bf9_745x1029.png 424w, https://substackcdn.com/image/fetch/$s_!ltzN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bb11b52-6bb3-468f-a2d1-a23ec6437bf9_745x1029.png 848w, https://substackcdn.com/image/fetch/$s_!ltzN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bb11b52-6bb3-468f-a2d1-a23ec6437bf9_745x1029.png 1272w, https://substackcdn.com/image/fetch/$s_!ltzN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bb11b52-6bb3-468f-a2d1-a23ec6437bf9_745x1029.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>Fast performance</h3><p>The Cloudflare Worker code redirects requests to the destination URL in single digit milliseconds. Real world performance testing shows that link redirection can occur in approximately 20ms including making the request, processing the request inside the Cloudflare Worker code, and sending the redirection response. 
Most requests are so quick that the browser does not have time to display the streaklinks.com domain before the request to the destination occurs.</p><p>Behind the scenes, we take advantage of the fact that inside a Worker, work attached to a fetch event (see <a href="https://developers.cloudflare.com/workers/runtime-apis/fetch-event/">https://developers.cloudflare.com/workers/runtime-apis/fetch-event/</a>) with <code>event.waitUntil()</code> runs asynchronously and can outlive the response to the user request. While the email recipient&#8217;s click is already being redirected to its destination, the Worker code pushes the click data to our system, where we update the Streak user&#8217;s CRM data:</p><pre><code>event.waitUntil(postToStreak(data));</code></pre><p>We run this code just prior to returning the redirection response, and because <code>waitUntil</code> does not block, the redirect is not delayed by the call to our API. Here, <code>data</code> is an object that contains information about the request, including a unique ID, the destination, a timestamp, and so on. When our API processes this data, it associates it with the originally sent email.</p><h3>Launching link&nbsp;tracking</h3><p>Soon after we launched, we saw some incredible success with the system. 
Cloudflare&#8217;s performance excelled at our goal of having very fast redirection and handled a massive number of requests:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Giz-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c8bb80f-1379-4894-99d4-1d7888abb3ff_800x478.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Giz-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c8bb80f-1379-4894-99d4-1d7888abb3ff_800x478.png 424w, https://substackcdn.com/image/fetch/$s_!Giz-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c8bb80f-1379-4894-99d4-1d7888abb3ff_800x478.png 848w, https://substackcdn.com/image/fetch/$s_!Giz-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c8bb80f-1379-4894-99d4-1d7888abb3ff_800x478.png 1272w, https://substackcdn.com/image/fetch/$s_!Giz-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c8bb80f-1379-4894-99d4-1d7888abb3ff_800x478.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Giz-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c8bb80f-1379-4894-99d4-1d7888abb3ff_800x478.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4c8bb80f-1379-4894-99d4-1d7888abb3ff_800x478.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Graph of Cloudflare Worker performance showing 986K requests, duration of 10.8k GB-sec, and median CPU time of 1.5 ms&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Graph of Cloudflare Worker performance showing 986K requests, duration of 10.8k GB-sec, and median CPU time of 1.5 ms" title="Graph of Cloudflare Worker performance showing 986K requests, duration of 10.8k GB-sec, and median CPU time of 1.5 ms" srcset="https://substackcdn.com/image/fetch/$s_!Giz-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c8bb80f-1379-4894-99d4-1d7888abb3ff_800x478.png 424w, https://substackcdn.com/image/fetch/$s_!Giz-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c8bb80f-1379-4894-99d4-1d7888abb3ff_800x478.png 848w, https://substackcdn.com/image/fetch/$s_!Giz-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c8bb80f-1379-4894-99d4-1d7888abb3ff_800x478.png 1272w, https://substackcdn.com/image/fetch/$s_!Giz-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c8bb80f-1379-4894-99d4-1d7888abb3ff_800x478.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>Geeking out 
on&nbsp;logs</h3><p>Cloudflare lets you easily live stream requests using <a href="https://developers.cloudflare.com/workers/cli-wrangler/commands/#tail">wrangler</a>. A simple <code>wrangler tail -f pretty</code> from the terminal and requests flood in, including any console logging performed within the worker.</p><pre><code>console.error('Something went wrong!');
console.debug('Everything is good');</code></pre><p>See <a href="https://developers.cloudflare.com/workers/learning/logging-workers/">https://developers.cloudflare.com/workers/learning/logging-workers/</a> for details. During development, this was invaluable for quickly verifying details of the request and response. Once we rolled out to production, streaming only the errors caught a few issues where we weren&#8217;t gracefully handling invalid requests. When we want to see everything, we stream the debug logs as well.</p><p>In examining the logs, there were a few interesting things to note:</p><ol><li><p>Browsers, of course, attempt to fetch a <code>favicon.ico</code> file. We didn&#8217;t plan to serve one initially, but it was an easy matter of serving up Streak&#8217;s favicon directly from the Worker. Similarly, for people who are curious about what <a href="http://streaklinks.com">streaklinks.com</a> is all about, visiting <a href="https://streaklinks.com/">https://streaklinks.com/</a> redirects them to <a href="https://www.streak.com/">https://www.streak.com/</a>.</p></li><li><p>Anyone who has run an internet-facing system is familiar with automated exploit attempts. Soon after the domain went live, we saw all manner of malicious traffic. Fortunately, Cloudflare doesn&#8217;t break a sweat with all those requests, and the Worker code trivially rejects invalid ones.</p></li></ol><p>As part of our commitment to overall security, we allow ethical hackers to try to discover vulnerabilities in our systems via HackerOne, and link tracking is fair game. More information is available at <a href="https://www.streak.com/security">https://www.streak.com/security</a>.</p><h3>Wrapping it&nbsp;up</h3><p>Exploring the benefits that Cloudflare has to offer has been fun. 
With the new feature successfully launched to everyone, Streak users now get additional valuable insights into how recipients are interacting with their emails without having to use third party tools or leave Gmail.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rKD6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa071bcb0-d0e4-4b11-981b-c49a094efcee_580x548.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rKD6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa071bcb0-d0e4-4b11-981b-c49a094efcee_580x548.png 424w, https://substackcdn.com/image/fetch/$s_!rKD6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa071bcb0-d0e4-4b11-981b-c49a094efcee_580x548.png 848w, https://substackcdn.com/image/fetch/$s_!rKD6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa071bcb0-d0e4-4b11-981b-c49a094efcee_580x548.png 1272w, https://substackcdn.com/image/fetch/$s_!rKD6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa071bcb0-d0e4-4b11-981b-c49a094efcee_580x548.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rKD6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa071bcb0-d0e4-4b11-981b-c49a094efcee_580x548.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a071bcb0-d0e4-4b11-981b-c49a094efcee_580x548.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Gmail email compose window showing Streak&#8217;s link tracking toggle UI&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Gmail email compose window showing Streak&#8217;s link tracking toggle UI" title="Gmail email compose window showing Streak&#8217;s link tracking toggle UI" srcset="https://substackcdn.com/image/fetch/$s_!rKD6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa071bcb0-d0e4-4b11-981b-c49a094efcee_580x548.png 424w, https://substackcdn.com/image/fetch/$s_!rKD6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa071bcb0-d0e4-4b11-981b-c49a094efcee_580x548.png 848w, https://substackcdn.com/image/fetch/$s_!rKD6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa071bcb0-d0e4-4b11-981b-c49a094efcee_580x548.png 1272w, https://substackcdn.com/image/fetch/$s_!rKD6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa071bcb0-d0e4-4b11-981b-c49a094efcee_580x548.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Cloudflare Workers has proven to be extremely reliable and fast and our link tracking functionality has operated flawlessly. 
Having proven the benefits of Cloudflare Workers as part of our engineering toolkit, we now have a reference on how to build out fast, reliable functionality that integrates with our existing infrastructure.</p><h3>Engineering at&nbsp;Streak</h3><p>We work on many interesting challenges affecting millions of users and many terabytes of data. For more information, visit <a href="https://www.streak.com/careers">https://www.streak.com/careers</a></p>]]></content:encoded></item></channel></rss>