Sorry

This feed does not validate.

In addition, interoperability with the widest range of feed readers could be improved by implementing the following recommendations.

Source: http://tech.adroll.com/feed.xml

  1. <?xml version="1.0"?>
  2. <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  3.  <channel>
  4.    <title></title>
  5.    <link>https://tech.nextroll.com</link>
  6.    <atom:link href="https://tech.nextroll.com/feed.xml" rel="self" type="application/rss+xml" />
  7.    <description></description>
  8.    <language>en-us</language>
  9.    <pubDate>Fri, 15 Dec 2023 12:53:00 -0800</pubDate>
  10.    <lastBuildDate>Fri, 15 Dec 2023 12:53:00 -0800</lastBuildDate>
  11.  
  12.    
  13.    
  14.    <item>
  15.      <title>Celebrating Innovation - NextRoll&apos;s Hack Week 2023 H2 Winners</title>
  16.      <link>https://tech.nextroll.com/blog/hackweek/2023/09/15/hack-week.html</link>
  17.      <pubDate>Fri, 15 Sep 2023 00:00:00 -0700</pubDate>
  18.      <author></author>
  19.      <guid isPermaLink="false">https://tech.nextroll.com/blog/hackweek/2023/09/15/hack-week</guid>
  20.      <description>&lt;p&gt;NextRoll’s Hack Week 2023 was a week filled with ingenuity, collaboration, and groundbreaking ideas. From AI-powered solutions to production optimization, our teams went above and beyond to showcase their talents. Without further ado, let’s dive into the winning projects!&lt;/p&gt;
  21.  
  22. &lt;h1 id=&quot;aiml-award-the-future-is-now&quot;&gt;AI/ML Award: The Future is Now!&lt;/h1&gt;
  23.  
  24. &lt;h3 id=&quot;smartdoc&quot;&gt;SmartDoc&lt;/h3&gt;
  25. &lt;p&gt;&lt;strong&gt;Winners: Federico Della Rovere, Lorenzo Savini&lt;/strong&gt;&lt;/p&gt;
  26.  
  27. &lt;p&gt;SmartDoc is more than an AI-powered document management system. By leveraging machine learning, it categorizes, sorts, and manages documents with unprecedented accuracy, significantly reducing manual labor and improving data retrieval processes. This innovation is a step forward in automating administrative tasks, allowing teams to focus on more strategic activities.&lt;/p&gt;
  28.  
  29. &lt;p&gt;&lt;img src=&quot;/images/post_images/smartdoc.png&quot; alt=&quot;SmartDoc&quot; /&gt;&lt;/p&gt;
  30.  
  31. &lt;h2 id=&quot;product-award-figma-plugin-for-ad-production-optimization&quot;&gt;Product Award: Figma Plugin for Ad Production Optimization&lt;/h2&gt;
  32. &lt;p&gt;&lt;strong&gt;Winners: Jake Burroughs, Jose Hernandez, Joris Korbee, Aasif Shabbir, Eurico Nicacio&lt;/strong&gt;&lt;/p&gt;
  33.  
  34. &lt;p&gt;This Figma Plugin transforms ad production by enabling the generation of HTML5 ads in formats beyond the standard sizes. Its AI-supported animated ad templates not only enhance creativity but also speed up the production process, offering our clients more variety and quicker turnaround times. This is a significant leap in ad customization and efficiency.&lt;/p&gt;
  35.  
  36. &lt;p&gt;&lt;img src=&quot;/images/post_images/figma_plugin.png&quot; alt=&quot;HTML5 Ad Generator Figma Plugin&quot; /&gt;&lt;/p&gt;
  37.  
  38. &lt;h2 id=&quot;technical-award-puertagma&quot;&gt;Technical Award: Puertagma&lt;/h2&gt;
  39. &lt;p&gt;&lt;strong&gt;Winners: Oscar Arbelaez, Peter Kuimelis, Cristian Rojas, Christopher Ramirez, Alvaro Tuso&lt;/strong&gt;&lt;/p&gt;
  40.  
  41. &lt;p&gt;Puertagma is not just another API. It’s the culmination of our learnings from past projects, crafted into a robust and efficient tool. Written in Go, this API is set to become the backbone of our future integrations, promising enhanced performance and scalability. This project represents a significant stride in our technical capabilities, paving the way for more seamless and powerful integrations.&lt;/p&gt;
  42.  
  43. &lt;p&gt;Congratulations to all the winners and participants! Your hard work, innovative thinking, and focus on impactful solutions are what continue to drive NextRoll forward as a leader in the ad tech industry.&lt;/p&gt;
  44.  
  45. &lt;p&gt;&lt;img src=&quot;/images/post_images/puertagma.png&quot; alt=&quot;Puertagma&quot; /&gt;&lt;/p&gt;
  46. </description>
  47.    </item>
  48.    
  49.    
  50.    
  51.    <item>
  52.      <title>Exploring Monads with JavaScript</title>
  53.      <link>https://tech.nextroll.com/blog/dev/2022/11/11/exploring-monads-javascript.html</link>
  54.      <pubDate>Fri, 11 Nov 2022 00:00:00 -0800</pubDate>
  55.      <author></author>
  56.      <guid isPermaLink="false">https://tech.nextroll.com/blog/dev/2022/11/11/exploring-monads-javascript</guid>
  57.      <description>&lt;p&gt;I know that I’m some years (decades?) late, but here’s my take on writing a
  58. monad tutorial.&lt;/p&gt;
  59.  
  60. &lt;p&gt;When you start learning about Functional Programming, sooner or later, you will
  61. hear about “monads” - the secret handshake for functional programmers.&lt;/p&gt;
  62.  
  63. &lt;p&gt;For those coming from imperative languages, there’s a considerable
  64. learning curve to conquer before venturing into dealing with monads.&lt;/p&gt;
  65.  
  66. &lt;p&gt;Here we will see how the sausage is made in a mainstream language, JavaScript.
  67. I hope that it will help you visualize how the data plumbing works.
  68. After we understand the concept, we can also see how monads look in
  69. functional languages.&lt;/p&gt;
  70.  
  71. &lt;p&gt;Before we start, a warning: this is a more practical guide. We are going to skip
  72. some of the math involved. If these concepts happen to pique your
  73. interest, the book “&lt;a href=&quot;https://bartoszmilewski.com/2014/10/28/category-theory-for-programmers-the-preface/&quot;&gt;Category Theory for Programmers&lt;/a&gt;” is a good next step.&lt;/p&gt;
  74.  
  75. &lt;h1 id=&quot;not-just-lists&quot;&gt;Not just lists&lt;/h1&gt;
  76.  
  77. &lt;p&gt;One of the mistakes I see people make when they try to explain
  78. monads concisely is comparing them to lists.&lt;/p&gt;
  79.  
  80. &lt;p&gt;It is very tempting to use this example because lists and arrays are common
  81. data structures and programmers coming from imperative languages are familiar
  82. with at least one of these. The argument is quite appealing.&lt;/p&gt;
  83.  
  84. &lt;p&gt;This line of thinking can make newcomers believe that if they &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.flatMap&lt;/code&gt;
  85. a list, they have explored all that monads offer. Monads go well beyond
  86. that.&lt;/p&gt;
  87.  
  88. &lt;p&gt;Imagine someone asking what a &lt;em&gt;machine&lt;/em&gt; is and then getting the response:
  89. “a &lt;em&gt;car&lt;/em&gt; is a machine.” While this is true, one must be aware that there are
  90. many other types of machines: rockets, phones, clocks, etc.&lt;/p&gt;
  91.  
  92. &lt;h1 id=&quot;maybe-javascript&quot;&gt;Maybe JavaScript&lt;/h1&gt;
  93.  
  94. &lt;p&gt;For this tutorial, instead of lists, we will implement a data type
  95. called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Maybe&lt;/code&gt;.
  96. This data type helps when working with values that may or may not exist, avoiding
  97. common JS errors like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Cannot read properties of null&lt;/code&gt;.
  98. It is pretty common in functional languages - but if this is your first time
  99. seeing it, don’t worry - we will use just a basic implementation.
  100. Here are the minimal utility functions that we need to work with this data type:&lt;/p&gt;
  101.  
  102. &lt;div class=&quot;language-javascript highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kd&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;maybe&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  103.  &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;===&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;null&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  104.    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  105.      &lt;span class=&quot;na&quot;&gt;tag&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;nothing&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  106.    &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  107.  &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;
  108.    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  109.      &lt;span class=&quot;na&quot;&gt;tag&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;just&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  110.      &lt;span class=&quot;nx&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  111.    &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  112. &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  113.  
  114. &lt;span class=&quot;kd&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;exists&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;someMaybe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;someMaybe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;tag&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;===&lt;/span&gt; &lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;just&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  115. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  116.  
  117. &lt;p&gt;The code is pretty self-explanatory: some value comes in, we check whether it is
  118. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;null&lt;/code&gt;, and we return a wrapper with a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tag&lt;/code&gt; property that labels which
  119. kind of data is being held. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;exists&lt;/code&gt; function tells us whether the
  120. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tag&lt;/code&gt; is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;just&lt;/code&gt;, that is, whether a value is present.&lt;/p&gt;
  121.  
  122. &lt;p&gt;We can imagine that this code receives some &lt;em&gt;nullable thing&lt;/em&gt; and returns a
  123. &lt;em&gt;maybe thing&lt;/em&gt;, where &lt;em&gt;thing&lt;/em&gt; could be &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;string&lt;/code&gt; or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;number&lt;/code&gt; or whatever.&lt;/p&gt;
  124.  
  125. &lt;p&gt;When talking about the types involved in this operation, we can replace the
  126. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;string&lt;/code&gt; or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;number&lt;/code&gt; part with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a&lt;/code&gt;, and the &lt;em&gt;maybe&lt;/em&gt; part with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;m&lt;/code&gt;. A function
  127. returning something can be described as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-&amp;gt;&lt;/code&gt;.
  128. By doing this, we can say that the type of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;maybe&lt;/code&gt; function is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a -&amp;gt; m a&lt;/code&gt;.&lt;/p&gt;
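To make the `a -> m a` shape concrete, here is a minimal sketch that wraps a few values and inspects the results. It restates the two helpers from the listing above in a condensed form; the `wrappedNumber`, `wrappedString` and `wrappedNull` names are only illustrative and do not appear in the original post.

```javascript
// Condensed restatement of the tutorial's helpers (same behavior).
const maybe = (value) =>
  value === null ? { tag: 'nothing' } : { tag: 'just', value };

const exists = (someMaybe) => someMaybe.tag === 'just';

// a -> m a: any plain value goes in, a tagged wrapper comes out.
const wrappedNumber = maybe(3);    // { tag: 'just', value: 3 }
const wrappedString = maybe('hi'); // { tag: 'just', value: 'hi' }
const wrappedNull = maybe(null);   // { tag: 'nothing' }

console.log(exists(wrappedNumber), exists(wrappedNull)); // true false
```

Note that this sketch, like the original helper, only treats `null` as missing; `undefined` would be wrapped as a `just`.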
  129.  
  130. &lt;p&gt;Working with values wrapped in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Maybe&lt;/code&gt; will look like this:&lt;/p&gt;
  131.  
  132. &lt;div class=&quot;language-javascript highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kd&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;sum&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  133.  
  134. &lt;span class=&quot;cm&quot;&gt;/** Receives two maybes (ma and mb) */&lt;/span&gt;
  135. &lt;span class=&quot;kd&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;process&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;mb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  136.  &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;exists&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;exists&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;mb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  137.    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;mb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  138.  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  139.    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;null&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  140.  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  141. &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  142.  
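  143. &lt;span class=&quot;nx&quot;&gt;process&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;maybe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;maybe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
  144. &lt;span class=&quot;c1&quot;&gt;// 5;&lt;/span&gt;
  145.  
  146. &lt;span class=&quot;nx&quot;&gt;process&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;maybe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;maybe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;null&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
  147. &lt;span class=&quot;c1&quot;&gt;// null;&lt;/span&gt;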
  148. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  149.  
  150. &lt;p&gt;As you can see, we are not gaining much by using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Maybe&lt;/code&gt; instead of checking if
  151. a value equals &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;null&lt;/code&gt; - we are just calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;exists&lt;/code&gt; all the time.
  152. Another code smell is that we are returning &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;null&lt;/code&gt; in the “failure” path.
  153. We threw away our safety harness and now we are letting &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;null&lt;/code&gt;s wander inside
  154. our app without adult supervision.&lt;/p&gt;
  155.  
  156. &lt;h1 id=&quot;map-all-the-way&quot;&gt;Map all the way!&lt;/h1&gt;
  157.  
  158. &lt;p&gt;This can be improved by using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;map&lt;/code&gt;, enabling us to provide the hypothetical
  159. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;m a&lt;/code&gt; value to a function, if said function accepts an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a&lt;/code&gt;:&lt;/p&gt;
  160.  
  161. &lt;div class=&quot;language-javascript highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kd&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;map&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;fn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt;
  162.  &lt;span class=&quot;nx&quot;&gt;exists&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;?&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;maybe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;fn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;ma&lt;/span&gt;
  163.  
  164. &lt;span class=&quot;kd&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;addThree&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;
  165.  
  166. &lt;span class=&quot;nx&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;addThree&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;maybe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  167. &lt;span class=&quot;c1&quot;&gt;// {&lt;/span&gt;
  168. &lt;span class=&quot;c1&quot;&gt;//   tag: &quot;just&quot;,&lt;/span&gt;
  169. &lt;span class=&quot;c1&quot;&gt;//   value: 6&lt;/span&gt;
  170. &lt;span class=&quot;c1&quot;&gt;// }&lt;/span&gt;
  171. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  172.  
  173. &lt;p&gt;So far, so good. Now we can “inject” a function to operate on values that might
  174. not exist.
  175. We also get a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Maybe&lt;/code&gt; back, instead of a raw value. It might be
  176. tempting to “extract” the value after we are done, but keeping it “wrapped”
  177. ensures that consumers of our code will be able to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;map&lt;/code&gt; their functions safely
  178. on it. This also makes more sense than returning &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;null&lt;/code&gt;: if we are adding a
  179. number to something that might not exist, then the result might not exist as
  180. well.
  181. Tip: once you step into &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Maybe&lt;/code&gt; land, stay there (otherwise you are lying to
  182. yourself and not handling all possible states properly).&lt;/p&gt;
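As a small illustration of staying in `Maybe` land, the sketch below chains two `map` calls over a present value and over a missing one. The `double` function and the `fromValue` / `fromNull` names are hypothetical additions for this example; the helpers are condensed restatements of the ones in the listing.

```javascript
// Condensed restatement of the tutorial's helpers (same behavior).
const maybe = (value) =>
  value === null ? { tag: 'nothing' } : { tag: 'just', value };
const exists = (someMaybe) => someMaybe.tag === 'just';
const map = (fn) => (ma) => (exists(ma) ? maybe(fn(ma.value)) : ma);

const addThree = (a) => a + 3;
const double = (a) => a * 2; // hypothetical second step

// Each step receives a Maybe and returns a Maybe, so steps compose freely.
const fromValue = map(double)(map(addThree)(maybe(3)));
const fromNull = map(double)(map(addThree)(maybe(null)));

console.log(fromValue); // { tag: 'just', value: 12 }
console.log(fromNull);  // { tag: 'nothing' }
```

In the missing case neither `addThree` nor `double` is ever called: the "nothing" wrapper is passed through untouched, which is why no explicit null check is needed between the steps.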
  183.  
  184. &lt;h1 id=&quot;lets-make-it-harder&quot;&gt;Let’s make it harder&lt;/h1&gt;
  185.  
  186. &lt;p&gt;In the real world, things are usually more complicated than this example.
  187. We are probably going to have multiple “maybe” values coming in.
  188. Let’s see how we can sum two nullable values - for that, we can use a nested
  189. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;map&lt;/code&gt; to access both values:&lt;/p&gt;
  190.  
  191. &lt;div class=&quot;language-javascript highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kd&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;process&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;mb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt;
  192.  &lt;span class=&quot;nx&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt;
  193.    &lt;span class=&quot;nx&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt;
  194.      &lt;span class=&quot;nx&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  195.    &lt;span class=&quot;p&quot;&gt;)(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;mb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  196.  &lt;span class=&quot;p&quot;&gt;)(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  197.  
  198. &lt;span class=&quot;nx&quot;&gt;process&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;maybe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;maybe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
  199. &lt;span class=&quot;c1&quot;&gt;// {&lt;/span&gt;
  200. &lt;span class=&quot;c1&quot;&gt;//   tag: &apos;just&apos;,&lt;/span&gt;
  201. &lt;span class=&quot;c1&quot;&gt;//   value: {&lt;/span&gt;
  202. &lt;span class=&quot;c1&quot;&gt;//     tag: &apos;just&apos;,&lt;/span&gt;
  203. &lt;span class=&quot;c1&quot;&gt;//     value: 5,&lt;/span&gt;
  204. &lt;span class=&quot;c1&quot;&gt;//   },&lt;/span&gt;
  205. &lt;span class=&quot;c1&quot;&gt;// };&lt;/span&gt;
  206. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  207.  
  208. &lt;p&gt;FP folks are probably screaming “Use apply!!”, but let’s restrict the scope
  209. of this tutorial.&lt;/p&gt;
  210.  
  211. &lt;p&gt;These nested &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;map&lt;/code&gt; calls are hard to read, and there’s also an unfortunate effect:
  212. we got a nested &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Maybe&lt;/code&gt; as a result!
  213. If we add more variables (as in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;map (map (map ...))&lt;/code&gt;), more nested levels will
  214. be added as well.
  215. Working with this type of result is possible but burdensome:&lt;/p&gt;
  216.  
  217. &lt;div class=&quot;language-javascript highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kd&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;mma&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;process&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;maybe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;maybe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
  218.  
  219. &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;exists&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;mma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  220.  &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;exists&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;mma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  221.    &lt;span class=&quot;nx&quot;&gt;console&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;log&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;mma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  222.  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  223. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  224.  
  225. &lt;span class=&quot;c1&quot;&gt;//or by mapping:&lt;/span&gt;
  226.  
  227. &lt;span class=&quot;nx&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt;
  228.  &lt;span class=&quot;nx&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt;
  229.    &lt;span class=&quot;nx&quot;&gt;console&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;log&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  230.  &lt;span class=&quot;p&quot;&gt;)(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  231. &lt;span class=&quot;p&quot;&gt;)(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;mma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  232. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  233.  
  234. &lt;h1 id=&quot;crunching-data&quot;&gt;Crunching data&lt;/h1&gt;
  235.  
  236. &lt;p&gt;Let’s create a function that will help ease the pain of working with nested
  237. values, as this nesting got out of hand quickly.&lt;/p&gt;
  238.  
  239. &lt;div class=&quot;language-javascript highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kd&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;join&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;mma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;exists&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;mma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;?&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;mma&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;mma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  240. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  241.  
  242. &lt;p&gt;This is a very simple function:&lt;/p&gt;
  243.  
  244. &lt;ul&gt;
  245.  &lt;li&gt;If we are in the “nothing” case, return it (as “nothing” can have no
  246. children).&lt;/li&gt;
  247.  &lt;li&gt;Otherwise, ignore the outer layer and return the inner, wrapped value.&lt;/li&gt;
  248. &lt;/ul&gt;
  249.  
  250. &lt;p&gt;The type of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;join&lt;/code&gt; is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;m (m a) -&amp;gt; m a&lt;/code&gt;.&lt;/p&gt;
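A short sketch of `m (m a) -> m a` in action, assuming the same helpers as above (restated in condensed form). The `flattened`, `outerNothing` and `innerNothing` names are illustrative only.

```javascript
// Condensed restatement of the tutorial's helpers (same behavior).
const maybe = (value) =>
  value === null ? { tag: 'nothing' } : { tag: 'just', value };
const exists = (someMaybe) => someMaybe.tag === 'just';
const join = (mma) => (!exists(mma) ? mma : mma.value);

// just(just(5)) collapses to just(5).
const flattened = join(maybe(maybe(5)));

// A "nothing" on the outside stays "nothing".
const outerNothing = join(maybe(null));

// just(nothing) collapses to the inner "nothing".
const innerNothing = join(maybe(maybe(null)));

console.log(flattened);    // { tag: 'just', value: 5 }
console.log(outerNothing); // { tag: 'nothing' }
console.log(innerNothing); // { tag: 'nothing' }
```

Whichever layer is missing, the caller ends up with a single-level `Maybe`, so one `exists` check (or one `map`) is enough afterwards.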
  251.  
  252. &lt;p&gt;Now, for each variable that we add, a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;join&lt;/code&gt; call needs to be added as well:&lt;/p&gt;
  253.  
  254. &lt;div class=&quot;language-javascript highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kd&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;process&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;mb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;mc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt;
  255.  &lt;span class=&quot;nx&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt;
  256.    &lt;span class=&quot;nx&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt;
  257.      &lt;span class=&quot;nx&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt;
  258.        &lt;span class=&quot;nx&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;c&lt;/span&gt;
  259.      &lt;span class=&quot;p&quot;&gt;)(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;mc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  260.    &lt;span class=&quot;p&quot;&gt;)(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;mb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  261.  &lt;span class=&quot;p&quot;&gt;)(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
  262.  
  263. &lt;span class=&quot;c1&quot;&gt;// Notice that we have two `join(map(` segments&lt;/span&gt;
  264. &lt;span class=&quot;c1&quot;&gt;// and one `map(` segment. This will be relevant below.&lt;/span&gt;
  265. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  266.  
  267. &lt;p&gt;This is getting even worse to read! Let’s try refactoring this.&lt;/p&gt;
  268.  
  269. &lt;p&gt;First, let’s replace the innermost, solitary, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;map(&lt;/code&gt; segment with
  270. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;join(map&lt;/code&gt;. The result looks like this:&lt;/p&gt;
  271.  
  272. &lt;div class=&quot;language-javascript highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kd&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;process&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;mb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;mc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt;
  273.  &lt;span class=&quot;nx&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt;
  274.    &lt;span class=&quot;nx&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt;
  275.      &lt;span class=&quot;nx&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt;
  276.        &lt;span class=&quot;nx&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;c&lt;/span&gt;
  277.      &lt;span class=&quot;p&quot;&gt;)(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;mc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  278.    &lt;span class=&quot;p&quot;&gt;)(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;mb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  279.  &lt;span class=&quot;p&quot;&gt;)(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
  280.  
  281. &lt;span class=&quot;nx&quot;&gt;process&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;maybe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;maybe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;maybe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
  282. &lt;span class=&quot;c1&quot;&gt;// 3;&lt;/span&gt;
  283. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  284.  
  285. &lt;p&gt;This works, but the resulting value is not wrapped inside a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Maybe&lt;/code&gt;. If our
  286. inputs may not exist, coherence dictates that the result might not exist as well.
  287. To fix this, we can add a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;maybe&lt;/code&gt; around the innermost expression (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a + b + c&lt;/code&gt;):&lt;/p&gt;
  288.  
  289. &lt;div class=&quot;language-javascript highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kd&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;process&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;mb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;mc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt;
  290.  &lt;span class=&quot;nx&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt;
  291.    &lt;span class=&quot;nx&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt;
  292.      &lt;span class=&quot;nx&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt;
  293.        &lt;span class=&quot;nx&quot;&gt;maybe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  294.      &lt;span class=&quot;p&quot;&gt;)(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;mc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  295.    &lt;span class=&quot;p&quot;&gt;)(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;mb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  296.  &lt;span class=&quot;p&quot;&gt;)(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
  297.  
  298. &lt;span class=&quot;nx&quot;&gt;process&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;maybe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;maybe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;maybe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
  299. &lt;span class=&quot;c1&quot;&gt;// {&lt;/span&gt;
  300. &lt;span class=&quot;c1&quot;&gt;//   tag: &apos;just&apos;,&lt;/span&gt;
  301. &lt;span class=&quot;c1&quot;&gt;//   value: 3,&lt;/span&gt;
  302. &lt;span class=&quot;c1&quot;&gt;// };&lt;/span&gt;
  303. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  304.  
  305. &lt;p&gt;Now we have a working solution, but it looks like alphabet soup.
  306. Let’s try combining (&lt;em&gt;composing&lt;/em&gt;) the repeated pattern &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;join(map&lt;/code&gt; into a new
  307. function:&lt;/p&gt;
  308.  
  309. &lt;div class=&quot;language-javascript highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kd&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;bind&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;fn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;fn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
  310. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  311.  
  312. &lt;p&gt;As you can see, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bind&lt;/code&gt; equals “&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;map&lt;/code&gt;, then &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;join&lt;/code&gt;”.&lt;/p&gt;
  313.  
  314. &lt;p&gt;Now let’s replace all &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;join(map&lt;/code&gt; expressions with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bind&lt;/code&gt;.&lt;/p&gt;
  315.  
  316. &lt;div class=&quot;language-javascript highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kd&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;process&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;mb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;mc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt;
  317.  &lt;span class=&quot;nx&quot;&gt;bind&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)((&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt;
  318.    &lt;span class=&quot;nx&quot;&gt;bind&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;mb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)((&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt;
  319.      &lt;span class=&quot;nx&quot;&gt;bind&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;mc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)((&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt;
  320.        &lt;span class=&quot;nx&quot;&gt;maybe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  321.      &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  322.    &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  323.  &lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  324.  
  325. &lt;span class=&quot;nx&quot;&gt;process&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;maybe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;maybe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;maybe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
  326. &lt;span class=&quot;c1&quot;&gt;// {&lt;/span&gt;
  327. &lt;span class=&quot;c1&quot;&gt;//   tag: &apos;just&apos;,&lt;/span&gt;
  328. &lt;span class=&quot;c1&quot;&gt;//   value: 6,&lt;/span&gt;
  329. &lt;span class=&quot;c1&quot;&gt;// };&lt;/span&gt;
  330. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  331.  
  332. &lt;p&gt;Now even if we add more variables, they will all collapse into a single, flat
  333. maybe value, because we call &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;join&lt;/code&gt; once per level.&lt;/p&gt;
  334.  
  335. &lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bind&lt;/code&gt; function allowed our nested structure to be executed as a sequence of
  336. steps.&lt;/p&gt;
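
&lt;p&gt;As a minimal sketch of that sequencing, the snippet below shows the chain stopping early when one input is missing. The definitions of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;maybe&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;map&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;join&lt;/code&gt; are stand-ins assumed for this example (in particular, treating null and undefined as “nothing” is an assumption), and the function is called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sumThree&lt;/code&gt; instead of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;process&lt;/code&gt; only to avoid shadowing Node’s global of the same name.&lt;/p&gt;

```javascript
// Assumed stand-ins for the helpers used in this post.
const nothing = { tag: 'nothing' };
const just = (value) => ({ tag: 'just', value });

// Wrap a value; null and undefined become "nothing" (an assumption of this sketch).
const maybe = (value) =>
  value === null || value === undefined ? nothing : just(value);

// Apply fn to the wrapped value, leaving "nothing" untouched.
const map = (fn) => (ma) => (ma.tag === 'nothing' ? ma : just(fn(ma.value)));

// Remove one layer of wrapping.
const join = (mma) => (mma.tag === 'nothing' ? mma : mma.value);

// "map, then join"
const bind = (ma) => (fn) => join(map(fn)(ma));

const sumThree = (ma, mb, mc) =>
  bind(ma)((a) =>
    bind(mb)((b) =>
      bind(mc)((c) =>
        maybe(a + b + c)
      )
    )
  );

const allPresent = sumThree(maybe(1), maybe(2), maybe(3));
// { tag: 'just', value: 6 }

const oneMissing = sumThree(maybe(1), maybe(null), maybe(3));
// { tag: 'nothing' }
```

&lt;p&gt;Because &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;map&lt;/code&gt; skips its function for “nothing”, the callbacks nested below a missing value never run.&lt;/p&gt;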
  337.  
  338. &lt;p&gt;By the way, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bind&lt;/code&gt;’s type signature is quite popular: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;m a -&amp;gt; (a -&amp;gt; m b) -&amp;gt; m b&lt;/code&gt;.&lt;/p&gt;
  339.  
  340. &lt;p&gt;The final code doesn’t look very pretty, as those nested functions can be
  341. confusing to work with. Other languages have features that make working
  342. with this type of expression easier.&lt;/p&gt;
  343.  
  344. &lt;h1 id=&quot;entering-the-big-m-generalizing-plumbing&quot;&gt;Entering the Big M: Generalizing Plumbing&lt;/h1&gt;
  345.  
  346. &lt;p&gt;You have now learned how to create a system that performs a sequence of computations
  347. involving &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Maybe&lt;/code&gt;.&lt;/p&gt;
  348.  
  349. &lt;p&gt;But now imagine Morpheus from The Matrix entering the scene and saying:&lt;/p&gt;
  350.  
  351. &lt;p&gt;“What if I told you that this pattern can be generalized to other data types?”&lt;/p&gt;
  352.  
  353. &lt;p&gt;Remember when we saw &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bind&lt;/code&gt;’s type? That &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;m&lt;/code&gt; doesn’t stand for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Maybe&lt;/code&gt;. It stands for
  354. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Monad&lt;/code&gt;.
  355. When we say &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;m a&lt;/code&gt;, we are not talking only about a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Maybe a&lt;/code&gt;, but about any “&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;m&lt;/code&gt;onadic” type
  356. that holds an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a&lt;/code&gt;. In the FP world there are many such types: a list (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;List a&lt;/code&gt;), a
  357. side effect (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;IO a&lt;/code&gt;), something that may contain an error (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Either err a&lt;/code&gt;), etc.&lt;/p&gt;
  358.  
  359. &lt;p&gt;This pattern is useful because we can generalize the act of “Work with this
  360. data, running the functions in sequence. If something goes wrong, short
  361. circuit”.&lt;/p&gt;
  362.  
  363. &lt;p&gt;The necessary elements to create this protocol are:&lt;/p&gt;
  364.  
  365. &lt;ul&gt;
  366.  &lt;li&gt;the capability of “flattening” our data structure with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;join&lt;/code&gt;: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;m (m a) -&amp;gt; m a&lt;/code&gt;;&lt;/li&gt;
  367.  &lt;li&gt;when calling our sum (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a + b + c&lt;/code&gt;) function, we need to place the result in
  368. the same data structure that we are using. In our code we called it &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;maybe&lt;/code&gt;, but
  369. in the protocol, it has a different name: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;return&lt;/code&gt;. Its type is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a -&amp;gt; m a&lt;/code&gt;.
  370. Depending on the context, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;return&lt;/code&gt; can also be called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pure&lt;/code&gt;.&lt;/li&gt;
  371. &lt;/ul&gt;
  372.  
  373. &lt;p&gt;The catch is that each datatype handles “wrapping”, “collapsing” and “short
  374. circuiting” differently, depending on what these actions mean for that type.
  375. This is very important for FP, because this enables the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;IO&lt;/code&gt; monad to run
  376. actions with side effects, then wrap the results back into lazy functions.&lt;/p&gt;
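
&lt;p&gt;To illustrate how these actions change meaning from one type to another, here is a rough sketch of the same three operations for plain JavaScript arrays, standing in for a list type. The names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pure&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sumPairs&lt;/code&gt; are made up for this example. Here “wrapping” builds a one-element array, “collapsing” flattens one level of nesting, and “short circuiting” is what happens when an array is empty.&lt;/p&gt;

```javascript
// Array counterparts of the protocol (names are illustrative only).
const pure = (value) => [value];                   // a -> m a
const map = (fn) => (xs) => xs.map((x) => fn(x));  // (a -> b) -> m a -> m b
const join = (xss) => xss.flat();                  // m (m a) -> m a
const bind = (xs) => (fn) => join(map(fn)(xs));    // m a -> (a -> m b) -> m b

// Same shape as the Maybe version, but over arrays.
const sumPairs = (xs, ys) =>
  bind(xs)((x) =>
    bind(ys)((y) =>
      pure(x + y)
    )
  );

const sums = sumPairs([1, 2], [10, 20]);
// [11, 21, 12, 22]: every combination of x and y

const none = sumPairs([1, 2], []);
// []: an empty array plays the role that "nothing" played for Maybe
```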
  377.  
  378. &lt;h1 id=&quot;how-this-looks-outside-js-land&quot;&gt;How this looks outside JS-land&lt;/h1&gt;
  379.  
  380. &lt;p&gt;I think that we pushed JavaScript as far as it can go. Working with this kind
  381. of expression is easier in functional languages, as they eat “wrapped values” for
  382. breakfast. (We can go further if we use a JS library like fp-ts or Sanctuary.)&lt;/p&gt;
  383.  
  384. &lt;p&gt;In functional languages, you can either call &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bind&lt;/code&gt; directly or use it as an
  385. operator &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;gt;&amp;gt;=&lt;/code&gt;.&lt;/p&gt;
  386.  
  387. &lt;p&gt;Our example from above:&lt;/p&gt;
  388.  
  389. &lt;div class=&quot;language-javascript highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kd&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;process&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;mb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;mc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt;
  390.  &lt;span class=&quot;nx&quot;&gt;bind&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)((&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt;
  391.    &lt;span class=&quot;nx&quot;&gt;bind&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;mb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)((&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt;
  392.      &lt;span class=&quot;nx&quot;&gt;bind&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;mc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)((&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt;
  393.        &lt;span class=&quot;nx&quot;&gt;maybe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  394.      &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  395.    &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  396.  &lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  397. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  398.  
  399. &lt;p&gt;In a language like Haskell, the code becomes:&lt;/p&gt;
  400.  
  401. &lt;div class=&quot;language-haskell highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;process&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ma&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mb&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
  402.  &lt;span class=&quot;n&quot;&gt;ma&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;\&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  403.    &lt;span class=&quot;n&quot;&gt;mb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;\&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  404.      &lt;span class=&quot;n&quot;&gt;mc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;\&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  405.        &lt;span class=&quot;n&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  406. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  407.  
  408. &lt;p&gt;By the way, check out &lt;a href=&quot;https://www.haskell.org/&quot;&gt;haskell.org&lt;/a&gt; and notice how the logo is similar to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bind&lt;/code&gt;.
  409. Those folks take &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;gt;&amp;gt;=&lt;/code&gt; seriously.&lt;/p&gt;
  410.  
  411. &lt;p&gt;Now let’s see what the same code looks like when using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;do&lt;/code&gt; notation, which is
  412. syntax sugar over &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bind&lt;/code&gt;:&lt;/p&gt;
  413.  
  414. &lt;div class=&quot;language-haskell highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;process&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ma&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mb&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;do&lt;/span&gt;
  415.  &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ma&lt;/span&gt;
  416.  &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mb&lt;/span&gt;
  417.  &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mc&lt;/span&gt;
  418.  &lt;span class=&quot;n&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  419. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  420.  
  421. &lt;p&gt;The final result looks like code in an imperative language. One might argue that
  422. when you type &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;;&lt;/code&gt; in JS, you are, in fact, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bind&lt;/code&gt;ing monadic code (I first
  423. learned about this in Bartosz Milewski’s course &lt;a href=&quot;https://www.youtube.com/watch?v=PlFgKV0ZXoE&amp;amp;list=PLbgaMIhjbmEm_51-HWv9BQUXcmHYtl4sw&quot;&gt;Parallel and Concurrent Haskell&lt;/a&gt;).&lt;/p&gt;
  424.  
  425. &lt;p&gt;The interesting thing about this snippet is that the same syntax for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;process&lt;/code&gt;
  426. will work for other data types beyond &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Maybe&lt;/code&gt;. The only requirements for that
  427. are:&lt;/p&gt;
  428.  
  429. &lt;ul&gt;
  430.  &lt;li&gt;the data type (the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;m&lt;/code&gt; in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;m a&lt;/code&gt;) “knows” how to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bind&lt;/code&gt; (and to do that, the
  431. data type needs to have some sort of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;map&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;join&lt;/code&gt;);&lt;/li&gt;
  432.  &lt;li&gt;the data type “knows” how to wrap a value with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;return&lt;/code&gt;/&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pure&lt;/code&gt;.&lt;/li&gt;
  433. &lt;/ul&gt;
  434.  
  435. &lt;p&gt;And as we are performing a sum, there’s an additional requirement:&lt;/p&gt;
  436.  
  437. &lt;ul&gt;
  438.  &lt;li&gt;the contained data (the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a&lt;/code&gt; in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;m a&lt;/code&gt;) “knows” what to do with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;+&lt;/code&gt;.&lt;/li&gt;
  439. &lt;/ul&gt;
  440.  
  441. &lt;p&gt;As we are using JavaScript, this means that we can use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;+&lt;/code&gt; to sum numbers or concatenate strings.
  442. In the FP world, a function “knowing” what to do with a given type is handled
  443. by type classes, but let’s focus on this monadic stuff for now.&lt;/p&gt;
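
&lt;p&gt;JavaScript has no type classes, but the requirements above can be approximated by passing the operations around explicitly. The sketch below is one possible encoding, not an established API: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Maybe&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Either&lt;/code&gt; objects and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sumThreeWith&lt;/code&gt; name are hypothetical, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bind&lt;/code&gt; is written directly for brevity instead of being derived from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;map&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;join&lt;/code&gt;.&lt;/p&gt;

```javascript
// Hypothetical "monad dictionaries": each data type supplies its own pure and bind.
const Maybe = {
  pure: (value) => ({ tag: 'just', value }),
  bind: (ma) => (fn) => (ma.tag === 'nothing' ? ma : fn(ma.value)),
};

const Either = {
  pure: (value) => ({ tag: 'right', value }),
  bind: (ma) => (fn) => (ma.tag === 'left' ? ma : fn(ma.value)),
};

// Written once, against the interface only.
const sumThreeWith = ({ pure, bind }) => (ma, mb, mc) =>
  bind(ma)((a) =>
    bind(mb)((b) =>
      bind(mc)((c) =>
        pure(a + b + c)
      )
    )
  );

const maybeSum = sumThreeWith(Maybe)(Maybe.pure(1), Maybe.pure(2), Maybe.pure(3));
// { tag: 'just', value: 6 }

const eitherError = sumThreeWith(Either)(
  Either.pure(1),
  { tag: 'left', value: 'an error happened!' },
  Either.pure(3)
);
// { tag: 'left', value: 'an error happened!' }

// The contained values only need to support `+`, so strings work too.
const words = sumThreeWith(Maybe)(Maybe.pure('a'), Maybe.pure('b'), Maybe.pure('c'));
// { tag: 'just', value: 'abc' }
```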
  444.  
  445. &lt;h1 id=&quot;why-monads&quot;&gt;Why Monads?&lt;/h1&gt;
  446.  
  447. &lt;p&gt;So this is the power that this abstraction provides: all the data plumbing and
  448. the handling of small details are abstracted behind the same interface for multiple
  449. data types.&lt;/p&gt;
  450.  
  451. &lt;p&gt;In Haskell, here’s what the definitions would look like (with types included):&lt;/p&gt;
  452.  
  453. &lt;div class=&quot;language-haskell highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;process&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Monad&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Num&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;
  454. &lt;span class=&quot;n&quot;&gt;process&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ma&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;do&lt;/span&gt;
  455.  &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ma&lt;/span&gt;
  456.  &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mb&lt;/span&gt;
  457.  &lt;span class=&quot;n&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  458.  
  459. &lt;span class=&quot;c1&quot;&gt;-- Adding things that might not exist&lt;/span&gt;
  460. &lt;span class=&quot;n&quot;&gt;processMaybe&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Maybe&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Integer&lt;/span&gt;
  461. &lt;span class=&quot;n&quot;&gt;processMaybe&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;process&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Just&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Nothing&lt;/span&gt;
  462.  
  463. &lt;span class=&quot;c1&quot;&gt;-- Adding values that might contain errors&lt;/span&gt;
  464. &lt;span class=&quot;n&quot;&gt;processEither&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Either&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Integer&lt;/span&gt;
  465. &lt;span class=&quot;n&quot;&gt;processEither&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;process&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Right&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Left&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;an error happened!&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  466.  
  467. &lt;span class=&quot;c1&quot;&gt;-- Adding something that was obtained with a side effect&lt;/span&gt;
  468. &lt;span class=&quot;c1&quot;&gt;-- `process` expects the actions themselves, so we pass them unbound;&lt;/span&gt;
  469. &lt;span class=&quot;c1&quot;&gt;-- binding them first with `&amp;lt;-` would hand it plain values instead.&lt;/span&gt;
  470. &lt;span class=&quot;n&quot;&gt;processIO&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;process&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;getStuffFromDB&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;getStuffFromFile&lt;/span&gt;
  472. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  473.  
  474. &lt;p&gt;If your function only cares about adding two numbers, let it just sum.
  475. The error handling can be pushed outside of sum, allowing you to
  476. separate the pure and impure parts of your code even further.&lt;/p&gt;
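&lt;p&gt;As a rough illustration of pushing the error handling outside the pure function, here is a minimal Python sketch. The names add and lift_optional are hypothetical and only mimic the idea for values that might be missing; this is not the Haskell implementation shown above.&lt;/p&gt;

```python
# Illustrative sketch: the pure function knows nothing about missing values,
# and a small wrapper deals with the "might not exist" case around it.

def add(a, b):
    """Pure addition with no error handling inside."""
    return a + b


def lift_optional(fn):
    """Wrap fn so that it returns None as soon as any argument is None."""
    def lifted(*args):
        if any(arg is None for arg in args):
            return None
        return fn(*args)
    return lifted


add_optional = lift_optional(add)

print(add_optional(3, 4))     # both values present
print(add_optional(3, None))  # one value missing, so the result is None
```

&lt;p&gt;The same add function could be wrapped differently for error-carrying values without changing it, which is the separation the paragraph above describes.&lt;/p&gt;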
  477.  
  478. &lt;p&gt;This abstraction also helps with testing (see free monads), as you can swap the monad
  479. you are using during testing (much like stubbing or mocking in some testing
  480. frameworks). Functions that obtain their values through side effects can then be
  481. replaced with ones that return an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Either&lt;/code&gt; or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Maybe&lt;/code&gt;, among others.&lt;/p&gt;
  482.  
  483. &lt;p&gt;Monads also make refactoring easier - if you have a set of functions that handle
  484. some &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Maybe&lt;/code&gt;s using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bind&lt;/code&gt;, you should be able to replace the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Maybe&lt;/code&gt;s with
  485. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Either&lt;/code&gt;, if the necessity arises.&lt;/p&gt;
  486.  
  487. &lt;p&gt;And that’s it! I tried to keep it short, but the topic is quite extensive and I
  488. barely scratched the surface. Even if you don’t end up using monads, you now
  489. have a better idea of the kinds of problems they help solve.&lt;/p&gt;
  490.  
  491. &lt;p&gt;Now go ahead and write your monad tutorial as well!&lt;/p&gt;
  492.  
  493. &lt;h2 id=&quot;other-resources&quot;&gt;Other resources&lt;/h2&gt;
  494.  
  495. &lt;p&gt;If you are interested in using these data structures in JavaScript, along with
  496. their useful &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;map&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bind&lt;/code&gt; functions, here are some libraries that can help:&lt;/p&gt;
  497.  
  498. &lt;ul&gt;
  499.  &lt;li&gt;&lt;a href=&quot;https://github.com/gcanti/fp-ts&quot;&gt;fp-ts&lt;/a&gt; - implementations of various data
  500. types and utility functions in TypeScript;&lt;/li&gt;
  501.  &lt;li&gt;&lt;a href=&quot;https://github.com/sanctuary-js/sanctuary&quot;&gt;Sanctuary&lt;/a&gt; - provides utilities
  502. for writing code in an FP style and some type checking while in development mode;&lt;/li&gt;
  503.  &lt;li&gt;&lt;a href=&quot;https://www.purescript.org/&quot;&gt;PureScript&lt;/a&gt; - not JavaScript, but it compiles to
  504. JS and can easily interface with existing JS code.&lt;/li&gt;
  505. &lt;/ul&gt;
  506. </description>
  507.    </item>
  508.    
  509.    
  510.    
  511.    <item>
  512.      <title>Scheduled Tasks with ECS and Step Functions</title>
  513.      <link>https://tech.nextroll.com/blog/dev/2022/08/03/step-functions.html</link>
  514.      <pubDate>Wed, 03 Aug 2022 00:00:00 -0700</pubDate>
  515.      <author></author>
  516.      <guid isPermaLink="false">https://tech.nextroll.com/blog/dev/2022/08/03/step-functions</guid>
  517.      <description>&lt;h1 id=&quot;background&quot;&gt;Background&lt;/h1&gt;
  518.  
  519. &lt;p&gt;Amazon Web Services allows you to &lt;a href=&quot;https://docs.aws.amazon.com/AmazonECS/latest/developerguide/scheduling_tasks.html&quot;&gt;schedule tasks&lt;/a&gt; using ECS and CloudWatch, and for noncritical jobs, it is enough to run these regularly. It is simple, and we can manage many scheduled tasks with code using Terraform without the overhead and complexity of cloud orchestration systems such as Kubernetes.&lt;/p&gt;
  520.  
  521. &lt;p&gt;But there’s a caveat to task scheduling with ECS + CloudWatch. The duo is missing two essential features for our workflow: task locks and timeouts. This post will cover how we leveraged two other AWS services, Step Functions and DynamoDB, to add these missing features.&lt;/p&gt;
  522.  
  523. &lt;h1 id=&quot;the-problem&quot;&gt;The Problem&lt;/h1&gt;
  524.  
  525. &lt;p&gt;In the Real-time Bidding (RTB) team within NextRoll, we previously managed scheduled tasks using &lt;a href=&quot;https://docs.celeryq.dev/en/stable/&quot;&gt;Celery&lt;/a&gt;, which is good in some respects, such as being flexible and integrating with Python (the language in which we wrote most of our tasks). But it also had significant drawbacks: it was hard to maintain and limited our programming language choices to Python. More importantly, our code deploys were tied to Celery deploys. That interaction caused minor but recurrent problems, such as task locks not getting released or multiple tasks of the same kind getting dispatched.&lt;/p&gt;
  526.  
  527. &lt;p&gt;Because fixing our problems with Celery was going to take a significant amount of time anyway, we decided to spend some of it investigating and comparing alternatives. To narrow the scope of our search, we defined some minimum requirements for the task management system:&lt;/p&gt;
  528.  
  529. &lt;ul&gt;
  530.  &lt;li&gt;
  531.    &lt;p&gt;Cron-like scheduling&lt;/p&gt;
  532.  
  533.    &lt;p&gt;The essential feature. It consists of running a task at fixed times, dates, or intervals.&lt;/p&gt;
  534.  &lt;/li&gt;
  535.  &lt;li&gt;
  536.    &lt;p&gt;Singleton behavior&lt;/p&gt;
  537.  
  538.    &lt;p&gt;We should be able to constrain a task definition so that only a single instance of it can run at any given time. This behavior helps prevent race conditions and resource starvation scenarios while producing repeatable side effects.&lt;/p&gt;
  539.  &lt;/li&gt;
  540.  &lt;li&gt;
  541.    &lt;p&gt;Timeouts&lt;/p&gt;
  542.  
  543.    &lt;p&gt;Tasks should also be constrained to run only for a limited, configurable interval. This behavior prevents runaway resource consumption and frozen tasks, and it helps identify bugs more quickly.&lt;/p&gt;
  544.  &lt;/li&gt;
  545. &lt;/ul&gt;
  546.  
  547. &lt;h1 id=&quot;some-solutions&quot;&gt;Some Solution(s)&lt;/h1&gt;
  548.  
  549. &lt;p&gt;We found a couple of options that met our minimum requirements, but they didn’t quite fit our taste:&lt;/p&gt;
  550.  
  551. &lt;ul&gt;
  552.  &lt;li&gt;
  553.    &lt;p&gt;Code it into each of the tasks, perhaps via shared libraries.&lt;/p&gt;
  554.  
  555.    &lt;p&gt;Sometimes, timeouts and locking do not work when managed by the task itself. Funny bugs and unexpected situations are not uncommon, as experienced when attempting this solution using Python in a previous iteration.&lt;/p&gt;
  556.  
  557.    &lt;p&gt;We prefer a solution that manages the task from outside its container for reliability and ease of use. It also allows for more flexibility when choosing a programming language for a task because we don’t need to have existing libraries in the target language to manage it.&lt;/p&gt;
  558.  &lt;/li&gt;
  559.  &lt;li&gt;
  560.    &lt;p&gt;Use a cloud orchestration system such as Kubernetes or Mesos.&lt;/p&gt;
  561.  
  562.    &lt;p&gt;Orchestration systems have their complexities and management overhead. We didn’t have a Kubernetes expert in our team, for example. And other systems don’t have managed solutions in AWS. Although we prefer cloud orchestration systems to code-in-tasks, we continued to look for something simpler.&lt;/p&gt;
  563.  &lt;/li&gt;
  564. &lt;/ul&gt;
  565.  
  566. &lt;p&gt;While good in some aspects, these solutions drove us to keep looking for alternatives. One of those was ECS + CloudWatch scheduling, which we almost discarded because there is no way to achieve our minimum requirements (&lt;a href=&quot;https://github.com/aws/containers-roadmap/issues/232&quot;&gt;singleton behavior&lt;/a&gt; or &lt;a href=&quot;https://github.com/aws/containers-roadmap/issues/1313&quot;&gt;timeout behavior&lt;/a&gt;) with only ECS and CloudWatch configurations.&lt;/p&gt;
  567.  
  568. &lt;h1 id=&quot;enter-step-functions&quot;&gt;Enter Step Functions&lt;/h1&gt;
  569.  
  570. &lt;p&gt;&lt;img src=&quot;/images/post_images/step_functions/gregorydickson.png&quot; alt=&quot;gregorydickson&quot; /&gt;&lt;/p&gt;
  571.  
  572. &lt;p&gt;Looking past the discouraging GitHub issues, we found a suggestion in the comments that could perhaps solve our problem. We looked at the Step Functions service, and we found that in combination with the ECS and CloudWatch systems, this met our requirements:&lt;/p&gt;
  573.  
  574. &lt;ul&gt;
  575.  &lt;li&gt;A managed system with a low overhead configuration&lt;/li&gt;
  576.  &lt;li&gt;&lt;em&gt;Outside-the-container&lt;/em&gt; task management for singleton behavior and timeouts&lt;/li&gt;
  577.  &lt;li&gt;Bonus: integrations with other AWS services&lt;/li&gt;
  578. &lt;/ul&gt;
  579.  
  580. &lt;p&gt;Although it is a novel approach, documentation is plentiful, and it was easy to test before fully committing to the solution. We used the Step Functions integration with DynamoDB to implement the singleton behavior using per-task locks and the integration with ECS tasks to define timeouts.&lt;/p&gt;
  581.  
  582. &lt;p&gt;But talk is cheap. On to the code!&lt;/p&gt;
  583.  
  584. &lt;h1 id=&quot;defining-the-step-functions&quot;&gt;Defining the Step Functions&lt;/h1&gt;
  585.  
  586. &lt;p&gt;&lt;img src=&quot;/images/post_images/step_functions/step-function.png&quot; alt=&quot;Step Function&quot; /&gt;&lt;/p&gt;
  587.  
  588. &lt;blockquote&gt;
  589.  &lt;p&gt;AWS Step Functions provides serverless orchestration for modern applications.&lt;/p&gt;
  590. &lt;/blockquote&gt;
  591.  
  592. &lt;p&gt;Step Functions are a list of steps or states (as in state machine) defined via JSON. You can define those steps using the AWS console for a visual aid useful for prototyping or, as we prefer, using Terraform and keeping our infrastructure as code.&lt;/p&gt;
  593.  
  594. &lt;p&gt;We start here intending to configure a single task with singleton behavior that times out after a configurable amount of time using Terraform. After setting this up, you should be able to trigger the task from the Step Functions section in the AWS console.&lt;/p&gt;
  595.  
  596. &lt;p&gt;Using the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;aws_sfn_state_machine&lt;/code&gt; resource from the Terraform AWS provider, we define a step function that checks for and adds a lock corresponding to the task. If that succeeds, it runs the task and removes the lock after it finishes. Here’s the code for the step function itself; the “steps” are nested inside the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;States&lt;/code&gt; map as key/value pairs:&lt;/p&gt;
  597.  
  598. &lt;div class=&quot;language-terraform highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;resource&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;aws_sfn_state_machine&quot;&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;task_lock&quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  599. &lt;span class=&quot;nx&quot;&gt;name&lt;/span&gt;     &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;task_name&lt;/span&gt;
  600. &lt;span class=&quot;nx&quot;&gt;role_arn&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;sfn_role&lt;/span&gt;
  601.  
  602. &lt;span class=&quot;nx&quot;&gt;definition&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;jsonencode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;
  603. &lt;span class=&quot;nx&quot;&gt;Comment&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;Task Lock&quot;&lt;/span&gt;
  604. &lt;span class=&quot;nx&quot;&gt;StartAt&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;CheckLock&quot;&lt;/span&gt;
  605. &lt;span class=&quot;nx&quot;&gt;States&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  606. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  607.  
  608. &lt;p&gt;Now we’ll define the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CheckLock&lt;/code&gt; step. This step adds an item corresponding to the task to DynamoDB on the condition that it doesn’t already exist; if that succeeds, we run the ECS task (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RunTask&lt;/code&gt; step). Here’s the step definition:&lt;/p&gt;
  609.  
  610. &lt;div class=&quot;language-terraform highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nx&quot;&gt;CheckLock&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  611. &lt;span class=&quot;nx&quot;&gt;Type&lt;/span&gt;     &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;Task&quot;&lt;/span&gt;
  612. &lt;span class=&quot;nx&quot;&gt;Resource&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;arn:aws:states:::dynamodb:putItem&quot;&lt;/span&gt;
  613. &lt;span class=&quot;nx&quot;&gt;Parameters&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  614. &lt;span class=&quot;nx&quot;&gt;TableName&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;table_name&lt;/span&gt;
  615. &lt;span class=&quot;nx&quot;&gt;Item&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  616. &lt;span class=&quot;nx&quot;&gt;Task&lt;/span&gt;   &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;S&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;task_name&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  617. &lt;span class=&quot;nx&quot;&gt;Locked&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;BOOL&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;true&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  618. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  619. &lt;span class=&quot;nx&quot;&gt;ConditionExpression&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;attribute_not_exists(Task)&quot;&lt;/span&gt;
  620. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  621. &lt;span class=&quot;nx&quot;&gt;Next&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;RunTask&quot;&lt;/span&gt;
  622. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  623. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  624.  
  625. &lt;p&gt;Here are the details:&lt;/p&gt;
  626.  
  627. &lt;ul&gt;
  628.  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Type&lt;/code&gt; is the &lt;a href=&quot;https://states-language.net/spec.html#state-type-table&quot;&gt;type of step&lt;/a&gt; we’re defining.&lt;/li&gt;
  629.  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Resource&lt;/code&gt; is the &lt;a href=&quot;https://docs.aws.amazon.com/step-functions/latest/dg/connect-supported-services.html&quot;&gt;AWS service integration&lt;/a&gt; the step will call.&lt;/li&gt;
  630.  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Parameters&lt;/code&gt; are the parameters passed to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Resource&lt;/code&gt;, different for each kind of resource. In particular, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ConditionExpression&lt;/code&gt; tells DynamoDB to return an error if the item key is in the table already.&lt;/li&gt;
  631.  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Next&lt;/code&gt; is the step that runs after this one.&lt;/li&gt;
  632. &lt;/ul&gt;
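&lt;p&gt;To make the conditional write concrete, here is a small Python simulation of what the lock check does. LockTable is a hypothetical in-memory stand-in, not a DynamoDB client, and the task name is made up; it only mimics the attribute_not_exists(Task) condition.&lt;/p&gt;

```python
class LockTable:
    """In-memory stand-in for the DynamoDB lock table (hypothetical helper)."""

    def __init__(self):
        self.items = {}

    def put_if_absent(self, task_name):
        """Mimic PutItem with ConditionExpression attribute_not_exists(Task)."""
        if task_name in self.items:
            # DynamoDB would reject the write, so the step would fail here.
            return False
        self.items[task_name] = {"Locked": True}
        return True


table = LockTable()
first = table.put_if_absent("nightly-report")   # lock acquired, the task may start
second = table.put_if_absent("nightly-report")  # lock still held, second run refused
print(first, second)
```

&lt;p&gt;Because the check and the write happen in one conditional operation, two executions started at the same moment cannot both acquire the lock.&lt;/p&gt;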
  633.  
  634. &lt;p&gt;For the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RunTask&lt;/code&gt; step we’ll run an ECS task and wait for it to finish. After the task finishes, we’ll go to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RemoveLock&lt;/code&gt; step:&lt;/p&gt;
  635.  
  636. &lt;div class=&quot;language-terraform highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nx&quot;&gt;RunTask&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  637. &lt;span class=&quot;nx&quot;&gt;Type&lt;/span&gt;     &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;Task&quot;&lt;/span&gt;
  638. &lt;span class=&quot;nx&quot;&gt;Resource&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;arn:aws:states:::ecs:runTask.sync&quot;&lt;/span&gt;
  639. &lt;span class=&quot;nx&quot;&gt;Parameters&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  640. &lt;span class=&quot;nx&quot;&gt;Cluster&lt;/span&gt;        &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;task_cluster&lt;/span&gt;
  641. &lt;span class=&quot;nx&quot;&gt;TaskDefinition&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;aws_ecs_task_definition&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;arn&lt;/span&gt;
  642. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  643. &lt;span class=&quot;nx&quot;&gt;TimeoutSeconds&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;timeout_minutes&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;60&lt;/span&gt;
  644. &lt;span class=&quot;nx&quot;&gt;Next&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;RemoveLock&quot;&lt;/span&gt;
  645. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  646. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  647.  
  648. &lt;p&gt;In the snippet above, the key parts are the resource &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;runTask.sync&lt;/code&gt;, where the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.sync&lt;/code&gt; suffix means the step function waits for the ECS task to complete, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TimeoutSeconds&lt;/code&gt;, after which the step function terminates the ECS task if it hasn’t finished already.&lt;/p&gt;
  649.  
  650. &lt;p&gt;Finally, we remove the lock:&lt;/p&gt;
  651.  
  652. &lt;div class=&quot;language-terraform highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nx&quot;&gt;RemoveLock&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  653. &lt;span class=&quot;nx&quot;&gt;Type&lt;/span&gt;     &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;Task&quot;&lt;/span&gt;
  654. &lt;span class=&quot;nx&quot;&gt;Resource&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;arn:aws:states:::dynamodb:deleteItem&quot;&lt;/span&gt;
  655. &lt;span class=&quot;nx&quot;&gt;Parameters&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  656. &lt;span class=&quot;nx&quot;&gt;TableName&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;table_name&lt;/span&gt;
  657. &lt;span class=&quot;nx&quot;&gt;Key&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  658. &lt;span class=&quot;nx&quot;&gt;Task&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;S&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;task_name&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  659. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  660. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  661. &lt;span class=&quot;nx&quot;&gt;End&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;true&lt;/span&gt;
  662. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  663. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  664. &lt;p&gt;Here, the only new parameter is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;End&lt;/code&gt;. It means no steps remain, and the step function finishes successfully.&lt;/p&gt;
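&lt;p&gt;To see how the three fragments fit together, the sketch below assembles them into one state machine definition and prints the JSON that would be handed to Step Functions. It is written in Python rather than Terraform purely for illustration; the constant values are placeholders standing in for the Terraform variables and are assumptions, not values from our setup.&lt;/p&gt;

```python
import json

# Placeholder values standing in for the Terraform variables (assumptions).
TASK_NAME = "example-task"
TABLE_NAME = "example-task-locks"
CLUSTER = "example-cluster"
TASK_DEFINITION = "example-task-definition"
TIMEOUT_MINUTES = 30

definition = {
    "Comment": "Task Lock",
    "StartAt": "CheckLock",
    "States": {
        "CheckLock": {
            "Type": "Task",
            "Resource": "arn:aws:states:::dynamodb:putItem",
            "Parameters": {
                "TableName": TABLE_NAME,
                "Item": {"Task": {"S": TASK_NAME}, "Locked": {"BOOL": True}},
                "ConditionExpression": "attribute_not_exists(Task)",
            },
            "Next": "RunTask",
        },
        "RunTask": {
            "Type": "Task",
            "Resource": "arn:aws:states:::ecs:runTask.sync",
            "Parameters": {"Cluster": CLUSTER, "TaskDefinition": TASK_DEFINITION},
            "TimeoutSeconds": TIMEOUT_MINUTES * 60,
            "Next": "RemoveLock",
        },
        "RemoveLock": {
            "Type": "Task",
            "Resource": "arn:aws:states:::dynamodb:deleteItem",
            "Parameters": {"TableName": TABLE_NAME, "Key": {"Task": {"S": TASK_NAME}}},
            "End": True,
        },
    },
}

# Sanity check: every transition points at a state that exists.
for name, state in definition["States"].items():
    target = state.get("Next")
    assert target is None or target in definition["States"], name

print(json.dumps(definition, indent=2))
```

&lt;p&gt;Note that the lock is written and deleted with the same table name and the same key, which is what makes the lock per task.&lt;/p&gt;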
  665.  
  666. &lt;h1 id=&quot;next--steps&quot;&gt;Next = “Steps”&lt;/h1&gt;
  667.  
  668. &lt;p&gt;In this post, we’ve worked on a simplified version of a task-locking Step Function. It is missing a couple of important sections that will allow tasks and operations to fail gracefully, such as the &lt;a href=&quot;https://docs.aws.amazon.com/step-functions/latest/dg/concepts-error-handling.html&quot;&gt;Catch and Retry&lt;/a&gt; sections. You will also need to define the permissions for the step function role and the task scheduling using CloudWatch. But those should be easy to figure out.&lt;/p&gt;
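&lt;p&gt;As one possible starting point for those missing sections, the Python sketch below adds Catch and Retry fields to an abbreviated copy of the states. The helper with_error_handling, the AlreadyRunning state, and the retry numbers are all hypothetical, and the DynamoDB error name is our assumption about what the service integration reports; check it against the AWS documentation before relying on it.&lt;/p&gt;

```python
import copy

# Abbreviated copy of the three steps; Resource and Parameters are omitted.
states = {
    "CheckLock": {"Type": "Task", "Next": "RunTask"},
    "RunTask": {"Type": "Task", "Next": "RemoveLock"},
    "RemoveLock": {"Type": "Task", "End": True},
}


def with_error_handling(original):
    """Return a copy of the States map with Catch and Retry fields added."""
    hardened = copy.deepcopy(original)

    # If the lock is already held, end quietly instead of failing the execution.
    hardened["CheckLock"]["Catch"] = [
        {
            "ErrorEquals": ["DynamoDB.ConditionalCheckFailedException"],
            "Next": "AlreadyRunning",
        }
    ]
    hardened["AlreadyRunning"] = {"Type": "Succeed"}

    # Release the lock even when the task fails or times out.
    hardened["RunTask"]["Catch"] = [
        {"ErrorEquals": ["States.ALL"], "ResultPath": "$.error", "Next": "RemoveLock"}
    ]

    # Retry the lock removal a few times on transient errors.
    hardened["RemoveLock"]["Retry"] = [
        {
            "ErrorEquals": ["States.ALL"],
            "IntervalSeconds": 5,
            "MaxAttempts": 3,
            "BackoffRate": 2.0,
        }
    ]
    return hardened


hardened_states = with_error_handling(states)
print(sorted(hardened_states))
```

&lt;p&gt;One trade-off of this shape: an execution whose task failed still ends in RemoveLock and is reported as successful, so a real version would likely route to a Fail state after releasing the lock.&lt;/p&gt;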
  669.  
  670. &lt;p&gt;An additional recommendation: encapsulate these configurations in a Terraform module that allows easy reuse and shared resources (such as the DynamoDB table).&lt;/p&gt;
  671.  
  672. &lt;h1 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h1&gt;
  673.  
  674. &lt;p&gt;Step Functions are an effective tool to integrate different services from AWS without writing code. The one we’ve defined for managing our scheduled tasks has run without issue, and its operation has been simple. We’ve found that Step Functions are a service we will keep close by in our dev toolbox.&lt;/p&gt;
  675. </description>
  676.    </item>
  677.    
  678.    
  679.    
  680.    <item>
  681.      <title>Runtime errors: Come again? Rust macros to the rescue.</title>
  682.      <link>https://tech.nextroll.com/blog/dev/2022/06/21/rust-lua.html</link>
  683.      <pubDate>Tue, 21 Jun 2022 00:00:00 -0700</pubDate>
  684.      <author></author>
  685.      <guid isPermaLink="false">https://tech.nextroll.com/blog/dev/2022/06/21/rust-lua</guid>
  686.      <description>&lt;p&gt;We admit that this article is blatant Rust propaganda. If you are rust-curious, this is the place for you.
  687. We want to show you how the benefits of using Rust go beyond the language itself and trickle down to your whole ecosystem.&lt;/p&gt;
  688.  
  689. &lt;p&gt;Dynamic languages can undoubtedly be fun and convenient.
  690. However, always being dynamic is not necessarily a good thing.
  691. The word on the street is that unit testing is driven by dynamic languages precisely because they do not have a static type system.
  692. We can eliminate whole classes of bugs by using a statically typed language, so we shall use one.&lt;/p&gt;
  693.  
  694. &lt;p&gt;Rust’s metaprogramming features allow us to be dynamic while preserving stability and safety.
  695. If you are unfamiliar with Rust macros and carry a macro-phobia over from C++, rest assured: the two are not the same.
  696. You should check them out &lt;a href=&quot;https://doc.rust-lang.org/book/ch19-06-macros.html&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
  697.  
  698. &lt;h1 id=&quot;background&quot;&gt;Background&lt;/h1&gt;
  699.  
  700. &lt;p&gt;We are using Aerospike (a distributed key-value store). It has a feature where you can register and execute UDFs (user-defined functions) written in Lua.
  701. The caveat is that Aerospike has some restrictions on the language, like reserved keywords, disallowing globals, and such.&lt;/p&gt;
  702.  
  703. &lt;p&gt;So even with a modern IDE and linter, you do not get proper validation until you have registered the code in your Aerospike node.
  704. You might say, “Hold on, what does Lua have to do with Rust?”
  705. Well, if you happen to use it from a Rust application, you can write a little macro that enforces the Aerospike rules on Lua and runs the Lua interpreter to catch the relevant errors at compile time.&lt;/p&gt;
  706.  
  707. &lt;h1 id=&quot;what-we-want-to-achieve&quot;&gt;What we want to achieve&lt;/h1&gt;
  708.  
  709. &lt;p&gt;At this point, we do not even need &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.lua&lt;/code&gt; files.&lt;/p&gt;
  710.  
  711. &lt;p&gt;Here is how it would look.
  712. Notice the Lua source code inside the macro invocation.&lt;/p&gt;
  713.  
  714. &lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;use&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;aerospike_code_gen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;define&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  715.  
  716. &lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  717.    &lt;span class=&quot;nd&quot;&gt;define!&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  718.      &lt;span class=&quot;n&quot;&gt;function&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;my_func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  719.      &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;
  720.    &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  721. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  722. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  723.  
  724. &lt;p&gt;Since &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;record&lt;/code&gt; is a reserved Aerospike identifier, we get a compilation error.&lt;/p&gt;
  725.  
  726. &lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;error&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;local&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;reserved&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;record&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;`&lt;/span&gt;
  727.  
  728. &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tests&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;py&quot;&gt;.rs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;24&lt;/span&gt;
  729.  &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt;
  730. &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt;       &lt;span class=&quot;n&quot;&gt;function&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;my_func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  731.  &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt;                        &lt;span class=&quot;o&quot;&gt;^^^^^^&lt;/span&gt;
  732.  &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt;
  733.  
  734. &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;note&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;this&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;error&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;originates&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;the&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;macro&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;define&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;`&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Nightly&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;builds&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;run&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;with&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;macro&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;backtrace&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;more&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  735. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  736.  
  737. &lt;h1 id=&quot;development&quot;&gt;Development&lt;/h1&gt;
  738.  
  739. &lt;p&gt;How can we make this work?
  740. We need to create a Rust proc macro project and define our dependencies.&lt;/p&gt;
  741.  
  742. &lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dependencies&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  743. &lt;span class=&quot;n&quot;&gt;quote&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;1&quot;&lt;/span&gt;
  744. &lt;span class=&quot;n&quot;&gt;proc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;macro2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;1.0&quot;&lt;/span&gt;
  745. &lt;span class=&quot;n&quot;&gt;syn&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;1.0&quot;&lt;/span&gt;
  746. &lt;span class=&quot;n&quot;&gt;rlua&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;0.19.1&quot;&lt;/span&gt;
  747. &lt;span class=&quot;n&quot;&gt;luaparse&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;0.2.0&quot;&lt;/span&gt;
  748. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  749.  
  750. &lt;p&gt;We need &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rlua&lt;/code&gt; to run the code through the Lua interpreter, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;luaparse&lt;/code&gt; to parse it so that we can enforce the Aerospike requirements.&lt;/p&gt;
  751.  
  752. &lt;p&gt;The other dependencies are the standard crates one would use to build procedural macros.&lt;/p&gt;
  753.  
  754. &lt;p&gt;Now it is only a matter of parsing the macro input with the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;luaparse&lt;/code&gt; crate and recursively checking the definitions.&lt;/p&gt;
  755.  
  756. &lt;p&gt;As Linus Torvalds put it: “Talk is cheap. Show me the code.”&lt;/p&gt;
  757.  
  758. &lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nd&quot;&gt;#[proc_macro]&lt;/span&gt;
  759. &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;define&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;input&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TokenStream&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TokenStream&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  760.    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;input&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.to_string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  761.    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lua_err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  762.  
  763.    &lt;span class=&quot;nn&quot;&gt;Lua&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.context&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lua&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  764.        &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;chunk&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lua&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  765.        &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;chunk&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.exec&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  766.        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;Err&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  767.            &lt;span class=&quot;n&quot;&gt;lua_err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;Some&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.to_string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;());&lt;/span&gt;
  768.        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  769.    &lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;
  770.  
  771.    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;Some&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lua_err&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  772.        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;SynError&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;Span&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;call_site&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  773.            &lt;span class=&quot;nf&quot;&gt;.into_compile_error&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  774.            &lt;span class=&quot;nf&quot;&gt;.into&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  775.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  776. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  777. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  778.  
  779. &lt;p&gt;This was easy enough, and it will give us the errors from the Lua interpreter.&lt;/p&gt;
  780.  
  781. &lt;p&gt;The next step involves a little more work: validating the Aerospike requirements.&lt;/p&gt;
  782.  
  783. &lt;p&gt;First, we create a function that will be called from the macro.&lt;/p&gt;
  784.  
  785. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;luaparse::parse&lt;/code&gt; is going to give us the entire AST.&lt;/p&gt;
  786.  
  787. &lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;validate_aerospike&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;Vec&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  788.    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;errs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;vec!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[];&lt;/span&gt;
  789.  
  790.    &lt;span class=&quot;k&quot;&gt;match&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;luaparse&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;parse&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  791.        &lt;span class=&quot;nf&quot;&gt;Ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  792.            &lt;span class=&quot;nf&quot;&gt;loop_statements&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;py&quot;&gt;.statements&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;errs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  793.        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  794.  
  795.        &lt;span class=&quot;nf&quot;&gt;Err&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;panic!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;{:#}&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;LError&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.span&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.with_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)),&lt;/span&gt;
  796.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  797.  
  798.    &lt;span class=&quot;n&quot;&gt;errs&lt;/span&gt;
  799. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  800. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  801.  
  802. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;loop_statements&lt;/code&gt; is where our real work begins.&lt;/p&gt;
  803.  
  804. &lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;loop_statements&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stmts&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;Vec&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Statement&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;errors&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;Vec&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is_global&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;bool&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  805.    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;statement&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stmts&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  806.        &lt;span class=&quot;nf&quot;&gt;recurse&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;statement&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;errors&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is_global&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  807.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  808. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  809. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  810.  
  811. &lt;p&gt;We need to go through the entire tree. For your sanity, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;match&lt;/code&gt; statement in this function is shortened for this example.&lt;/p&gt;
  812.  
  813. &lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;recurse&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stmt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Statement&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;errors&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;Vec&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is_global&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;bool&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  814.    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;allow_vars&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  815.  
  816.    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is_global&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  817.        &lt;span class=&quot;n&quot;&gt;is_global&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  818.        &lt;span class=&quot;n&quot;&gt;allow_vars&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  819.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  820.  
  821.    &lt;span class=&quot;k&quot;&gt;match&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stmt&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  822.        &lt;span class=&quot;nn&quot;&gt;Statement&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;FunctionDeclaration&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;match&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  823.            &lt;span class=&quot;nn&quot;&gt;FunctionDeclarationStat&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Local&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;body&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;..&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  824.                &lt;span class=&quot;nf&quot;&gt;validate_func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;body&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;errors&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is_global&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  825.            &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  826.  
  827.            &lt;span class=&quot;nn&quot;&gt;FunctionDeclarationStat&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Nonlocal&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;body&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;..&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  828.                &lt;span class=&quot;nf&quot;&gt;validate_func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;body&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;errors&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is_global&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  829.            &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  830.        &lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  831.  
  832.        &lt;span class=&quot;nn&quot;&gt;Statement&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;LocalDeclaration&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ld&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  833.            &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;names&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ld&lt;/span&gt;&lt;span class=&quot;py&quot;&gt;.names.pairs&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.iter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;.0&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.clone&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.collect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  834.            &lt;span class=&quot;nf&quot;&gt;validate_names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;errors&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;allow_vars&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  835.        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  836.  
  837.        &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;
  838.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  839. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  840. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
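&lt;p&gt;To see the traversal in isolation, here is a minimal sketch over a hypothetical miniature AST. The &lt;code&gt;Stmt&lt;/code&gt; enum and the &lt;code&gt;walk&lt;/code&gt; function are assumptions made for illustration and are not part of &lt;code&gt;luaparse&lt;/code&gt;. Only the idea is the same: declarations at the top level are flagged, and anything nested inside a function body is visited with the flag cleared.&lt;/p&gt;

```rust
// Hypothetical miniature AST for illustration only.
// The real syntax tree comes from the `luaparse` crate and is much richer.
enum Stmt {
    /// A variable declaration such as `local a, b = ...`.
    Local(Vec<String>),
    /// A function declaration; only its body statements are kept here.
    Function(Vec<Stmt>),
}

/// Walks the statements recursively and records an error for every
/// declaration found at the top level (`is_global == true`).
fn walk(stmts: &[Stmt], errors: &mut Vec<String>, is_global: bool) {
    for stmt in stmts {
        match stmt {
            Stmt::Local(names) => {
                if is_global {
                    for name in names {
                        errors.push(format!(
                            "global variables are not allowed: `{}`",
                            name
                        ));
                    }
                }
            }
            // Everything inside a function body is no longer top level.
            Stmt::Function(body) => walk(body, errors, false),
        }
    }
}
```

&lt;p&gt;Calling &lt;code&gt;walk&lt;/code&gt; with &lt;code&gt;is_global&lt;/code&gt; set to &lt;code&gt;true&lt;/code&gt; on a program that declares one variable at the top level and another inside a function should report only the first one.&lt;/p&gt;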
  841.  
  842. &lt;p&gt;Here is where we check for the reserved Aerospike names.&lt;/p&gt;
  843.  
  844. &lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;AEROSPIKE_NAMES&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;9&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  845.    &lt;span class=&quot;s&quot;&gt;&quot;record&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  846.    &lt;span class=&quot;s&quot;&gt;&quot;map&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  847.    &lt;span class=&quot;s&quot;&gt;&quot;list&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  848.    &lt;span class=&quot;s&quot;&gt;&quot;aerospike&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  849.    &lt;span class=&quot;s&quot;&gt;&quot;bytes&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  850.    &lt;span class=&quot;s&quot;&gt;&quot;geojson&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  851.    &lt;span class=&quot;s&quot;&gt;&quot;iterator&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  852.    &lt;span class=&quot;s&quot;&gt;&quot;list&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  853.    &lt;span class=&quot;s&quot;&gt;&quot;stream&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  854. &lt;span class=&quot;p&quot;&gt;];&lt;/span&gt;
  855.  
  856. &lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;is_reserved&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;bool&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  857.    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;aerospike_name&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;AEROSPIKE_NAMES&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  858.        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;aerospike_name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  859.            &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  860.        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  861.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  862.  
  863.    &lt;span class=&quot;k&quot;&gt;false&lt;/span&gt;
  864. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  865.  
  866. &lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;validate_names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;Vec&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;errors&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;Vec&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;allow_vars&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;bool&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  867.    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;param&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;names&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  868.        &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;param&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.to_string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  869.  
  870.        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;is_reserved&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  871.            &lt;span class=&quot;n&quot;&gt;errors&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.push&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nd&quot;&gt;format!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  872.                &lt;span class=&quot;s&quot;&gt;&quot;aerospike reserved identifier: `{}`. consider renaming your variable&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  873.                &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;
  874.            &lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
  875.        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  876.  
  877.        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;allow_vars&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  878.            &lt;span class=&quot;n&quot;&gt;errors&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.push&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nd&quot;&gt;format!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;global variables are not allowed: `{}`&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
  879.        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  880.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  881. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  882.  
  883. &lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;validate_func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;body&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FunctionBody&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;errors&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;Vec&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is_global&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;bool&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  884.    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;names&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;body&lt;/span&gt;&lt;span class=&quot;py&quot;&gt;.params.list.pairs&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.iter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;.0&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.clone&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.collect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  885.    &lt;span class=&quot;nf&quot;&gt;validate_names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;errors&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  886.    &lt;span class=&quot;nf&quot;&gt;loop_statements&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;body&lt;/span&gt;&lt;span class=&quot;py&quot;&gt;.block.statements&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;errors&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is_global&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  887. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  888. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  889.  
  890. &lt;p&gt;You can view a complete working example &lt;a href=&quot;https://github.com/Adroll/aerospike-code-gen&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
  891.  
  892. &lt;p&gt;This tremendously improves the development process, as one does not need to run code to register it on the Aerospike cluster to see an error that should have been caught at compile time.&lt;/p&gt;
  893.  
  894. &lt;h1 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h1&gt;
  895.  
  896. &lt;p&gt;We all know the Rust trifecta - safe, fast and concurrent.&lt;/p&gt;
  897.  
  898. &lt;p&gt;Rust allows us to write the code we want while preserving those three inherent qualities.&lt;/p&gt;
  899.  
  900. &lt;p&gt;However, there is no free lunch in this world.&lt;/p&gt;
  901.  
  902. &lt;p&gt;While compilation speed has improved considerably, adding more macros will lengthen build times, especially in a large project.&lt;/p&gt;
  903. </description>
  904.    </item>
  905.    
  906.    
  907.    
  908.    <item>
  909.      <title>Embracing ARM</title>
  910.      <link>https://tech.nextroll.com/blog/dev/2022/05/16/embracing_arm.html</link>
  911.      <pubDate>Mon, 16 May 2022 00:00:00 -0700</pubDate>
  912.      <author></author>
  913.      <guid isPermaLink="false">https://tech.nextroll.com/blog/dev/2022/05/16/embracing_arm</guid>
  914.      <description>&lt;center style=&quot;margin-bottom: 30px&quot;&gt;
  915.    &lt;img alt=&quot;man hugging computer&quot; src=&quot;/images/post_heroes/embracing.jpg&quot; /&gt;&lt;br /&gt;
  916. &lt;/center&gt;
  917.  
  918. &lt;h1 id=&quot;introduction&quot;&gt;Introduction&lt;/h1&gt;
  919.  
  920. &lt;p&gt;From AWS Graviton instances to M1 MacBooks, ARM processors are everywhere these days. This blog post illustrates the experiences, challenges, and solutions encountered in the Data Science Engineering team at NextRoll as we have come to embrace the ARM ecosystem.&lt;/p&gt;
  921.  
  922. &lt;h1 id=&quot;docker--containerization&quot;&gt;Docker / Containerization&lt;/h1&gt;
  923.  
  924. &lt;p&gt;For many years, the Data Science Engineering team has used Docker to create a consistent running environment for our code: once code works on one developer’s laptop, it should work on their teammates’ laptops, in our CI test pipelines, and when deployed to our cloud infrastructure.&lt;/p&gt;
  925.  
  926. &lt;p&gt;Everything worked well until recently, when a developer on our team upgraded their work laptop to an M1 MacBook (which uses an ARM CPU architecture). Suddenly, many of our images would not build or run on their laptop, preventing them from developing locally.&lt;/p&gt;
  927.  
  928. &lt;p&gt;The developer could have worked around this, e.g., by developing remotely on an X86 machine, but more team members would be upgrading their laptops over time. So as a team, we decided to make our Docker images compatible with ARM.&lt;/p&gt;
  929.  
  930. &lt;h1 id=&quot;first-approach&quot;&gt;First Approach&lt;/h1&gt;
  931.  
  932. &lt;p&gt;The first approach we tried was to use simple emulation: Docker provides an emulation layer via &lt;a href=&quot;https://wiki.qemu.org/Main_Page&quot;&gt;QEMU&lt;/a&gt;, which should allow a developer on an ARM laptop to build and run a docker image targeted for X86.&lt;/p&gt;
  933.  
  934. &lt;p&gt;There were some images where this worked (albeit with a performance hit), but other images failed to run entirely. Of particular note, in our &lt;a href=&quot;https://github.com/AdRoll/fledge_lab&quot;&gt;Fledge Lab&lt;/a&gt; project, headful Chrome would immediately segfault when run in an X86 docker container on an ARM host machine.&lt;/p&gt;
  935.  
  936. &lt;h1 id=&quot;solution-multi-arch-images&quot;&gt;Solution: Multi-Arch images&lt;/h1&gt;
  937.  
  938. &lt;p&gt;For cases where emulation failed, we went the route of multi-arch images: we build both an X86 and an ARM version of each image. This led to several challenges.&lt;/p&gt;
  939.  
  940. &lt;h2 id=&quot;differing-build-steps&quot;&gt;Differing Build Steps&lt;/h2&gt;
  941.  
  942. &lt;p&gt;Some packages or tools we installed in the image required different build steps in our Dockerfile. We found the most elegant way to address this was via &lt;a href=&quot;https://www.docker.com/blog/introduction-to-heredocs-in-dockerfiles/&quot;&gt;RUN heredocs&lt;/a&gt;: We use one RUN heredoc per package/tool we install and switch on the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;$TARGETPLATFORM&lt;/code&gt; variable provided by Docker.&lt;/p&gt;
  943.  
  944. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;# Install a given tool/package (needs a prior &quot;ARG TARGETPLATFORM&quot; in this build stage)
  945. RUN &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;EOF&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;
  946.  if [ &quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$TARGETPLATFORM&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot; = &apos;linux/arm64&apos; ]
  947.  then
  948.    : # Build steps for ARM go here (the no-op keeps the empty branch valid)
  949.  else
  950.    : # Build steps for X86 go here
  951.  fi
  952. EOF&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  953.  
  954. &lt;h2 id=&quot;code-compatibility-issues&quot;&gt;Code Compatibility Issues&lt;/h2&gt;
  955.  
  956. &lt;p&gt;We ran into several issues getting code optimized for X86 to compile and run on ARM.&lt;/p&gt;
  957.  
  958. &lt;p&gt;We had some places in our code where we used X86-specific instructions, including X86 inline assembly and Intel intrinsics. We replaced these with a mix of LLVM IR code, which is architecture agnostic, and code sections where we switch on the target platform.&lt;/p&gt;
  959.  
  960. &lt;p&gt;We also ran into a compatibility issue using epoll: on X86 machines, the epoll_event struct is expected to be packed, whereas on ARM machines it is not. This led to odd runtime behavior where threads would try to read data from the wrong client connection, which took a while to debug. Once we diagnosed the issue, the fix was simple: we just needed to switch the struct alignment based on the machine architecture.&lt;/p&gt;
  961.  
  962. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-dlang&quot; data-lang=&quot;dlang&quot;&gt;&lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;EpollEvent&lt;/span&gt;
  963. &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  964.    &lt;span class=&quot;kt&quot;&gt;uint&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;events&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;EPOLLIN&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;EPOLLONESHOT&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;EPOLLRDHUP&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  965. &lt;span class=&quot;k&quot;&gt;version&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;X86_64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  966.    &lt;span class=&quot;k&quot;&gt;align&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;ulong&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// User data, packed&lt;/span&gt;
  967. &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;
  968.    &lt;span class=&quot;kt&quot;&gt;ulong&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;// User data, not packed&lt;/span&gt;
  969. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
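
&lt;p&gt;To make the layout difference concrete, here is a minimal sketch written in Rust rather than D. It is an illustration only, not code from the project: the names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;PackedEvent&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NaturalEvent&lt;/code&gt; are hypothetical stand-ins for the two layouts of epoll_event, and the offsets assume a 64-bit target where an 8-byte integer is 8-byte aligned. It needs Rust 1.77 or later for the offset_of macro.&lt;/p&gt;

```rust
use std::mem::offset_of;

// Hypothetical stand-in for the packed layout that Linux uses on x86-64:
// the 64-bit user data follows the 32-bit event mask with no padding.
#[allow(dead_code)]
#[repr(C, packed)]
struct PackedEvent {
    events: u32,
    data: u64,
}

// Hypothetical stand-in for the natural C layout used on ARM64:
// padding is inserted so that the 64-bit field is 8-byte aligned.
#[allow(dead_code)]
#[repr(C)]
struct NaturalEvent {
    events: u32,
    data: u64,
}

// Byte offset of the user data in the packed layout.
fn packed_data_offset() -> usize {
    offset_of!(PackedEvent, data)
}

// Byte offset of the user data in the natural layout.
fn natural_data_offset() -> usize {
    offset_of!(NaturalEvent, data)
}

fn main() {
    println!("packed layout: data starts at byte {}", packed_data_offset());
    println!("natural layout: data starts at byte {}", natural_data_offset());
}
```

&lt;p&gt;If the program and the kernel disagree about where the user data starts, the ID read back from an event is assembled from the wrong bytes, which is consistent with the wrong-connection symptom described above.&lt;/p&gt;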
  970.  
  971. &lt;h2 id=&quot;memory-model-issues&quot;&gt;Memory Model Issues&lt;/h2&gt;
  972.  
  973. &lt;p&gt;One of the most subtle issues we ran into was the difference between the X86 and ARM memory models. There is an excellent introduction to memory models and code reordering &lt;a href=&quot;https://preshing.com/20120612/an-introduction-to-lock-free-programming/&quot;&gt;here&lt;/a&gt;, but the rough idea is as follows:&lt;/p&gt;
  974.  
  975. &lt;p&gt;Code that is written in a high-level language (Dlang in our case) can be reordered in many places:&lt;/p&gt;
  976. &lt;ul&gt;
  977.  &lt;li&gt;At compile time: The compiler outputs assembly in a different order than the original code.&lt;/li&gt;
  978.  &lt;li&gt;At runtime: The CPU executes the assembly/machine instructions out of order.&lt;/li&gt;
  979. &lt;/ul&gt;
  980.  
  981. &lt;p&gt;The CPU’s memory model determines what sort of reorderings are allowed at runtime. X86 has a strong memory model that prevents many reorderings of memory access instructions. ARM has a weak memory model, which gives the CPU much more liberty to rearrange memory access instructions.&lt;/p&gt;
  982.  
  983. &lt;p&gt;To achieve optimal performance for our ad pricing engine, we relax the memory ordering constraints for atomic operations in our multi-threaded code in several places (e.g., in collecting statistics for the 100k ads we price per second). When targeting ARM, we needed to audit our code to ensure we were not implicitly relying on the stronger memory guarantees of X86, lest we introduce a subtle race condition. (As an interesting aside, Apple sidestepped this issue with Rosetta 2 by letting it use &lt;a href=&quot;https://twitter.com/ErrataRob/status/1331735383193903104&quot;&gt;X86 memory-ordering&lt;/a&gt;.)&lt;/p&gt;
  984.  
  985. &lt;h3 id=&quot;a-concrete-example&quot;&gt;A Concrete Example&lt;/h3&gt;
  986.  
  987. &lt;p&gt;The following example shows how the different memory models of X86 and ARM can lead to different behavior at runtime.&lt;/p&gt;
  988.  
  989. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-dlang&quot; data-lang=&quot;dlang&quot;&gt;&lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;core&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;atomic&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  990. &lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;core&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;thread&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  991.  
  992. &lt;span class=&quot;k&quot;&gt;align&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;128&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;__gshared&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;bool&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mutex1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  993. &lt;span class=&quot;k&quot;&gt;align&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;128&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;__gshared&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;bool&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mutex2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  994.  
  995. &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;do_work&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  996. &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  997.    &lt;span class=&quot;k&quot;&gt;while&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  998.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  999.        &lt;span class=&quot;k&quot;&gt;while&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;atomicExchange&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mutex1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)){}&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// block for mutex1&lt;/span&gt;
  1000.        &lt;span class=&quot;k&quot;&gt;while&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;atomicExchange&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mutex2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)){}&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// block for mutex2&lt;/span&gt;
  1001.        &lt;span class=&quot;n&quot;&gt;mutex2&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// release mutex2&lt;/span&gt;
  1002.        &lt;span class=&quot;n&quot;&gt;mutex1&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// release mutex1&lt;/span&gt;
  1003.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  1004. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  1005.  
  1006. &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  1007. &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  1008.    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;thread&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Thread&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;do_work&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  1009.    &lt;span class=&quot;n&quot;&gt;thread&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  1010.  
  1011.    &lt;span class=&quot;k&quot;&gt;while&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  1012.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  1013.        &lt;span class=&quot;k&quot;&gt;while&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;atomicExchange&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mutex1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)){}&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// block for mutex1&lt;/span&gt;
  1014.        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;atomicLoad&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mutex2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  1015.            &lt;span class=&quot;k&quot;&gt;assert&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  1016.        &lt;span class=&quot;n&quot;&gt;atomicStore&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mutex1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// release mutex1&lt;/span&gt;
  1017.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  1018. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1019.  
  1020. &lt;p&gt;Here a worker thread acquires two mutexes in a nested manner, while the main thread acquires &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mutex1&lt;/code&gt; and asserts that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mutex2&lt;/code&gt; is not held, since that would violate the nesting. When I compile and run this on a pre-M1 MacBook (X86), it runs indefinitely, whereas when I compile and run it on an M1 MacBook (ARM) it hits the assertion almost immediately.&lt;/p&gt;
  1021.  
  1022. &lt;p&gt;What’s going on? The issue lies in the lines&lt;/p&gt;
  1023.  
  1024. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-dlang&quot; data-lang=&quot;dlang&quot;&gt;&lt;span class=&quot;n&quot;&gt;mutex2&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// release mutex2&lt;/span&gt;
  1025. &lt;span class=&quot;n&quot;&gt;mutex1&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// release mutex1&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1026.  
  1027. &lt;p&gt;which compile to&lt;/p&gt;
  1028.  
  1029. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-asm&quot; data-lang=&quot;asm&quot;&gt;strb    wzr, [x10]
  1030. strb    wzr, [x8]&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1031.  
  1032. &lt;p&gt;and&lt;/p&gt;
  1033.  
  1034. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-asm&quot; data-lang=&quot;asm&quot;&gt;mov     byte ptr [rcx], 0
  1035. mov     byte ptr [rax], 0&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1036.  
  1037. &lt;p&gt;on &lt;a href=&quot;https://godbolt.org/z/s67Mc5add&quot;&gt;ARMv8&lt;/a&gt; and &lt;a href=&quot;https://godbolt.org/z/YeMoETjnM&quot;&gt;X86&lt;/a&gt; respectively.&lt;/p&gt;
  1038.  
  1039. &lt;p&gt;Even though the compiler emitted these stores in program order, the ARM memory model allows them to become visible to other cores out of order, whereas the X86 memory model prevents the reordering of these stores. To fix this, we can instead use&lt;/p&gt;
  1040.  
  1041. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-dlang&quot; data-lang=&quot;dlang&quot;&gt;&lt;span class=&quot;n&quot;&gt;atomicStore&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mutex2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// release mutex2&lt;/span&gt;
  1042. &lt;span class=&quot;n&quot;&gt;atomicStore&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mutex1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// release mutex1&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1043.  
  1044. &lt;p&gt;which compiles to&lt;/p&gt;
  1045.  
  1046. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-asm&quot; data-lang=&quot;asm&quot;&gt;stlrb   wzr, [x10]
  1047. stlrb   wzr, [x8]&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1048.  
  1049. &lt;p&gt;and&lt;/p&gt;
  1050.  
  1051. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-asm&quot; data-lang=&quot;asm&quot;&gt;xor     edx, edx
  1052. xchg    byte ptr [rcx], dl
  1053. xor     edx, edx
  1054. xchg    byte ptr [rax], dl&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1055.  
  1056. &lt;p&gt;on &lt;a href=&quot;https://godbolt.org/z/9sbf6YKfb&quot;&gt;ARMv8&lt;/a&gt; and &lt;a href=&quot;https://godbolt.org/z/44bdYY3KG&quot;&gt;X86&lt;/a&gt; respectively. On ARM, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;strb&lt;/code&gt; instructions are now &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;stlrb&lt;/code&gt; instructions. These cannot be reordered with each other (each is a store-release, a one-way fence that keeps earlier memory accesses from moving below it), so the intended nesting is guaranteed.&lt;/p&gt;
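
&lt;p&gt;The same acquire/release discipline can be sketched in Rust, where the ordering of every atomic operation is spelled out explicitly. This is a hypothetical illustration and not the D code above: the function names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;worker&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;checker&lt;/code&gt; and the round counts are assumptions, and the loops are bounded so that the program terminates.&lt;/p&gt;

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

// Two spin locks; true means "held".
static MUTEX1: AtomicBool = AtomicBool::new(false);
static MUTEX2: AtomicBool = AtomicBool::new(false);

// Takes mutex1 and then mutex2, and releases them in reverse order.
fn worker(rounds: u32) {
    for _ in 0..rounds {
        while MUTEX1.swap(true, Ordering::Acquire) {
            std::hint::spin_loop();
        }
        while MUTEX2.swap(true, Ordering::Acquire) {
            std::hint::spin_loop();
        }
        // Release stores: the store to MUTEX2 cannot become visible
        // after the store to MUTEX1 that follows it.
        MUTEX2.store(false, Ordering::Release);
        MUTEX1.store(false, Ordering::Release);
    }
}

// Takes mutex1 and counts how often mutex2 is seen held at that point.
fn checker(rounds: u32) -> u64 {
    let mut violations = 0;
    for _ in 0..rounds {
        while MUTEX1.swap(true, Ordering::Acquire) {
            std::hint::spin_loop();
        }
        // The acquire above pairs with the worker's release of MUTEX1,
        // so the earlier release of MUTEX2 is visible here.
        if MUTEX2.load(Ordering::Relaxed) {
            violations += 1;
        }
        MUTEX1.store(false, Ordering::Release);
    }
    violations
}

fn main() {
    let handle = thread::spawn(|| worker(100_000));
    let violations = checker(100_000);
    handle.join().unwrap();
    assert_eq!(violations, 0);
    println!("nesting violations observed: {}", violations);
}
```

&lt;p&gt;With release stores paired with acquire swaps, the checker should never observe a violation on either architecture. Safe Rust does not allow plain, non-atomic writes to a shared static, so the buggy variant from the D example has no direct equivalent in this sketch.&lt;/p&gt;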
  1057.  
  1058. &lt;p&gt;Even for X86 I believe the latter code is more correct. Even though the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mov&lt;/code&gt; instructions are replaced with heavier &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;xchg&lt;/code&gt; instructions, we are expressing our intention that these lines not be rearranged in the dlang code itself and leaving it to the compiler to ensure our intentions are met. In particular, in the former code, I think the compiler could have rearranged the order of these two lines, even before the runtime ordering guarantees of X86.&lt;/p&gt;
  1059.  
  1060. &lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
  1061.  
  1062. &lt;p&gt;While it took some work (and a fair bit of debugging), we successfully made all of our Data Science Engineering images compatible with both X86 and ARM. And along the way we got to learn some of the subtle differences between the two CPU architectures! If optimizing machine learning systems at a low level catches your interest, check out our &lt;a href=&quot;https://www.nextroll.com/careers&quot;&gt;careers page&lt;/a&gt;.&lt;/p&gt;
  1063. </description>
  1064.    </item>
  1065.    
  1066.    
  1067.    
  1068.    <item>
  1069.      <title>Introducing Rebar3 TypEr</title>
  1070.      <link>https://tech.nextroll.com/blog/dev/2022/04/27/rebar3-typer.html</link>
  1071.      <pubDate>Wed, 27 Apr 2022 00:00:00 -0700</pubDate>
  1072.      <author></author>
  1073.      <guid isPermaLink="false">https://tech.nextroll.com/blog/dev/2022/04/27/rebar3-typer</guid>
  1074.      <description>&lt;center style=&quot;margin-bottom: 30px&quot;&gt;
  1075.    &lt;img alt=&quot;a typewriter&quot; src=&quot;/images/post_heroes/typewriter.jpg&quot; /&gt;&lt;br /&gt;
  1076.    &lt;i&gt;Introducing Rebar3 TypEr&lt;/i&gt;
  1077. &lt;/center&gt;
  1078.  
  1079. &lt;p&gt;Do you have an Erlang codebase that’s a little short on type specs? Maybe you know &lt;a href=&quot;https://www.erlang.org/doc/man/typer.html&quot;&gt;TypEr&lt;/a&gt; exists, but wouldn’t it be nice to:&lt;/p&gt;
  1080.  
  1081. &lt;ul&gt;
  1082.  &lt;li&gt;incorporate it into your build system?&lt;/li&gt;
  1083.  &lt;li&gt;configure it only once?&lt;/li&gt;
  1084.  &lt;li&gt;have it ✨ magically✨ figure out all the command line arguments for you?&lt;/li&gt;
  1085. &lt;/ul&gt;
  1086.  
  1087. &lt;h2 id=&quot;good-news&quot;&gt;Good news&lt;/h2&gt;
  1088.  
  1089. &lt;p&gt;We built that! It’s called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3_typer&lt;/code&gt;, and it (as the name implies) plugs into rebar3. With that rebar3 integration comes understanding the rest of your configuration. It’s &lt;a href=&quot;https://hex.pm/packages/rebar3_typer&quot;&gt;available on hex.pm&lt;/a&gt;.&lt;/p&gt;
  1090.  
  1091. &lt;p&gt;One of the first things we had to do was refactor the original upstream TypEr a bit. TypEr was always intended to be run from the command line. It didn’t have (or need) an API available from within the BEAM. Well, until now. We needed that. So, we split out a “core” module &lt;a href=&quot;https://github.com/erlang/otp/pull/5660&quot;&gt;and sent that upstream&lt;/a&gt;. That’ll be in OTP 25.&lt;/p&gt;
  1092.  
  1093. &lt;p&gt;On that note, &lt;em&gt;contributions welcome&lt;/em&gt;. You can &lt;a href=&quot;https://github.com/AdRoll/rebar3_typer/issues&quot;&gt;file bugs with us&lt;/a&gt;, &lt;a href=&quot;https://github.com/AdRoll/rebar3_typer/pulls&quot;&gt;submit improvements to our repo&lt;/a&gt; (improvements to the hexdocs especially welcome), or &lt;a href=&quot;https://github.com/erlang/otp&quot;&gt;help out upstream&lt;/a&gt;.&lt;/p&gt;
  1094.  
  1095. &lt;p&gt;We have to thank Viacheslav Katsuba for his &lt;a href=&quot;https://github.com/vkatsuba/rebar3_plugin&quot;&gt;rebar3_plugin template&lt;/a&gt;. It was very helpful for getting us a framework to start from so we could dive in. And of course, thanks to &lt;a href=&quot;https://github.com/kostis&quot;&gt;Kostis Sagonas&lt;/a&gt; for all his work on Dialyzer and TypEr.&lt;/p&gt;
  1096.  
  1097. &lt;h2 id=&quot;how-to-use-rebar3_typer&quot;&gt;How to use rebar3_typer&lt;/h2&gt;
  1098.  
  1099. &lt;p&gt;To get up and running quickly, add this to your &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar.config&lt;/code&gt; (either in your project or globally in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;~/.config/rebar3/rebar.config&lt;/code&gt;):&lt;/p&gt;
  1100.  
  1101. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;plugins&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rebar3_typer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]}.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1102.  
  1103. &lt;p&gt;Then run…&lt;/p&gt;
  1104.  
  1105. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;rebar3 dialyzer
  1106. rebar3 typer&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1107.  
  1108. &lt;p&gt;…and wait.&lt;/p&gt;
  1109.  
  1110. &lt;p&gt;Without further configuration, it will:&lt;/p&gt;
  1111.  
  1112. &lt;ul&gt;
  1113.  &lt;li&gt;look for source files in the usual places&lt;/li&gt;
  1114.  &lt;li&gt;look for source files wherever they’ve been configured in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar.config&lt;/code&gt; for the project in general&lt;/li&gt;
  1115.  &lt;li&gt;look for Dialyzer’s PLT file in the usual place&lt;/li&gt;
  1116.  &lt;li&gt;look for Dialyzer’s PLT file wherever it’s been configured in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar.config&lt;/code&gt; for the project in general&lt;/li&gt;
  1117.  &lt;li&gt;emit a list of all the specs you can add to those modules, based on the types inferred by Dialyzer&lt;/li&gt;
  1118. &lt;/ul&gt;
  1119.  
  1120. &lt;p&gt;&lt;em&gt;Or&lt;/em&gt; if you want to take advantage of our enhancement that automatically annotates your source code directly for you, try:&lt;/p&gt;
  1121.  
  1122. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;rebar3 dialyzer
  1123. rebar3 typer &lt;span class=&quot;nt&quot;&gt;--annotate-in-place&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1124.  
  1125. &lt;p&gt;(Yes, this new feature has been &lt;a href=&quot;https://github.com/erlang/otp/pull/5802&quot;&gt;sent upstream&lt;/a&gt;.)&lt;/p&gt;
  1126.  
  1127. &lt;h2 id=&quot;options&quot;&gt;Options&lt;/h2&gt;
  1128.  
  1129. &lt;p&gt;All options are available on the command line or in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar.config&lt;/code&gt; file. The command-line options always override the options in the config file.&lt;/p&gt;
  1130.  
  1131. &lt;p&gt;The basic config starts out:&lt;/p&gt;
  1132.  
  1133. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;typer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]}.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1134.  
  1135. &lt;h3 id=&quot;choose-the-files&quot;&gt;Choose the files&lt;/h3&gt;
  1136.  
  1137. &lt;p&gt;If source file auto-detection doesn’t work for your situation, you’ll need to specify either a list of files (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-f&lt;/code&gt;) or a list of directories (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-r&lt;/code&gt;) (or both!) to be analyzed.&lt;/p&gt;
  1138.  
  1139. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;rebar3 typer &lt;span class=&quot;nt&quot;&gt;-r&lt;/span&gt; d1,d2 &lt;span class=&quot;nt&quot;&gt;-f&lt;/span&gt; foo.erl&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1140.  
  1141. &lt;p&gt;Or&lt;/p&gt;
  1142.  
  1143. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;typer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  1144.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;recursive&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;d1/&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;d2/&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]},&lt;/span&gt;
  1145.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;files&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;foo.erl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]}&lt;/span&gt;
  1146. &lt;span class=&quot;p&quot;&gt;]}.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1147.  
  1148. &lt;h3 id=&quot;provide-the-plt&quot;&gt;Provide the PLT&lt;/h3&gt;
  1149.  
  1150. &lt;p&gt;If &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3_typer&lt;/code&gt; can’t find your PLT file (and you’re sure it’s already been generated), you can give it some hints.&lt;/p&gt;
  1151.  
  1152. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;rebar3 typer &lt;span class=&quot;nt&quot;&gt;--plt&lt;/span&gt; myfile.plt&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1153.  
  1154. &lt;p&gt;Or&lt;/p&gt;
  1155.  
  1156. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;typer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  1157.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;plt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;/path/to/plt&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  1158. &lt;span class=&quot;p&quot;&gt;]}.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1159.  
  1160. &lt;h3 id=&quot;show-or-annotate&quot;&gt;Show or annotate&lt;/h3&gt;
  1161.  
  1162. &lt;p&gt;Do you want it to only print the type specs to stdout, or do you want it to put its output in files?&lt;/p&gt;
  1163.  
  1164. &lt;p&gt;If you want them printed to stdout, do you want it for all functions?&lt;/p&gt;
  1165.  
  1166. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;rebar3 typer &lt;span class=&quot;nt&quot;&gt;--show&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1167.  
  1168. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;typer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  1169.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;show&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  1170. &lt;span class=&quot;p&quot;&gt;]}.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1171.  
  1172. &lt;p&gt;Or maybe only for exported (public) functions?&lt;/p&gt;
  1173.  
  1174. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;rebar3 typer &lt;span class=&quot;nt&quot;&gt;--show-exported&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1175.  
  1176. &lt;p&gt;(&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--show_exported&lt;/code&gt; is also accepted)&lt;/p&gt;
  1177.  
  1178. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;typer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  1179.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;show_exported&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  1180. &lt;span class=&quot;p&quot;&gt;]}.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1181.  
  1182. &lt;p&gt;If you’d like the results to go to files, use an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;annotate&lt;/code&gt; option instead of a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;show&lt;/code&gt; option.&lt;/p&gt;
  1183.  
  1184. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;rebar3 typer &lt;span class=&quot;nt&quot;&gt;--annotate&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1185.  
  1186. &lt;p&gt;Or&lt;/p&gt;
  1187.  
  1188. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;typer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  1189.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;annotate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  1190. &lt;span class=&quot;p&quot;&gt;]}.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1191.  
  1192. &lt;p&gt;Do you want to annotate &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;include()&lt;/code&gt; files? (Warning, though: there is an &lt;a href=&quot;https://github.com/erlang/otp/issues/5653&quot;&gt;open bug&lt;/a&gt; upstream for this feature.)&lt;/p&gt;
  1193.  
  1194. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;rebar3 typer &lt;span class=&quot;nt&quot;&gt;--annotate-inc-files&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1195.  
  1196. &lt;p&gt;(&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--annotate_inc_files&lt;/code&gt; is also accepted)&lt;/p&gt;
  1197.  
  1198. &lt;p&gt;Or&lt;/p&gt;
  1199.  
  1200. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;typer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  1201.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;annotate_inc_files&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  1202. &lt;span class=&quot;p&quot;&gt;]}.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1203.  
  1204. &lt;p&gt;By default, annotations go into separate &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.ann.erl&lt;/code&gt; files in a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;typer_ann/&lt;/code&gt; directory. Would you rather have the annotations written directly into your code? Use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--annotate-in-place&lt;/code&gt;:&lt;/p&gt;
  1205.  
  1206. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;rebar3 typer &lt;span class=&quot;nt&quot;&gt;--annotate-in-place&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1207.  
  1208. &lt;p&gt;(&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--annotate_in_place&lt;/code&gt; is also accepted)&lt;/p&gt;
  1209.  
  1210. &lt;p&gt;Or&lt;/p&gt;
  1211.  
  1212. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;typer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  1213.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;annotate_in_place&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  1214. &lt;span class=&quot;p&quot;&gt;]}.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1215.  
  1216. &lt;p&gt;By default, only functions with missing specs will be addressed, in any of these modes.&lt;/p&gt;
  1217.  
  1218. &lt;h3 id=&quot;what-format-to-use&quot;&gt;What format to use&lt;/h3&gt;
  1219.  
  1220. &lt;p&gt;By default, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3_typer&lt;/code&gt; will use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-spec&lt;/code&gt; syntax. If you’d prefer EDoc syntax (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;@spec&lt;/code&gt;), you can specify that:&lt;/p&gt;
  1221.  
  1222. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;rebar3 typer &lt;span class=&quot;nt&quot;&gt;--edoc&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1223.  
  1224. &lt;p&gt;Or&lt;/p&gt;
  1225.  
  1226. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;typer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  1227.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;edoc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  1228. &lt;span class=&quot;p&quot;&gt;]}.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1229.  
  1230. &lt;h3 id=&quot;how-to-handle-pre-existing-type-specs&quot;&gt;How to handle pre-existing type specs&lt;/h3&gt;
  1231.  
  1232. &lt;p&gt;You have some options for what to do when it finds type specs already in place. If you have particular files with specs, and you want &lt;em&gt;those&lt;/em&gt; specs to serve as the basis for additional type specs:&lt;/p&gt;
  1233.  
  1234. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;rebar3 typer &lt;span class=&quot;nt&quot;&gt;-T&lt;/span&gt; f1,f2&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1235.  
  1236. &lt;p&gt;Or&lt;/p&gt;
  1237.  
  1238. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;typer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  1239.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;typespec_files&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;f1&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;f2&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]}&lt;/span&gt;
  1240. &lt;span class=&quot;p&quot;&gt;]}.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1241.  
  1242. &lt;p&gt;(This accepts file names or directory names. But another warning: this &lt;a href=&quot;https://github.com/erlang/otp/issues/5657&quot;&gt;also has an open bug upstream&lt;/a&gt;.)&lt;/p&gt;
  1243.  
  1244. &lt;p&gt;If you’d rather existing type specs be completely ignored:&lt;/p&gt;
  1245.  
  1246. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;rebar3 typer &lt;span class=&quot;nt&quot;&gt;--no_spec&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1247.  
  1248. &lt;p&gt;Or&lt;/p&gt;
  1249.  
  1250. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;typer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  1251.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;no_spec&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  1252. &lt;span class=&quot;p&quot;&gt;]}.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1253.  
  1254. &lt;p&gt;Finally, TypEr has an option to show Dialyzer’s success typings, which is reachable through this plugin. This is undocumented upstream.&lt;/p&gt;
  1255.  
  1256. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;rebar3 typer &lt;span class=&quot;nt&quot;&gt;--show_success_typings&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1257.  
  1258. &lt;p&gt;Or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--show-success-typings&lt;/code&gt;.&lt;/p&gt;
  1259.  
  1260. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;typer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  1261.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;show_succ&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  1262. &lt;span class=&quot;p&quot;&gt;]}.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1263.  
  1264. &lt;hr /&gt;
  1265.  
  1266. &lt;p&gt;This project was part of Hack Week, the twice-yearly event where employees get to decide what we build. We learn things, try things out, and get to work with different folks than usual.  You can &lt;a href=&quot;https://tech.nextroll.com/blog/culture/2019/11/26/hackweek-at-nextroll.html&quot;&gt;read more about Hack Week&lt;/a&gt;.&lt;/p&gt;
  1267.  
  1268. </description>
  1269.    </item>
  1270.    
  1271.    
  1272.    
  1273.    <item>
  1274.      <title>Rustenstein 3D: Game programming like it&apos;s 1992</title>
  1275.      <link>https://tech.nextroll.com/blog/dev/2022/02/02/rustenstein.html</link>
  1276.      <pubDate>Wed, 02 Feb 2022 00:00:00 -0800</pubDate>
  1277.      <author></author>
  1278.      <guid isPermaLink="false">https://tech.nextroll.com/blog/dev/2022/02/02/rustenstein</guid>
  1279.      <description>&lt;p&gt;Twice a year, NextRoll celebrates Hack Week, where employees get to work for a week on a project of their choice. It’s an excellent opportunity to experiment, learn new technologies and team up with people from across the company.  You can learn all about Hack Week &lt;a href=&quot;https://tech.nextroll.com/blog/culture/2019/11/26/hackweek-at-nextroll.html&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
  1280.  
  1281. &lt;p&gt;As NextRoll increasingly adopts the Rust programming language, it’s common for engineers to use Hack Week as an opportunity to gain experience with it. Another popular choice is to work on video games and, as you may have guessed, we frequently see them combined in Rust video game projects. Last year, a group of us worked on extending my &lt;a href=&quot;https://github.com/facundoolano/rpg-cli/&quot;&gt;rpg-cli game&lt;/a&gt;.  This time, though, we wanted to step it up a notch with a project that would exercise some of Rust’s strengths: low-level programming, intense computations and C data interoperability. And so we decided to port the classic &lt;a href=&quot;https://en.wikipedia.org/wiki/Wolfenstein_3D&quot;&gt;Wolfenstein 3D game&lt;/a&gt; to Rust.&lt;/p&gt;
  1282.  
  1283. &lt;h1 id=&quot;background&quot;&gt;Background&lt;/h1&gt;
  1284.  
  1285. &lt;p&gt;id Software was famous for pushing the envelope of PC game programming: first by implementing NES-like side-scrollers on hardware that wasn’t prepared for it, then practically inventing and dominating the 3D first-person shooter genre, then making network and internet multiplayer a reality. Along the way, they also popularized the shareware distribution method, encouraged community modding and open-sourced all of their hit titles. &lt;a href=&quot;https://en.wikipedia.org/wiki/Masters_of_Doom&quot;&gt;Masters of Doom&lt;/a&gt; by David Kushner tells the story; Fabien Sanglard’s &lt;a href=&quot;https://fabiensanglard.net/gebb/index.html&quot;&gt;Game Engine Black Books&lt;/a&gt; explain the technical details.&lt;/p&gt;
  1286.  
  1287. &lt;center&gt;
  1288.    &lt;img alt=&quot;The Game Engine Black Book&quot; src=&quot;/images/post_images/rustestein/book.png&quot; style=&quot;width: 50%&quot; /&gt;
  1289.    &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;
  1290. &lt;/center&gt;
  1291.  
  1292. &lt;p&gt;Perhaps less famous than its successors Doom and Quake, Wolfenstein 3D is a big milestone in id Software’s evolution and in PC gaming in general. In addition, because its technology is more primitive, the source code is more approachable for study and implementation. The game doesn’t have a real 3D engine but rather simulates a 3D world from a 2D map using a technique called &lt;em&gt;Ray Casting&lt;/em&gt;. All the drawing is done by directly putting pixels on the screen.&lt;/p&gt;
  1293.  
  1294. &lt;p&gt;A few years ago, after reading the Wolfenstein black book, I spent some time trying to &lt;a href=&quot;https://github.com/facundoolano/wolf4py&quot;&gt;port it to Python&lt;/a&gt;, based on another modern port, &lt;a href=&quot;https://github.com/facundoolano/wolf4sdl&quot;&gt;wolf4sdl&lt;/a&gt;. I tried to remain as close as possible to the original source, which proved to be very difficult, so I eventually dropped the project. More recently, &lt;a href=&quot;https://github.com/Oppen&quot;&gt;Mario Rugiero&lt;/a&gt;, who also read the book, proposed a Rust port as a project for this Hack Week. Several people jumped in, and so did I; although, based on my previous experience, the enterprise still seemed daunting: some of us were new to Rust, some had never played Wolf, some hadn’t read the book yet, and none had implemented ray casting before. We started without much hope of having something to show by the end of the week, but we saw enormous learning opportunities, so we dove right in.&lt;/p&gt;
  1295.  
  1296. &lt;h1 id=&quot;development&quot;&gt;Development&lt;/h1&gt;
  1297.  
  1298. &lt;p&gt;We roughly identified some components of the game that could be tackled separately, so each member picked one and went to work on it:&lt;/p&gt;
  1299.  
  1300. &lt;ul&gt;
  1301.  &lt;li&gt;graphic files decompression and parsing&lt;/li&gt;
  1302.  &lt;li&gt;map files decompression, parsing and interpretation&lt;/li&gt;
  1303.  &lt;li&gt;SDL graphic manipulation and texture rendering&lt;/li&gt;
  1304.  &lt;li&gt;ray casting&lt;/li&gt;
  1305.  &lt;li&gt;game loop and input management&lt;/li&gt;
  1306.  &lt;li&gt;world rendering&lt;/li&gt;
  1307. &lt;/ul&gt;
  1308.  
  1309. &lt;p&gt;In the cases where the output of one component was required as input for the next, we used pre-parsed or hard-coded data, extracted from the reference wolf4py and wolf4sdl implementations: decompressed binary dumps of assets, hardcoded maps and walls, etc. This allowed us to make progress in parallel.&lt;/p&gt;
  1310.  
  1311. &lt;h2 id=&quot;assets&quot;&gt;Assets&lt;/h2&gt;
  1312. &lt;p&gt;The first task of porting the game is to read its data. Wolfenstein ships with a set of files for its different assets: graphics (images, textures and sprites), audio (music and sound effects) and maps. One of the complications is that each version of the game has slightly different files, with different offsets and, in some cases, using different compression methods. For Rustenstein, we used the .WL1 files of the shareware version, which we &lt;a href=&quot;https://github.com/AdRoll/rustenstein/tree/main/shareware&quot;&gt;included in the repository&lt;/a&gt;.&lt;/p&gt;
  1313.  
  1314. &lt;p&gt;Each file uses a different combination of several compression algorithms, and we had to port the decompression routine for each of them to Rust:&lt;/p&gt;
  1315. &lt;ul&gt;
  1316.  &lt;li&gt;the traditional &lt;a href=&quot;https://moddingwiki.shikadi.net/wiki/Huffman_Compression&quot;&gt;Huffman compression&lt;/a&gt;&lt;/li&gt;
  1317.  &lt;li&gt;&lt;a href=&quot;https://moddingwiki.shikadi.net/wiki/Id_Software_RLEW_compression&quot;&gt;RLEW compression&lt;/a&gt;, a run-length encoding algorithm that works at the word level&lt;/li&gt;
  1318.  &lt;li&gt;&lt;a href=&quot;https://moddingwiki.shikadi.net/wiki/Carmack_compression&quot;&gt;“Carmack”&lt;/a&gt; compression, which is John Carmack’s variant of the LZ (Lempel-Ziv) method. According to the Black Book, without much access to the literature, Carmack would “invent” an algorithm only to find out later that someone else had done it before.&lt;/li&gt;
  1319. &lt;/ul&gt;
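&lt;p&gt;To give a feel for the simplest of the three, RLEW expansion can be sketched as below. This is not the Rustenstein code: the function name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rlew_expand&lt;/code&gt; is hypothetical, and the layout (a marker word followed by a repeat count and the value to repeat) is assumed from the RLEW description linked above, with the marker word supplied by the caller.&lt;/p&gt;

```rust
/// Expand an RLEW-encoded stream of 16-bit words.
///
/// Assumed layout: the sequence `tag, count, value` expands to `count`
/// copies of `value`; any other word is copied to the output unchanged.
fn rlew_expand(input: &[u16], tag: u16) -> Vec<u16> {
    let mut output = Vec::with_capacity(input.len());
    let mut words = input.iter().copied();

    while let Some(word) = words.next() {
        if word == tag {
            // A run: the next two words are the repeat count and the value.
            match (words.next(), words.next()) {
                (Some(count), Some(value)) => {
                    output.extend(std::iter::repeat(value).take(count as usize));
                }
                // Truncated run at the end of the input: stop instead of panicking.
                _ => break,
            }
        } else {
            // A literal word.
            output.push(word);
        }
    }
    output
}
```

&lt;p&gt;With a marker of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0xABCD&lt;/code&gt;, the words &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[1, 0xABCD, 3, 7, 2]&lt;/code&gt; expand to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[1, 7, 7, 7, 2]&lt;/code&gt;.&lt;/p&gt;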
  1320.  
  1321. &lt;p&gt;The original Wolf engine has a Memory Manager component to handle memory allocation and compacting (instead of the traditional C &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;malloc&lt;/code&gt;) as well as a Page Manager to move assets from disk to RAM. Both components are unnecessary on modern hardware, where we can safely assume that all assets fit in memory, so we did not include them in our port.&lt;/p&gt;
  1322.  
  1323. &lt;p&gt;Parsing and decompression code can be found &lt;a href=&quot;https://github.com/AdRoll/rustenstein/blob/3c4452b38dad2ba0f5f3d2c07209b89bd61e50c2/src/map_parser.rs&quot;&gt;here&lt;/a&gt; for maps, and &lt;a href=&quot;https://github.com/AdRoll/rustenstein/blob/3c4452b38dad2ba0f5f3d2c07209b89bd61e50c2/src/cache.rs&quot;&gt;here&lt;/a&gt; for the rest of the assets.&lt;/p&gt;
  1324.  
  1325. &lt;h2 id=&quot;maps&quot;&gt;Maps&lt;/h2&gt;
  1326. &lt;p&gt;Wolfenstein 3D maps are defined as 64x64 grids of tiles. Each map has two layers of tiles: one for walls and doors, and another to place the player, enemies, and bonus items. The different tile values determine what texture to render on walls, what locks are required for doors, what direction the player is facing, etc. All walls have the same height and, since they are represented as blocks in the tile grid, they always meet at right angles; while this constrains level design, it dramatically simplifies the ray casting algorithm used to draw the 3D world.&lt;/p&gt;
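&lt;p&gt;One way to picture this layout in Rust is sketched below. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Map&lt;/code&gt; type, its field names, the row-major ordering and the use of 16-bit tile values are all assumptions made for illustration; they are not taken from the Rustenstein source.&lt;/p&gt;

```rust
/// Maps are 64x64 grids of tiles.
const MAP_SIZE: usize = 64;

/// Hypothetical in-memory form of one level: two planes of 64x64 tile
/// values, assumed here to be stored row by row.
struct Map {
    walls: Vec<u16>,   // layer 1: walls and doors
    objects: Vec<u16>, // layer 2: player start, enemies, bonus items
}

impl Map {
    /// Build a map, rejecting planes that are not exactly 64x64 tiles.
    fn new(walls: Vec<u16>, objects: Vec<u16>) -> Option<Map> {
        let expected = MAP_SIZE * MAP_SIZE;
        if walls.len() == expected && objects.len() == expected {
            Some(Map { walls, objects })
        } else {
            None
        }
    }

    /// Tile value of the wall/door layer at column `x`, row `y`.
    fn wall_at(&self, x: usize, y: usize) -> Option<u16> {
        Self::index(x, y).map(|i| self.walls[i])
    }

    /// Tile value of the object layer at column `x`, row `y`.
    fn object_at(&self, x: usize, y: usize) -> Option<u16> {
        Self::index(x, y).map(|i| self.objects[i])
    }

    /// Convert grid coordinates into an offset within a plane.
    fn index(x: usize, y: usize) -> Option<usize> {
        if x < MAP_SIZE && y < MAP_SIZE {
            Some(y * MAP_SIZE + x)
        } else {
            None
        }
    }
}
```

&lt;p&gt;Keeping the two layers as separate flat vectors means a ray caster only has to consult the wall layer, while game logic can read the object layer independently.&lt;/p&gt;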
  1327.  
  1328. &lt;p&gt;Below is the first map of the first episode, as seen in a Wolfenstein map editor:&lt;/p&gt;
  1329.  
  1330. &lt;center&gt;
  1331.    &lt;img alt=&quot;E1M1 viewed on an editor&quot; src=&quot;/images/post_images/rustestein/map1.png&quot; style=&quot;width: 50%&quot; /&gt;
  1332.    &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;
  1333. &lt;/center&gt;
  1334.  
  1335. &lt;p&gt;And the same map as ASCII, as printed by our debugging code:&lt;/p&gt;
  1336.  
  1337. &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
  1338. WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
  1339. WWWWWWWWWWWWWWWWWWWWWWWWWWWWW           WWWWWWWWWWWWWWWWWWWWWWWW
  1340. WWWWWW    WWWWWWWWWWWWWWWWWWW           WWWWWWWWWWWWWWWWWWWWWWWW
  1341. WWWWWW    WWWWWWWW          W           W   WWWWWWWWWWWWWWWWWWWW
  1342. WWWWWW     WWWWWWW          |           | W WWWWWWWWWWWWWWWWWWWW
  1343. WWWWWW     WWWWWWW          W           W   WWWWWWWWWWWWWWWWWWWW
  1344. WWWWWWWWWWWWWWWWWW   WWWWWWWW           WWWWWWWWWWWWWWWWWWWWWWWW
  1345. WWWWWW         WWW   WWWWWWWW           WWWWWWWWWWWWWWWWWWWWWWWW
  1346. WWWWWW         WWW   WWWWWWWWWWWWW-WWWWWWWWWWWWWWWWWWWWWWWWWWWWW
  1347. WWWWWW         W       WWWWWWWWWW   WWWWWWWWWWWWWWWWWWWWWWWWWWWW
  1348. WWWWWW         |       WWWWWWWWWW   WWWWWWWWWWWWWWWWWWWWWWWWWWWW
  1349. WWWWWW         W       WWWWWWWWWW   WWWWWWWWWWWWWWWWWWWWWWWWWWWW
  1350. WWWWWW         WWWWWWWWWWWWWWWWWW   WWWWWWWWWWWWWWWWWWWWWWWWWWWW
  1351. W   WW         WWWWWWWWWWWWWWWWWW   WWWWWWWWWWWWWWWWWWWWWWWWWWWW
  1352. W   WWWWWW-WWWWWWWWWWWWWWWWWWWWWW   WWWWWWWWWWWWWWWWWWWWWWWWWWWW
  1353. W   WWWWW   WWWWWWWWWWWWWWWW  WW      WWWWWWWWWWWWWWWWWWWWWWWWWW
  1354. WW-WWWWWW   WWWWWWWWWWWWWWWW  WWW   WWWWWWWWWWWWWWWWWWWWWWWWWWWW
  1355. W   WWWWW   WWWWWWWWWWWWWWWW  WWW   WWWWWWWWWWWWWWWWWWWWWWWWWWWW
  1356. W   W   W   WWWWWWWWWWWWWWWW  WWW   WWWWWWWWWWWWWWWWWWWWWWWWWWWW
  1357. W       W   WWWWWWWWWWWWWWWWWWWWW   WWWWWWWWWWWWWWWWWWWWWWWWWWWW
  1358. W   W   W   WWWWWWWWWWWWWWWWWWWWW   WWWWWWWWWWWWWWWWWWWWWWWWWWWW
  1359. W   WWWWWW-WWWWWWWWWWWWWWWWWWWWWWW-WWWWWWWWWWWWWWWWWWWWWWWWWWWWW
  1360. W   WW         WWWWWWWWWWWWWWWWW     WWWWWWWWWWWWWWWWW        WW
  1361. W   WW         WWWWWWWWWWWW               WWWWWWWWWWWW        WW
  1362. W   WW         WWWWWWWWWWWW               WWWWWWWWWWWW        WW
  1363. W    W         WWWWWWWWWWWW                W         W        WW
  1364. W    |         WWWWWWWWWWWW                |         |        WW
  1365. W    W         WWWWWWWWWWWW                W         W        WW
  1366. W   WW         WWWWWWWWWWWW               WWWW    WWWW        WW
  1367. W   WW         WWWWWWWWWWWW               WWWWW  WWWWW        WW
  1368. W   WW         WWWWWWWWWWWWWWWWW     WWWWWWWWWW  WWWWW        WW
  1369. W   WWWWWW-WWWWWWWWWWWWWWWWWWWWWWW-WWWWWWWWWWWW  WWWWWWW WW WWWW
  1370. W   WWWWW   WWWWWWWWWWWWWWWWWWWWW   WWWWWWWWWWW  WWWWWWWWWWWWWWW
  1371. W   WWWWW   WWWWWWWWWWWWWWWWWWWWW   WWWWWWWWWWW  WWWWWWWWWWWWWWW
  1372. W   W   W   WWWWWWWWWWWWWWWWWWWWW   WWWWWWWWWWW  WWWWWWWWWWWWWWW
  1373. W       W   WWWWWWWWWWWWWWWWWWWWW   WWWWWWWWWWW  WWW W W W WWWWW
  1374. W   W   W   WWWWWWWWWWWWWWWWWWWWW   WWWWWWWWWWW   W         WWWW
  1375. W   WWWWW   WWWWWWWWWWWWWWWWWWWWW   WWWWWWWWWWW   |         WWWW
  1376. W   WWWWW   WWWWWWWWWWWWWWWWWWWWW   WWWWWWWWWWW   W         WWWW
  1377. W                W      WWWWWWWWW   WWWWWWWWWWWWWWWW W W W WWWWW
  1378. W                |      W WWWWWWW   WWWWWWWWWWWWWWWWWWWWWWWWWWWW
  1379. W                W     WWWWWWWWWW   WWWWWWWWWWWWWWWWWWWWWWWWWWWW
  1380. WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW   WWWWWWWWWWWWWWWWWWWWWWWWWWWW
  1381. WWWWWWWWWWWW  W  W  WWWWWWWWWWWWWW-WWWWWWWWWWWWWWWWWWWWWWWWWWWWW
  1382. WWWWWWWWWW W  W  W WWWWWWWWW    W   W    WWWWWWWWWWWWWWWWWWWWWWW
  1383. WWWWWWWWWWWW     WWWWWWWWWWW    |   |    WWWWWWWWWWWWWWWWWWWWWWW
  1384. WWWWWWWWWWWWWWWWWWWWWWWWWWWW    W   W    WWWWWWWWWWWWWWWWWWWWWWW
  1385. WWWWWWWWWWWWW  WWWWWWWWWWWWW    W   W    WWWWWWWWWWWWWWWWWWWWWWW
  1386. WWWWWWWWWWWWW  WWWWWWWWWWWWWWWWWW   WWWWWWWWWWWWWWWWWWWWWWWWWWWW
  1387. WWWWWWWWWWWWWWWWWWWWWWWWWWWW    W   W    WWWWWWWWWWWWWWWWWWWWWWW
  1388. WWWWWWWWWWWWWWWWWWWWWWWWWWWW P  |   |    WWWWWWWWWWWWWWWWWWWWWWW
  1389. WWWWWWWWWWWWWWWWWWWWWWWWWWWW    W   W    WWWWWWWWWWWWWWWWWWWWWWW
  1390. WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW   WWWWWWWWWWWWWWWWWWWWWWWWWWWW
  1391. WWWWWWWWWWWWWWWWWWWWWWWWWWWW             WWWWWWWWWWWWWWWWWWWWWWW
  1392. WWWWWWWWWWWWWWWWWWWWWWWWWWWW             WWWWWWWWWWWWWWWWWWWWWWW
  1393. WWWWWWWWWWWWWWWWWWWWWWWWWWWW             WWWWWWWWWWWWWWWWWWWWWWW
  1394. WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
  1395. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  1396.  
  1397. &lt;p&gt;&lt;br /&gt;&lt;/p&gt;
  1398.  
  1399. &lt;h2 id=&quot;pixel-drawing&quot;&gt;Pixel drawing&lt;/h2&gt;
  1400. &lt;p&gt;For the graphic assets, decompressing and loading the data to memory is just half of the story. The binary chunks that make each graphic (image, sprite or texture) are arranged explicitly for fast rendering on the VGA displays the game was initially designed for. This means that the graphics are rotated to be drawn in columns, and the columns themselves appear interleaved in the file since VGA allowed for parallel writing to four different video memory banks.&lt;/p&gt;
  1401.  
  1402. &lt;p&gt;Each byte in the graphic binary chunks is an index into the 256-color palette used in Wolfenstein 3D. The reference wolf4sdl implementation would write those chunks to an SDL surface, which would, in turn, be translated to RGB colors before being copied to the screen, as described in &lt;a href=&quot;http://sandervanderburg.blogspot.com/2014/05/rendering-8-bit-palettized-surfaces-in.html&quot;&gt;this blog post&lt;/a&gt;. Since the &lt;a href=&quot;https://github.com/Rust-SDL2/rust-sdl2&quot;&gt;Rust bindings for SDL&lt;/a&gt; use a different set of abstractions (and, in particular, they don’t expose the &lt;a href=&quot;https://wiki.libsdl.org/SDL_ConvertPixels&quot;&gt;SDL_ConvertPixels&lt;/a&gt; function), we opted for converting from palette index to RGB colors on the fly, writing directly to an RGB texture that then gets copied to the renderable canvas. This means that the rendering routines need to be adapted to write red, green and blue bytes to form each pixel, instead of the single palette index byte.&lt;/p&gt;
  1403.  
  1404. &lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;put_pixel&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;u8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pitch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;usize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u32&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u32&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;color&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;u8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  1405.    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;color&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  1406.    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;usize&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pitch&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;usize&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  1407.    &lt;span class=&quot;n&quot;&gt;buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  1408.    &lt;span class=&quot;n&quot;&gt;buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  1409.    &lt;span class=&quot;n&quot;&gt;buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  1410. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  1411. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  1412.  
  1413. &lt;p&gt;&lt;br /&gt;&lt;/p&gt;
  1414.  
  1415. &lt;p&gt;The two graphics rendering routines we implemented were ported directly from the wolf4py implementation, which in turn was ported almost line by line from the wolf4sdl reference fork. The first routine draws a full image directly to the screen. It is used for the title screen as well as the player status bar at the bottom of the in-game view:&lt;/p&gt;
  1416.  
  1417. &lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;draw_to_texture&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;texture&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Texture&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pic&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Picture&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;color_map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ColorMap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  1418.    &lt;span class=&quot;n&quot;&gt;texture&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.with_lock&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;u8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pitch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;usize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  1419.        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;..&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pic&lt;/span&gt;&lt;span class=&quot;py&quot;&gt;.height&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  1420.            &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;..&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pic&lt;/span&gt;&lt;span class=&quot;py&quot;&gt;.width&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  1421.                &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;source_index&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
  1422.                    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pic&lt;/span&gt;&lt;span class=&quot;py&quot;&gt;.width&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pic&lt;/span&gt;&lt;span class=&quot;py&quot;&gt;.width&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pic&lt;/span&gt;&lt;span class=&quot;py&quot;&gt;.height&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  1423.                &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;color&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pic&lt;/span&gt;&lt;span class=&quot;py&quot;&gt;.data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;source_index&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;usize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;];&lt;/span&gt;
  1424.                &lt;span class=&quot;nf&quot;&gt;put_pixel&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pitch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;color_map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;color&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;usize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]);&lt;/span&gt;
  1425.            &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  1426.        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  1427.    &lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;
  1428. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  1429. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  1430.  
  1431. &lt;p&gt;&lt;br /&gt;&lt;/p&gt;
  1432. &lt;center&gt;
  1433.    &lt;img alt=&quot;Wolfenstein 3D title screen&quot; src=&quot;/images/post_images/rustestein/titlepic.png&quot; style=&quot;width: 100%&quot; /&gt;
  1434.    &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;
  1435. &lt;/center&gt;
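
&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;source_index&lt;/code&gt; expression in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;draw_to_texture&lt;/code&gt; is easier to follow when it is pulled out into a helper. The sketch below assumes, based only on that expression, that the picture data is stored as four consecutive planes of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(width / 4) * height&lt;/code&gt; bytes each, where plane &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;x &amp;amp; 3&lt;/code&gt; holds every fourth pixel column. The names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;planar_index&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;deplane&lt;/code&gt; are hypothetical and do not appear in the linked code:&lt;/p&gt;

```rust
/// Index of pixel (x, y) in a planar picture buffer.
/// Assumed layout: four consecutive planes, each `(width / 4) * height` bytes,
/// where plane `x & 3` stores the columns x, x + 4, x + 8, ...
fn planar_index(x: u32, y: u32, width: u32, height: u32) -> usize {
    let plane_width = width >> 2; // bytes per row inside one plane
    let plane = x & 3; // which of the four planes holds this column
    let offset_in_plane = y * plane_width + (x >> 2);
    (plane * plane_width * height + offset_in_plane) as usize
}

/// Convert a planar buffer into a plain row-major buffer (one byte per pixel).
/// `width` is assumed to be a multiple of 4.
fn deplane(data: &[u8], width: u32, height: u32) -> Vec<u8> {
    let mut linear = Vec::with_capacity((width * height) as usize);
    for y in 0..height {
        for x in 0..width {
            linear.push(data[planar_index(x, y, width, height)]);
        }
    }
    linear
}

fn main() {
    let (width, height) = (8u32, 2u32);

    // Build a planar buffer from a picture whose pixel value is y * width + x.
    let mut planar = vec![0u8; (width * height) as usize];
    for y in 0..height {
        for x in 0..width {
            planar[planar_index(x, y, width, height)] = (y * width + x) as u8;
        }
    }

    // De-planing must give back the row-major picture.
    let expected: Vec<u8> = (0..16).collect();
    assert_eq!(deplane(&planar, width, height), expected);
}
```

&lt;p&gt;Running a conversion like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;deplane&lt;/code&gt; once per picture would leave only a straight copy for the drawing loop, at the cost of keeping a second buffer in memory.&lt;/p&gt;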
  1436.  
  1437. &lt;p&gt;The second routine, a much more complex one, is in charge of drawing sprites and is currently used to display the player weapon. A similar but even more complicated function is left to be ported: the one that draws scaled images such as wall textures and enemy sprites.&lt;/p&gt;
  1438.  
  1439. &lt;center&gt;
  1440.    &lt;img alt=&quot;Doge weapon&quot; src=&quot;/images/post_images/rustestein/doge.png&quot; style=&quot;width: 100%&quot; /&gt;
  1441.    &lt;i&gt;An unexpected sprite appeared instead of the weapon during development.&lt;/i&gt;
  1442.    &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;
  1443. &lt;/center&gt;
  1444.  
  1445. &lt;p&gt;It would be desirable to improve this implementation such that most of the processing is done once as part of the asset loading step, and the binary chunks are kept in memory, ready to be written to the screen.&lt;/p&gt;
  1446.  
  1447. &lt;p&gt;The related code can be found &lt;a href=&quot;https://github.com/AdRoll/rustenstein/blob/3c4452b38dad2ba0f5f3d2c07209b89bd61e50c2/src/main.rs#L194-L332&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
  1448.  
  1449. &lt;h2 id=&quot;ray-casting&quot;&gt;Ray Casting&lt;/h2&gt;
  1450. &lt;p&gt;At the heart of the Wolfenstein 3D engine is the Ray Casting algorithm. This routine allows us to project a 2D world (defined by the 64x64 tilemap) into a 3D view, solely based on 2D operations. The algorithm can be summarized as follows:&lt;/p&gt;
  1451.  
  1452. &lt;ol&gt;
  1453.  &lt;li&gt;Cast a ray from the player’s current position for each pixel column in the screen width. For example, the classical Wolfenstein 3D resolution is 320x200, so this means casting 320 rays to draw a frame.&lt;/li&gt;
  1454.  &lt;li&gt;Extend the ray in a direction determined by the current horizontal pixel, the player’s position and its field of view, until it hits a wall in the map. Because the walls are rectangular, the calculations to extend the rays are greatly simplified, since there’s a constant distance between a tile and the next.&lt;/li&gt;
  1455.  &lt;li&gt;Once the ray intersects a wall, calculate the distance from the player to that wall, using trigonometry.&lt;/li&gt;
  1456.  &lt;li&gt;Set a height for the wall that is inversely proportional to the calculated distance. That is, the farther the wall is from the player, the smaller it looks from the player’s perspective (and the shorter the column of pixels we need to draw on the screen).&lt;/li&gt;
  1457. &lt;/ol&gt;
  1458.  
  1459. &lt;center&gt;
  1460.    &lt;img alt=&quot;Sample raycast map&quot; src=&quot;/images/post_images/rustestein/raycast1.gif&quot; style=&quot;width: 50%&quot; /&gt;
  1461.    &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;
  1462. &lt;/center&gt;
  1463.  
  1464. &lt;p&gt;Below is a simplified JavaScript version of the algorithm, based on &lt;a href=&quot;https://github.com/vinibiavatti1/RayCastingTutorial&quot;&gt;this tutorial&lt;/a&gt;:&lt;/p&gt;
  1465.  
  1466. &lt;div class=&quot;language-javascript highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kd&quot;&gt;function&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;rayCasting&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;screen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;player&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  1467.  &lt;span class=&quot;kd&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;precision&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  1468.  &lt;span class=&quot;kd&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;incrementAngle&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;player&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;fieldOfView&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;screen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;width&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  1469.  
  1470.  &lt;span class=&quot;kd&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;wallHeights&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[];&lt;/span&gt;
  1471.  &lt;span class=&quot;kd&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;rayAngle&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;player&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;angle&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;player&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;fieldOfView&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  1472.  &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kd&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;rayCount&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;rayCount&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;screen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;width&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;rayCount&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  1473.  
  1474.    &lt;span class=&quot;c1&quot;&gt;// start the ray at the player position&lt;/span&gt;
  1475.    &lt;span class=&quot;kd&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;ray&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  1476.      &lt;span class=&quot;na&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;player&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  1477.      &lt;span class=&quot;na&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;player&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;y&lt;/span&gt;
  1478.    &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  1479.  
  1480.    &lt;span class=&quot;c1&quot;&gt;// the ray moves at constant increments&lt;/span&gt;
  1481.    &lt;span class=&quot;kd&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;rayCos&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;Math&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;cos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;degreeToRadians&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;rayAngle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;precision&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  1482.    &lt;span class=&quot;kd&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;raySin&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;Math&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;sin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;degreeToRadians&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;rayAngle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;precision&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  1483.  
  1484.    &lt;span class=&quot;c1&quot;&gt;// advance the ray until it finds a wall (a non zero tile)&lt;/span&gt;
  1485.    &lt;span class=&quot;kd&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;wall&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  1486.    &lt;span class=&quot;k&quot;&gt;while&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;wall&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  1487.      &lt;span class=&quot;nx&quot;&gt;ray&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;rayCos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  1488.      &lt;span class=&quot;nx&quot;&gt;ray&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;y&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;raySin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  1489.      &lt;span class=&quot;nx&quot;&gt;wall&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;Math&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;floor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ray&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)][&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;Math&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;floor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ray&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)];&lt;/span&gt;
  1490.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  1491.  
  1492.    &lt;span class=&quot;c1&quot;&gt;// calculate the distance from the player to the wall hit&lt;/span&gt;
  1493.    &lt;span class=&quot;kd&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;distance&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;Math&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;sqrt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;Math&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;pow&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;player&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;ray&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;Math&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;pow&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;player&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;y&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;ray&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
  1494.  
  1495.    &lt;span class=&quot;c1&quot;&gt;// calculate height at current x inversely proportional to the distance&lt;/span&gt;
  1496.    &lt;span class=&quot;nx&quot;&gt;wallHeights&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;push&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;Math&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;floor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;screen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;halfHeight&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;distance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
  1497.  
  1498.    &lt;span class=&quot;c1&quot;&gt;// increment the angle for the next ray&lt;/span&gt;
  1499.    &lt;span class=&quot;nx&quot;&gt;rayAngle&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;incrementAngle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  1500.  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  1501.  
  1502.  &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;wallHeights&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  1503. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  1504. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  1505.  
  1506. &lt;p&gt;&lt;br /&gt;&lt;/p&gt;
  1507.  
  1508. &lt;p&gt;For a ray casting implementation closer to the original Wolfenstein 3D one, &lt;a href=&quot;https://lodev.org/cgtutor/raycasting.html&quot;&gt;this series of tutorials&lt;/a&gt; is recommended.&lt;/p&gt;
  1509.  
  1510. &lt;p&gt;This routine was clearly the most challenging one we tackled during this Hack Week, but two early decisions reduced the complexity enough for us to deliver something on time. First, we went with the most basic version of the algorithm, which supports walls of solid colors instead of textures. Second, &lt;a href=&quot;https://github.com/qhool&quot;&gt;Josh Burroughs&lt;/a&gt; worked out ray casting separately, based on tutorials, instead of attempting a line-by-line port of the original Carmack implementation (which, quoting Sanglard, is a “fully-handcrafted 740 lines of highly unorthodox and super efficient assembly code”) or of its wolf4sdl counterpart (which is written in C but still relies heavily on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;goto&lt;/code&gt; statements and has many global side effects in addition to calculating wall heights).&lt;/p&gt;
  1511.  
  1512. &lt;p&gt;Here’s what the top-down view of the first Wolf map looked like after integrating it into the ray casting routine:&lt;/p&gt;
  1513.  
  1514. &lt;center&gt;
  1515.    &lt;img alt=&quot;E1M1 as viewed from the ray casting routine&quot; src=&quot;/images/post_images/rustestein/mapcast.png&quot; style=&quot;width: 100%&quot; /&gt;
  1516.    &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;
  1517. &lt;/center&gt;
  1518.  
  1519. &lt;p&gt;The full implementation can be found &lt;a href=&quot;https://github.com/AdRoll/rustenstein/blob/3c4452b38dad2ba0f5f3d2c07209b89bd61e50c2/src/ray_caster.rs&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
  1520.  
  1521. &lt;h2 id=&quot;world-rendering&quot;&gt;World Rendering&lt;/h2&gt;
  1522.  
  1523. &lt;p&gt;The 3D world is displayed by first splitting the screen horizontally in two halves, painting the upper half with a ceiling solid color and the lower half with a floor color. After that, a pixel column needs to be drawn with the height received from the ray casting algorithm for each horizontal coordinate. While the algorithm was still in development, we tested the rendering code with hard-coded walls:&lt;/p&gt;
  1524.  
  1525. &lt;center&gt;
  1526.    &lt;img alt=&quot;Rendering test with hard-coded walls&quot; src=&quot;/images/post_images/rustestein/hardcoded.png&quot; /&gt;
  1527.    &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;
  1528. &lt;/center&gt;
  1529.  
  1530. &lt;p&gt;Once the ray casting routine was implemented and fed with an actual Wolfenstein map, we got an array with one wall height for each pixel column of the screen, and we started to see the world:&lt;/p&gt;
  1531.  
  1532. &lt;center&gt;
  1533.    &lt;img alt=&quot;Walls rendered without shading&quot; src=&quot;/images/post_images/rustestein/noshadow.png&quot; /&gt;
  1534.    &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;
  1535. &lt;/center&gt;
  1536.  
  1537. &lt;p&gt;Although we haven’t implemented texture rendering, there are a couple of tricks that improve the appearance of the scene: using different colors for the horizontal and vertical faces of a wall, and making the r, g, b components of each pixel inversely proportional to the distance to the player (which we know from the height of the wall), to generate a darkness effect:&lt;/p&gt;
  1538.  
  1539. &lt;center&gt;
  1540.    &lt;img alt=&quot;Walls rendered with distance-based shading&quot; src=&quot;/images/post_images/rustestein/start.png&quot; /&gt;
  1541.    &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;
  1542. &lt;/center&gt;
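
&lt;p&gt;The darkness effect can be sketched as follows. Because the wall height is already inversely proportional to the distance, scaling each color component by the ratio of the wall height to a maximum height gives the same result without another distance calculation. The function name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;shade_by_height&lt;/code&gt;, its signature and the linear scaling are assumptions made for illustration; the shading function in the linked code may differ:&lt;/p&gt;

```rust
/// An (r, g, b) color, one byte per component.
type Color = (u8, u8, u8);

/// Scale a color by `wall_height / max_height`, so that distant (short) walls
/// come out darker. Hypothetical helper, not taken from the linked code.
fn shade_by_height(color: Color, wall_height: u32, max_height: u32) -> Color {
    if max_height == 0 {
        return color; // nothing to scale against
    }
    // Clamp so that walls taller than the maximum keep their full color.
    let factor = wall_height.min(max_height) as f64 / max_height as f64;
    let scale = |c: u8| (c as f64 * factor) as u8;
    let (r, g, b) = color;
    (scale(r), scale(g), scale(b))
}

fn main() {
    let wall: Color = (200, 100, 50);
    // A wall at full height keeps its color.
    assert_eq!(shade_by_height(wall, 100, 100), (200, 100, 50));
    // A wall twice as far away is half as tall, so half as bright.
    assert_eq!(shade_by_height(wall, 50, 100), (100, 50, 25));
}
```

&lt;p&gt;A linear factor fades distant walls all the way to black; clamping the factor to a small minimum would keep them faintly visible.&lt;/p&gt;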
  1543.  
  1544. &lt;p&gt;The &lt;a href=&quot;https://github.com/AdRoll/rustenstein/blob/3c4452b38dad2ba0f5f3d2c07209b89bd61e50c2/src/main.rs#L99-L145&quot;&gt;world rendering code&lt;/a&gt;, then, looks like this:&lt;/p&gt;
  1545.  
  1546. &lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;texture&lt;/span&gt;
  1547. &lt;span class=&quot;nf&quot;&gt;.with_lock&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;u8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pitch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;usize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  1548.     &lt;span class=&quot;c1&quot;&gt;// draw floor and ceiling colors&lt;/span&gt;
  1549.     &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;floor&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;color_map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;VGA_FLOOR_COLOR&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;];&lt;/span&gt;
  1550.     &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ceiling&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;color_map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;VGA_CEILING_COLOR&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;];&lt;/span&gt;
  1551.     &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;vm&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;view_height&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  1552.  
  1553.     &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;..&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pix_width&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  1554.         &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;..&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pix_height&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  1555.             &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ceilings&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;darken_color&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ceiling&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;vm&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pix_center&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  1556.             &lt;span class=&quot;nf&quot;&gt;put_pixel&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pitch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ceilings&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  1557.         &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  1558.         &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pix_height&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;..&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pix_height&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  1559.             &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;floors&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;darken_color&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;floor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;vm&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pix_center&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  1560.             &lt;span class=&quot;nf&quot;&gt;put_pixel&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pitch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;floors&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  1561.         &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  1562.     &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  1563.  
  1564.     &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;..&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pix_width&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  1565.         &lt;span class=&quot;c1&quot;&gt;// use different colors for horizontal and vertical wall faces&lt;/span&gt;
  1566.         &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;color&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ray_hits&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;usize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;py&quot;&gt;.horizontal&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  1567.             &lt;span class=&quot;n&quot;&gt;color_map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;150&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  1568.         &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  1569.             &lt;span class=&quot;n&quot;&gt;color_map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;155&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  1570.         &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  1571.  
  1572.         &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;current&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;min&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ray_hits&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;usize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;py&quot;&gt;.height&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pix_center&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  1573.         &lt;span class=&quot;n&quot;&gt;color&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;darken_color&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;color&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;current&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pix_center&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  1574.  
  1575.         &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pix_center&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;current&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;..&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pix_center&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;current&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  1576.             &lt;span class=&quot;nf&quot;&gt;put_pixel&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pitch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;color&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  1577.         &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  1578.     &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  1579. &lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  1580.  
  1581. &lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;darken_color&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;color&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;u8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;u8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;u8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lightness&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u32&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;max&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u32&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;u8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;u8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;u8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  1582.    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;color&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  1583.    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;factor&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lightness&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;f64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;max&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;f64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DARKNESS&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  1584.    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;r&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;f64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;factor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  1585.    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;gs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;g&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;f64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;factor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  1586.    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;f64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;factor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  1587.    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;gs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  1588. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  1589. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
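&lt;p&gt;To make the shading arithmetic concrete, here is a small standalone sketch of the same scaling. The DARKNESS value of 2.0 and the sample color are assumptions picked for easy numbers, not necessarily the values used in the project.&lt;/p&gt;

```rust
// Assumed value, for illustration only; the project's real constant may differ.
const DARKNESS: f64 = 2.0;

// Same scaling as above: each channel is multiplied by lightness / max / DARKNESS.
fn darken_color(color: (u8, u8, u8), lightness: u32, max: u32) -> (u8, u8, u8) {
    let (r, g, b) = color;
    let factor = lightness as f64 / max as f64 / DARKNESS;
    (
        (r as f64 * factor) as u8,
        (g as f64 * factor) as u8,
        (b as f64 * factor) as u8,
    )
}

fn main() {
    let wall = (200, 100, 50);
    // A wall slice filling the full half-height: factor = 100 / 100 / 2.0 = 0.5
    println!("near: {:?}", darken_color(wall, 100, 100)); // (100, 50, 25)
    // A slice a quarter as tall: factor = 25 / 100 / 2.0 = 0.125
    println!("far:  {:?}", darken_color(wall, 25, 100)); // (25, 12, 6)
}
```

&lt;p&gt;The fractional results are truncated by the final cast, and since float-to-integer casts saturate in Rust, a factor above 1 would clamp a channel at 255 instead of wrapping around.&lt;/p&gt;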
  1590.  
  1591. &lt;p&gt;&lt;br /&gt;&lt;/p&gt;
  1592.  
  1593. &lt;h1 id=&quot;putting-it-all-together&quot;&gt;Putting it all together&lt;/h1&gt;
  1594.  
  1595. &lt;p&gt;A day before the demo, we only had pieces of the game: asset loading wasn’t finished, we had show-stopper bugs in map parsing and sprite rendering, and the ray casting engine was working in a 2D hardcoded map, separate from the rest of the project. In an amazing few final hours, everything just fell into place: the bugs were ironed out, the different components fit together and, with a few hacks and a lot of ugly code, we managed to put together an impressive-looking video, just in time for the Hack Week demo session. We even had time to throw in a last-minute face animation of the character! The whole experience reminded me of those stories about video game companies putting together one-off builds in a hurry, just to make it to E3 demos.&lt;/p&gt;
  1596.  
  1597. &lt;p&gt;This is still far from a functioning game, but it surpassed our most optimistic predictions from a few days before. During this week, we learned a fair deal about Rust, and we went further than we could have if we were working on our own. And the project ended up winning the technical award of the event!&lt;/p&gt;
  1598.  
  1599. &lt;center&gt;
  1600.    &lt;img alt=&quot;Doge weapon&quot; src=&quot;/images/post_images/rustestein/gameplay.gif&quot; /&gt;
  1601.    &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;
  1602. &lt;/center&gt;
  1603.  
  1604. &lt;p&gt;The prototype is now &lt;a href=&quot;https://github.com/AdRoll/rustenstein&quot;&gt;published as open-source&lt;/a&gt;, although, as mentioned, the code still needs significant clean-up. Since the project was a lot of fun and, in this first week, we managed to solve some of the most challenging parts (asset loading, ray casting, sprite and wall rendering), we’re eager to continue working on it. Some of the features that we could tackle next are:&lt;/p&gt;
  1605.  
  1606. &lt;ul&gt;
  1607.  &lt;li&gt;Render wall textures&lt;/li&gt;
  1608.  &lt;li&gt;Show and pick up items&lt;/li&gt;
  1609.  &lt;li&gt;Add enemies to the map, implement combat and enemy AI&lt;/li&gt;
  1610.  &lt;li&gt;Implement doors and keys&lt;/li&gt;
  1611.  &lt;li&gt;Implement push walls&lt;/li&gt;
  1612. &lt;/ul&gt;
  1613. </description>
  1614.    </item>
  1615.    
  1616.    
  1617.    
  1618.    <item>
  1619.      <title>Stay on top of your Erlang deps with our latest rebar3 plugin</title>
  1620.      <link>https://tech.nextroll.com/blog/dev/2021/09/01/erlang-rebar3-depup.html</link>
  1621.      <pubDate>Wed, 01 Sep 2021 00:00:00 -0700</pubDate>
  1622.      <author></author>
  1623.      <guid isPermaLink="false">https://tech.nextroll.com/blog/dev/2021/09/01/erlang-rebar3-depup</guid>
  1624.      <description>&lt;p&gt;Over the past few years, we have spent several &lt;a href=&quot;/blog/culture/2019/11/26/hackweek-at-nextroll.html&quot;&gt;hackweeks&lt;/a&gt; writing, improving, and maintaining multiple open-source rebar3 plugins for the Erlang community, such as &lt;a href=&quot;/blog/dev/2020/02/25/erlang-rebar3-format.html&quot;&gt;the formatter&lt;/a&gt;, &lt;a href=&quot;https://hex.pm/packages/rebar3_lint&quot;&gt;Elvis&lt;/a&gt;, or &lt;a href=&quot;/blog/dev/2021/01/06/erlang-rebar3-hank.html&quot;&gt;Hank&lt;/a&gt;. Particularly for Hank, we even wrote an &lt;a href=&quot;https://arxiv.org/abs/2107.08699&quot;&gt;academic paper&lt;/a&gt; that, with the help of &lt;a href=&quot;https://twitter.com/lauramcastro&quot;&gt;Laura M. Castro&lt;/a&gt;, I recently presented at the latest edition of the &lt;a href=&quot;https://icfp21.sigplan.org/home/erlang-2021&quot;&gt;ICFP Erlang Workshop&lt;/a&gt;.&lt;/p&gt;
  1625.  
  1626. &lt;p&gt;The most recent HackWeek was not an exception. That’s why I’m writing this article to introduce you to our latest plugin: &lt;a href=&quot;https://hex.pm/packages/rebar3_depup&quot;&gt;&lt;em&gt;rebar3_depup&lt;/em&gt;&lt;/a&gt;.&lt;/p&gt;
  1627.  
  1628. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;10 minute read&lt;/code&gt;&lt;/p&gt;
  1629.  
  1630. &lt;hr /&gt;
  1631.  
  1632. &lt;center&gt;
  1633.    &lt;img alt=&quot;Cinderella&quot; src=&quot;/images/post_images/cinderella.gif&quot; /&gt;&lt;br /&gt;
  1634.    &lt;i&gt;Make your projects magically shine!&lt;/i&gt;
  1635. &lt;/center&gt;
  1636.  
  1637. &lt;h2 id=&quot;tldr&quot;&gt;TL;DR&lt;/h2&gt;
  1638. &lt;p&gt;In the same way as any other &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3&lt;/code&gt; plugin, you can skip this whole article, add &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;{plugins, [rebar3_depup]}.&lt;/code&gt; to your &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar.config&lt;/code&gt; and run the following command:&lt;/p&gt;
  1639.  
  1640. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;rebar3 &lt;span class=&quot;nb&quot;&gt;help &lt;/span&gt;update-deps
  1641. A rebar plugin to update dependencies
  1642. Usage: rebar3 update-deps &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nt&quot;&gt;-r&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;&amp;lt;replace&amp;gt;]] &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nt&quot;&gt;-c&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;&amp;lt;rebar_config&amp;gt;]]
  1643.                          &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nt&quot;&gt;-a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;&amp;lt;update_approx&amp;gt;]] &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nt&quot;&gt;-d&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;&amp;lt;just_deps&amp;gt;]]
  1644.                          &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nt&quot;&gt;-p&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;&amp;lt;just_plugins&amp;gt;]] &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nt&quot;&gt;-h&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;&amp;lt;just_hex&amp;gt;]]
  1645.  
  1646.  &lt;span class=&quot;nt&quot;&gt;-r&lt;/span&gt;, &lt;span class=&quot;nt&quot;&gt;--replace&lt;/span&gt;        Directly replace values &lt;span class=&quot;k&quot;&gt;in &lt;/span&gt;rebar.config. The
  1647.                       default is to just show you what deps can be
  1648.                       updated because this is an experimental feature and
  1649.                       using it can mess up your formatting and comments.
  1650.                       &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;default: &lt;span class=&quot;nb&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;
  1651.  &lt;span class=&quot;nt&quot;&gt;-c&lt;/span&gt;, &lt;span class=&quot;nt&quot;&gt;--rebar-config&lt;/span&gt;   File to analyze &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;default: rebar.config]
  1652.  &lt;span class=&quot;nt&quot;&gt;-a&lt;/span&gt;, &lt;span class=&quot;nt&quot;&gt;--update-approx&lt;/span&gt;  Update requirements starting with &lt;span class=&quot;s1&quot;&gt;&apos;~&amp;gt;&apos;&lt;/span&gt; as well as
  1653.                       the ones with a specific version. &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;default: &lt;span class=&quot;nb&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;
  1654.  &lt;span class=&quot;nt&quot;&gt;-d&lt;/span&gt;, &lt;span class=&quot;nt&quot;&gt;--just-deps&lt;/span&gt;      Only update deps &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;i.e. ignore plugins and
  1655.                       project_plugins&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;default: &lt;span class=&quot;nb&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;
  1656.  &lt;span class=&quot;nt&quot;&gt;-p&lt;/span&gt;, &lt;span class=&quot;nt&quot;&gt;--just-plugins&lt;/span&gt;   Only update plugins and project_plugins &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;i.e.
  1657.                       ignore deps&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;default: &lt;span class=&quot;nb&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;
  1658.  &lt;span class=&quot;nt&quot;&gt;-h&lt;/span&gt;, &lt;span class=&quot;nt&quot;&gt;--just-hex&lt;/span&gt;       Only update hex packages, ignore git repos.
  1659.                       &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;default: &lt;span class=&quot;nb&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1660.  
  1661. &lt;p&gt;Or you can be brave and just run &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3 update-deps&lt;/code&gt; (or even &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3 update-deps --replace&lt;/code&gt; 😱). That will produce an output like the one below, listing all the dependencies you can update in your &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar.config&lt;/code&gt; to get them to their latest versions.&lt;/p&gt;
  1662.  
  1663. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;…
  1664. &lt;span class=&quot;o&quot;&gt;===&amp;gt;&lt;/span&gt; rebar3_lint can be updated from 0.4.0 to 0.5.0
  1665. &lt;span class=&quot;o&quot;&gt;===&amp;gt;&lt;/span&gt; rebar3_hank can be updated from 1.1.2 to 1.1.4
  1666. &lt;span class=&quot;o&quot;&gt;===&amp;gt;&lt;/span&gt; rebar3_gpb_plugin can be updated from 2.21.0 to 2.22.0
  1667. …&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1668.  
  1669. &lt;p&gt;And that’s it! That’s our new plugin. It tells you what dependencies in your project have new versions available either on &lt;a href=&quot;https://hex.pm&quot;&gt;hex.pm&lt;/a&gt; or &lt;a href=&quot;https://github.com&quot;&gt;Github&lt;/a&gt;, so you can always stay up-to-date effortlessly.&lt;/p&gt;
  1670.  
  1671. &lt;hr /&gt;
  1672.  
  1673. &lt;p&gt;If you want to know more, keep reading. In the following sections, I’ll tell you:&lt;/p&gt;
  1674. &lt;ul&gt;
  1675.  &lt;li&gt;The background story behind it (i.e., what itch it scratches for us).&lt;/li&gt;
  1676.  &lt;li&gt;The things that it can and cannot do, since it’s not perfect yet.&lt;/li&gt;
  1677.  &lt;li&gt;How we automated its usage for all of our projects.&lt;/li&gt;
  1678. &lt;/ul&gt;
  1679.  
  1680. &lt;p&gt;Let’s dive into it, shall we?&lt;/p&gt;
  1681.  
  1682. &lt;h2 id=&quot;background-story&quot;&gt;Background Story&lt;/h2&gt;
  1683. &lt;p&gt;Within NextRoll, the RTB and Supply teams are responsible for maintaining more than 30 Erlang repositories in our organization’s &lt;a href=&quot;https://github.com/AdRoll&quot;&gt;Github account&lt;/a&gt;. Some of them are open-sourced, some are private. Most of them depend on one another. They form a quite complex multi-leveled dependency tree.&lt;/p&gt;
  1684.  
  1685. &lt;p&gt;These repositories have somewhat strict CI pipelines to check pull requests, and we want those pipelines to be &lt;em&gt;predictable&lt;/em&gt;. That’s why, for our dependencies and even for project plugins, we use specific versioning like &lt;a href=&quot;https://github.com/AdRoll/mero/blob/8c526966c487bdd0b5d04d69092e4315f666692e/rebar.config#L43-L47&quot;&gt;this one&lt;/a&gt;:&lt;/p&gt;
  1686.  
  1687. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;deps&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dynamic_compile&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;1.0.0&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]}.&lt;/span&gt;
  1688. &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
  1689. &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;project_plugins&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  1690. &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rebar3_hex&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;~&amp;gt; 6.11.6&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  1691.  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rebar3_format&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;~&amp;gt; 1.0.1&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  1692.  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rebar3_lint&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;~&amp;gt; 0.5.0&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  1693.  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rebar3_hank&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;~&amp;gt; 1.1.4&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]}.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1694.  
  1695. &lt;p&gt;That guarantees that a new version of any of those plugins won’t introduce new warnings/reports in any pull request, unless the PR also updates the plugin version.&lt;/p&gt;
  1696.  
  1697. &lt;p&gt;But, on the other hand, we also &lt;em&gt;maintain&lt;/em&gt; most of those plugins ourselves. And we do release new versions of them fairly regularly.&lt;/p&gt;
  1698.  
  1699. &lt;p&gt;So, whenever we shipped, for instance, a new version of Hank, we needed to manually go through all of our 25 repositories and generate a PR with the updated version in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar.config&lt;/code&gt; file. More often than not, those PRs included no changes outside of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar.config&lt;/code&gt;, but they were necessary since we wanted to check &lt;em&gt;future&lt;/em&gt; PRs with the latest version of Hank.&lt;/p&gt;
  1700.  
  1701. &lt;p&gt;That process usually took us approximately &lt;strong&gt;6 hours&lt;/strong&gt;, it happened typically once a week, and it was a fully manual (i.e., error-prone) task.&lt;/p&gt;
  1702.  
  1703. &lt;p&gt;That’s why we decided to automate it as much as we could. Our very ambitious original goal was to build something along the lines of Github’s &lt;a href=&quot;https://dependabot.com/&quot;&gt;Dependabot&lt;/a&gt; for Erlang. It’s fair to say that we’re still far away from that, but we’re certainly moving in that direction.&lt;/p&gt;
  1704.  
  1705. &lt;h2 id=&quot;caveats-and-limitations&quot;&gt;Caveats and Limitations&lt;/h2&gt;
  1706.  
  1707. &lt;p&gt;Initially, we split the problem into two pieces:&lt;/p&gt;
  1708. &lt;ol&gt;
  1709.  &lt;li&gt;A &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3&lt;/code&gt; plugin to detect (and possibly update) dependencies in Erlang projects.&lt;/li&gt;
  1710.  &lt;li&gt;An automatic pipeline that would run that plugin in all of our repositories periodically.&lt;/li&gt;
  1711. &lt;/ol&gt;
  1712.  
  1713. &lt;p&gt;The first item is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3_depup&lt;/code&gt;, and the second one is, for now, just a &lt;a href=&quot;https://buildkite.com/&quot;&gt;Buildkite&lt;/a&gt; pipeline (it’s a &lt;strong&gt;Hack&lt;/strong&gt; week, after all). The main idea is that eventually, using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3_depup&lt;/code&gt;, somebody can extend Github Dependabot to review Erlang projects, too. At that point, we will be able to throw away our poor man’s version of the bot with no regrets.&lt;/p&gt;
  1714.  
  1715. &lt;p&gt;The plugin idea didn’t seem too complex at first glance. In a nutshell, its logic boils down to:&lt;/p&gt;
  1716. &lt;ol&gt;
  1717.  &lt;li&gt;Read &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar.config&lt;/code&gt; using Erlang’s &lt;a href=&quot;https://erlang.org/doc/man/file.html#consult-1&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;file:consult/1&lt;/code&gt;&lt;/a&gt;.&lt;/li&gt;
  1718.  &lt;li&gt;Check for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;deps&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;plugins&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;project_plugins&lt;/code&gt; lists.&lt;/li&gt;
  1719.  &lt;li&gt;For each of their elements, verify if there is a newer version of the package/library.&lt;/li&gt;
  1720.  &lt;li&gt;Either report the list of new versions or update &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar.config&lt;/code&gt; with them.&lt;/li&gt;
  1721. &lt;/ol&gt;
  1722.  
  1723. &lt;p&gt;But, as we found out multiple times while working on the formatter, Elvis and Hank, parsing and particularly modifying Erlang files automatically is never &lt;em&gt;as easy as it seems&lt;/em&gt;.&lt;/p&gt;
  1724.  
  1725. &lt;h3 id=&quot;parsing-and-formatting-config-files&quot;&gt;Parsing and Formatting Config Files&lt;/h3&gt;
  1726. &lt;p&gt;Being able to read config files using just &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;file:consult/1&lt;/code&gt; is one of my personal favorite magic tricks from Erlang/OTP. I still remember my days as a C# engineer parsing extremely convoluted &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ini&lt;/code&gt; files with great sadness.
  1727. With just a tiny extra bit of complexity, you can also easily print out a config file full of Erlang terms to use for configuration:&lt;/p&gt;
  1728.  
  1729. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;c&quot;&gt;% Read the file
  1730. &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Sections&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;consult&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;rebar.config&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  1731.  
  1732. &lt;span class=&quot;c&quot;&gt;% Process it
  1733. &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;NewSections&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;do&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;your_magic&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Sections&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  1734.  
  1735. &lt;span class=&quot;c&quot;&gt;% The expected format includes a row for each section
  1736. &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Format&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;lists&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;flatmap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;fun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(_)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~p&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;NewSections&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  1737.  
  1738. &lt;span class=&quot;c&quot;&gt;% Print it out
  1739. &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;write_file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;rebar.config&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;io_lib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Format&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;NewSections&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)).&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1740.  
  1741. &lt;p&gt;That’s &lt;em&gt;good&lt;/em&gt;, but it has two limitations that would become obvious the first time you try to test that procedure:&lt;/p&gt;
  1742. &lt;ol&gt;
  1743.  &lt;li&gt;It doesn’t preserve anything beyond Erlang terms. Therefore, if you had comments in your config files… They’ll be gone.&lt;/li&gt;
  1744.  &lt;li&gt;It doesn’t preserve any formatting decisions. It prints out the new &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar.config&lt;/code&gt; file in the standard way that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;io_lib&lt;/code&gt; formats terms, which is typically not the nicest to read.&lt;/li&gt;
  1745. &lt;/ol&gt;
  1746.  
  1747. &lt;p&gt;To alleviate the first problem, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3_depup&lt;/code&gt; uses OTP’s &lt;a href=&quot;https://erlang.org/doc/man/erl_comment_scan.html&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;erl_comment_scan&lt;/code&gt;&lt;/a&gt; and &lt;a href=&quot;https://erlang.org/doc/man/erl_recomment.html&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;erl_recomment&lt;/code&gt;&lt;/a&gt;, but those tools are far from perfect. That’s why &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3_depup&lt;/code&gt; will emit some warnings if comments are removed or misplaced.&lt;/p&gt;
  1748.  
  1749. &lt;p&gt;For the second issue, well… We decided to ignore it entirely. We considered that every well-maintained Erlang project these days will be using a formatter, and therefore, developers can always run…&lt;/p&gt;
  1750.  
  1751. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;rebar3 &lt;span class=&quot;k&quot;&gt;do &lt;/span&gt;update-deps &lt;span class=&quot;nt&quot;&gt;--replace&lt;/span&gt;, format&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1752.  
  1753. &lt;h3 id=&quot;semantic-versioning&quot;&gt;Semantic Versioning&lt;/h3&gt;
  1754. &lt;p&gt;The next issue we faced was figuring out what it actually means to have a &lt;strong&gt;newer&lt;/strong&gt; version of a library. There are multiple ways to define &lt;em&gt;newer&lt;/em&gt; in this context. For instance, we could have checked if there was a &lt;em&gt;more recently&lt;/em&gt; published version of the package in hex.pm or a &lt;em&gt;more recently&lt;/em&gt; pushed tag for the repository in GitHub. But, in hopes of achieving more accuracy, we decided to trust &lt;a href=&quot;https://semver.org/&quot;&gt;SemVer&lt;/a&gt; instead.&lt;/p&gt;
  1755.  
  1756. &lt;p&gt;For that, we used &lt;a href=&quot;https://github.com/starbelly&quot;&gt;Bryan Paxton&lt;/a&gt;’s &lt;a href=&quot;https://hex.pm/packages/verl&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;verl&lt;/code&gt;&lt;/a&gt;, a simple library that can be used to compare two SemVer-compatible versions and, among other things, tell you whether one is greater than the other. That’s what &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3_depup&lt;/code&gt; considers a &lt;strong&gt;newer&lt;/strong&gt; version of a library: one that compares as greater under SemVer.&lt;/p&gt;
  1757.  
  1758. &lt;p&gt;That, in turn, means that the updater can only be used with properly versioned hex.pm packages or GitHub tags. Since hex.pm also enforces semantic versioning, we think that this should be enough for most well-maintained projects, anyway.&lt;/p&gt;
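&lt;p&gt;To make the idea of a &lt;em&gt;greater&lt;/em&gt; version concrete, here is a minimal sketch in Python. It is not how &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;verl&lt;/code&gt; is implemented, and it assumes plain MAJOR.MINOR.PATCH strings with no pre-release or build metadata. The function names are hypothetical and chosen only for illustration.&lt;/p&gt;

```python
def parse_version(text):
    """Split 'MAJOR.MINOR.PATCH' into a tuple of ints.

    Simplifying assumption: no pre-release or build metadata is present.
    """
    major, minor, patch = text.split(".")
    return int(major), int(minor), int(patch)


def is_newer(candidate, current):
    """Return True when candidate is a strictly greater version than current."""
    return parse_version(candidate) > parse_version(current)


def latest(versions):
    """Pick the greatest version out of a list of version strings."""
    return max(versions, key=parse_version)


print(is_newer("1.10.0", "1.9.3"))  # True: components compare as numbers, not text
print(is_newer("2.0.0", "2.0.0"))   # False: an equal version is not newer
print(latest(["0.9.1", "0.10.0", "0.2.7"]))  # 0.10.0
```

&lt;p&gt;Comparing the parsed tuples rather than the raw strings matters: as text, 1.10.0 would sort before 1.9.3.&lt;/p&gt;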
  1759.  
  1760. &lt;h2 id=&quot;automation&quot;&gt;Automation&lt;/h2&gt;
  1761. &lt;p&gt;So… How do we &lt;em&gt;use&lt;/em&gt; this new tool here? And how can you use it in your own company?&lt;/p&gt;
  1762.  
  1763. &lt;p&gt;We use Buildkite pipelines for many automation tasks, and therefore writing a recurring pipeline to run the updater on all of our Erlang repos was the &lt;em&gt;natural&lt;/em&gt; (i.e., &lt;em&gt;easier&lt;/em&gt;) thing to do in this case. This is the main part of our pipeline now…&lt;/p&gt;
  1764.  
  1765. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-yaml&quot; data-lang=&quot;yaml&quot;&gt;&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;label&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;Check&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;Dependencies&apos;&lt;/span&gt;
  1766.  &lt;span class=&quot;na&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;check&quot;&lt;/span&gt;
  1767.  &lt;span class=&quot;na&quot;&gt;command&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  1768.    &lt;span class=&quot;c1&quot;&gt;# Move to a well-known path that has a rebar.config file&lt;/span&gt;
  1769.    &lt;span class=&quot;c1&quot;&gt;# with the updater in project_plugins&lt;/span&gt;
  1770.    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;pushd .buildkite/support&lt;/span&gt;
  1771.    &lt;span class=&quot;c1&quot;&gt;# Download rebar3&lt;/span&gt;
  1772.    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;curl https://rebar3.s3.amazonaws.com/rebar3 -o ./rebar3&lt;/span&gt;
  1773.    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;chmod +x rebar3&lt;/span&gt;
  1774.    &lt;span class=&quot;c1&quot;&gt;# git clone all the repositories we want to check&lt;/span&gt;
  1775.    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;./clone-repos.sh&lt;/span&gt;
  1776.    &lt;span class=&quot;c1&quot;&gt;# Run the updater in each one of them and push the changes to github&lt;/span&gt;
  1777.    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;for REPO in $$(ls repos); do ./update-repo.sh $${REPO}; done&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1778.  
  1779. &lt;p&gt;This is the core of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;update-repo.sh&lt;/code&gt;:&lt;/p&gt;
  1780.  
  1781. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;c&quot;&gt;#!/bin/bash&lt;/span&gt;
  1782.  
  1783. &lt;span class=&quot;nb&quot;&gt;set&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-eu&lt;/span&gt;
  1784.  
  1785. &lt;span class=&quot;c&quot;&gt;# Generate a unique branch name&lt;/span&gt;
  1786. &lt;span class=&quot;nv&quot;&gt;BRANCH_NAME&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;update-deps-&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;date&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;+%Y-%m-%d&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;
  1787.  
  1788. &lt;span class=&quot;c&quot;&gt;# Get into the project folder&lt;/span&gt;
  1789. &lt;span class=&quot;nb&quot;&gt;pushd &lt;/span&gt;repos/&lt;span class=&quot;nv&quot;&gt;$1&lt;/span&gt;
  1790.  
  1791. &lt;span class=&quot;c&quot;&gt;# Checkout the new branch from main&lt;/span&gt;
  1792. git checkout main
  1793. git checkout &lt;span class=&quot;nt&quot;&gt;-b&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;BRANCH_NAME&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;
  1794.  
  1795. &lt;span class=&quot;c&quot;&gt;# Go back to the root folder to use the main rebar.config file&lt;/span&gt;
  1796. &lt;span class=&quot;nb&quot;&gt;popd&lt;/span&gt;
  1797.  
  1798. &lt;span class=&quot;c&quot;&gt;# Update deps in the project&apos;s rebar.config file&lt;/span&gt;
  1799. ./rebar3 update-deps &lt;span class=&quot;nt&quot;&gt;--rebar-config&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;repos/&lt;span class=&quot;nv&quot;&gt;$1&lt;/span&gt;/rebar.config &lt;span class=&quot;nt&quot;&gt;--replace&lt;/span&gt;
  1800.  
  1801. &lt;span class=&quot;c&quot;&gt;# Get into the project folder again&lt;/span&gt;
  1802. &lt;span class=&quot;nb&quot;&gt;pushd &lt;/span&gt;repos/&lt;span class=&quot;nv&quot;&gt;$1&lt;/span&gt;
  1803.  
  1804. &lt;span class=&quot;c&quot;&gt;# Run the formatter, if possible&lt;/span&gt;
  1805. ../../rebar3 format &lt;span class=&quot;o&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;No formatters here :(&quot;&lt;/span&gt;
  1806.  
  1807. &lt;span class=&quot;c&quot;&gt;# If there are changes, generate a new pull request with them&lt;/span&gt;
  1808. git diff &lt;span class=&quot;nt&quot;&gt;--exit-code&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;||&lt;/span&gt; ../../push-changes.sh &lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;BRANCH_NAME&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;
  1809.  
  1810. &lt;span class=&quot;c&quot;&gt;# Go back to the root folder again&lt;/span&gt;
  1811. &lt;span class=&quot;nb&quot;&gt;popd&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1812.  
  1813. &lt;p&gt;And, as you might have guessed, this is the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar.config&lt;/code&gt; file in the root folder:&lt;/p&gt;
  1814.  
  1815. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;plugins&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rebar3_depup&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]}.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1816.  
  1817. &lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
  1818. &lt;p&gt;With this new tool and our very simple CI pipeline, we turned what used to be a &lt;strong&gt;6-hour&lt;/strong&gt; weekly manual task into a &lt;strong&gt;15-minute&lt;/strong&gt; weekly pull request review and merge process. To be clear: the review and merge process was already part of those 6 hours before, too. But it was not just 15 minutes: since the process was manual, the review had to be more thorough and careful back then.&lt;/p&gt;
  1819.  
  1820. &lt;p&gt;Of course, &lt;em&gt;blindly&lt;/em&gt; updating libraries and plugins may break stuff. That’s why we don’t &lt;em&gt;just merge&lt;/em&gt; these updates. Instead, we generate pull requests that are reviewed by our CI pipelines and our developers as well. To be fair, that was exactly how we did this before the automation, so that part didn’t change. Only the review and pull request generation was automated. Which is great, because that’s what you can also automate in your own projects.&lt;/p&gt;
  1821.  
  1822. &lt;p&gt;And then… what do we do with the extra hours that we gained? Well… we use them to write blog posts like this one or create new &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3&lt;/code&gt; plugins, of course!&lt;/p&gt;
  1823.  
  1824. &lt;center&gt;
  1825.    &lt;a href=&quot;https://xkcd.com/1319/&quot;&gt;
  1826.        &lt;img alt=&quot;XKCD&quot; style=&quot;width: 75%&quot; src=&quot;/images/post_images/xkcd-automation.png&quot; /&gt;
  1827.        &lt;br /&gt;
  1828.        &lt;i&gt;XKCD knows what we&apos;re talking about, as usual&lt;/i&gt;
  1829.    &lt;/a&gt;
  1830. &lt;/center&gt;
  1831.  
  1832. &lt;h2 id=&quot;contributing&quot;&gt;Contributing&lt;/h2&gt;
  1833. &lt;p&gt;As usual, this project is publicly available on GitHub. Please try it out on your projects and let us know about any &lt;a href=&quot;https://github.com/AdRoll/rebar3_depup/issues&quot;&gt;issues&lt;/a&gt; you find. We also gladly accept pull requests from the community :)&lt;/p&gt;
  1834. </description>
  1835.    </item>
  1836.    
  1837.    
  1838.    
  1839.    <item>
  1840.      <title>MaskedLARK Explainer</title>
  1841.      <link>https://tech.nextroll.com/blog/data-science/2021/08/03/masked-lark.html</link>
  1842.      <pubDate>Tue, 03 Aug 2021 00:00:00 -0700</pubDate>
  1843.      <author></author>
  1844.      <guid isPermaLink="false">https://tech.nextroll.com/blog/data-science/2021/08/03/masked-lark</guid>
  1845.      <description>&lt;h1 id=&quot;background&quot;&gt;Background&lt;/h1&gt;
  1846.  
  1847. &lt;p&gt;&lt;a href=&quot;https://github.com/WICG/privacy-preserving-ads/blob/main/MaskedLARK.md&quot;&gt;MaskedLARK&lt;/a&gt; is the latest in a number of proposals for Google’s &lt;a href=&quot;https://www.chromium.org/Home/chromium-privacy/privacy-sandbox&quot;&gt;Privacy Sandbox&lt;/a&gt; initiative. This initiative aims to increase user privacy in the browser by removing third-party cookies while still preserving the ad-tech ecosystem that has evolved on top of those cookies.&lt;/p&gt;
  1848.  
  1849. &lt;p&gt;Machine learning models, which determine how valuable it is to show a given ad to a given user, are a key part of the current ecosystem: third-party cookies are used to track to whom ads are shown and whether these ads were effective. This data is then used to train machine learning models to better allocate future advertising dollars.&lt;/p&gt;
  1850.  
  1851. &lt;p&gt;Privacy advocates don’t want private companies to be able to track the individual history of users, while ad-tech companies want to efficiently spend advertising dollars with effective machine learning models. MaskedLARK provides a compromise: The machine learning models can be trained on individual user data, but neither the ad-tech companies nor any other third party will have access to this raw data. This post will go over the details using one of our NextRoll machine learning models as an example.&lt;/p&gt;
  1852.  
  1853. &lt;h1 id=&quot;the-problem-and-solution&quot;&gt;The Problem and Solution&lt;/h1&gt;
  1854.  
  1855. &lt;p&gt;The gold standard of ad-tech metrics today is Click-Through-Conversions. A Click-Through-Conversion occurs when a user clicks on an ad and then goes on to “convert” (purchase from the advertiser) within a given time span (usually 30 days). This metric is popular with advertisers as the conversion represents an actual sale while the click shows that the ad influenced the user and led to the conversion. In fact, the Click-Through-Conversion metric is so popular that most browsers promise to continue supporting it even in the absence of third-party cookies.
  1856. To optimize Click-Through-Conversions, at NextRoll we train a model for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;P(Convert | Click)&lt;/code&gt;, the probability that a user converts given that they clicked on an ad.&lt;/p&gt;
  1857.  
  1858. &lt;p&gt;The training data for this model consists of rows where each row represents a click and has a positive or negative label depending on whether that user went on to convert (Our actual model is a bit more complicated and uses &lt;a href=&quot;https://en.wikipedia.org/wiki/Survival_analysis&quot;&gt;survival models&lt;/a&gt; but I’ll ignore that in this post for simplicity).&lt;/p&gt;
  1859.  
  1860. &lt;p&gt;When a user clicks an ad we log a “click event” and when a user converts with one of our advertisers we log a “conversion event”. Since both of these events have access to and log the third-party cookie, we can join the click and conversion events on that cookie to build a timeline of the user’s ad interactions and purchases, from which we derive our training data:&lt;/p&gt;
  1861.  
  1862. &lt;center&gt;
  1863.  &lt;img alt=&quot;User Browsing Timeline&quot; src=&quot;/images/post_images/user_timeline.png&quot; /&gt;
  1864. &lt;/center&gt;
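&lt;p&gt;The join described above can be sketched in a few lines of Python. This is a simplified illustration rather than our production pipeline: the event layout (pairs of cookie and timestamp), the function name and the sample data are all hypothetical, features are left out, and the 30-day window is the typical span mentioned earlier.&lt;/p&gt;

```python
from datetime import datetime, timedelta

CONVERSION_WINDOW = timedelta(days=30)  # assumed window, "usually 30 days"


def build_training_rows(clicks, conversions):
    """Label each click 1 if the same cookie converted within the window, else 0.

    clicks and conversions are lists of (cookie_id, timestamp) pairs; this
    layout is hypothetical and chosen only for illustration.
    """
    rows = []
    for cookie, clicked_at in clicks:
        converted = any(
            other == cookie
            and CONVERSION_WINDOW >= converted_at - clicked_at >= timedelta(0)
            for other, converted_at in conversions
        )
        rows.append((cookie, clicked_at, 1 if converted else 0))
    return rows


clicks = [
    ("cookie-a", datetime(2021, 6, 1)),
    ("cookie-b", datetime(2021, 6, 2)),
]
conversions = [("cookie-a", datetime(2021, 6, 10))]

labels = [label for _, _, label in build_training_rows(clicks, conversions)]
print(labels)  # [1, 0]
```

&lt;p&gt;Each output row corresponds to one click, which matches the shape of the training data described above: one row per click, with a positive or negative label.&lt;/p&gt;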
  1865.  
  1866. &lt;p&gt;The MaskedLARK proposal creates this same training set but keeps it within the browser to ensure privacy. Then “helpers” (trusted third parties) perform the computations needed to update our model. We are able to train the model we need while never having access to the raw user data!&lt;/p&gt;
  1867.  
  1868. &lt;p&gt;But now won’t the helpers have access to the data? To solve this, MaskedLARK proposes a protocol using &lt;a href=&quot;https://medium.com/dropoutlabs/secret-sharing-explained-acf092660d97&quot;&gt;secret sharing masks&lt;/a&gt; where data is split among multiple helpers such that no individual helper can recover the raw data yet their combined computations still produce the desired model update. This will be explained in more detail below but at a high level the flow of MaskedLARK is as follows:&lt;/p&gt;
  1869. &lt;ol&gt;
  1870.  &lt;li&gt;We tell the browser which features &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;x&lt;/code&gt; we care about.&lt;/li&gt;
  1871.  &lt;li&gt;The browser keeps track of which ads the user clicks on and whether the user converts. It thereby creates the training data &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(x, y)&lt;/code&gt; for our model – one row for each click with features &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;x&lt;/code&gt; and label &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;y&lt;/code&gt; determined by whether the user goes on to convert or not.&lt;/li&gt;
  1872.  &lt;li&gt;Using secret sharing masks, the browser splits this training data into &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;H&lt;/code&gt; chunks, one for each of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;H&lt;/code&gt; helpers.&lt;/li&gt;
  1873.  &lt;li&gt;Each helper computes the model update over the training data &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(x, y)&lt;/code&gt; it receives.&lt;/li&gt;
  1874.  &lt;li&gt;We receive the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;H&lt;/code&gt; computations and combine them into a single computed model update with which we update our model.&lt;/li&gt;
  1875. &lt;/ol&gt;
  1876.  
  1877. &lt;h1 id=&quot;how-secret-sharing-masks-work&quot;&gt;How Secret Sharing Masks Work&lt;/h1&gt;
  1878.  
  1879. &lt;p&gt;Let’s take a brief detour to understand the idea behind secret sharing via masks.&lt;/p&gt;
  1880.  
  1881. &lt;p&gt;Suppose I have a secret number, say &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;17&lt;/code&gt;, and a collection of numbers which sum to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1&lt;/code&gt;, say &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-4&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;2&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;3&lt;/code&gt;. We will call these numbers “masks”. Suppose now I give each of three “helpers” the secret number multiplied by one of the masks. That is,&lt;/p&gt;
  1882. &lt;ul&gt;
  1883.  &lt;li&gt;Helper 1 receives &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;17 * (-4) = -68&lt;/code&gt;&lt;/li&gt;
  1884.  &lt;li&gt;Helper 2 receives &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;17 * 2 = 34&lt;/code&gt;&lt;/li&gt;
  1885.  &lt;li&gt;Helper 3 receives &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;17 * 3 = 51&lt;/code&gt;&lt;/li&gt;
  1886. &lt;/ul&gt;
  1887.  
  1888. &lt;p&gt;Then none of the three helpers knows my secret number and yet if these three numbers are added together they recover my secret number &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;17&lt;/code&gt;! Thus &lt;strong&gt;a collection of masks that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sum_to_1&lt;/code&gt; can be used to split a secret into pieces such that no party knows the secret and yet the secret can be recovered from the pieces&lt;/strong&gt;.&lt;/p&gt;
  1889.  
  1890. &lt;p&gt;Suppose now I take a different number, say &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;13&lt;/code&gt;, and a collection of “masks” which sum to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0&lt;/code&gt;, say &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-5&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;2&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;3&lt;/code&gt;. Again I give each helper the number multiplied by the corresponding mask. That is,&lt;/p&gt;
  1891. &lt;ul&gt;
  1892.  &lt;li&gt;Helper 1 receives &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;13 * (-5) = -65&lt;/code&gt;&lt;/li&gt;
  1893.  &lt;li&gt;Helper 2 receives &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;13 * 2 = 26&lt;/code&gt;&lt;/li&gt;
  1894.  &lt;li&gt;Helper 3 receives &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;13 * 3 = 39&lt;/code&gt;&lt;/li&gt;
  1895. &lt;/ul&gt;
  1896.  
  1897. &lt;p&gt;Now if we add these three numbers we get &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0&lt;/code&gt;. And so &lt;strong&gt;a collection of masks that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sum_to_0&lt;/code&gt; can be used to create data that vanishes when combined together&lt;/strong&gt;.&lt;/p&gt;
  1898.  
  1899. &lt;p&gt;Finally, we can use both these strategies to not only hide secret numbers but also how many secret numbers there even are: Suppose we have some secret numbers – say just one, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;17&lt;/code&gt;. We can create some fake numbers – say just one, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;13&lt;/code&gt; – and pass to each helper the masked numbers using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sum_to_1&lt;/code&gt; masks for the secret numbers and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sum_to_0&lt;/code&gt; masks for the fake numbers. Continuing the example above and shuffling the passed data we could have e.g. that&lt;/p&gt;
  1900. &lt;ul&gt;
  1901.  &lt;li&gt;Helper 1 receives &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[-68, -65]&lt;/code&gt;&lt;/li&gt;
  1902.  &lt;li&gt;Helper 2 receives &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[26, 34]&lt;/code&gt;&lt;/li&gt;
  1903.  &lt;li&gt;Helper 3 receives &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[51, 39]&lt;/code&gt;&lt;/li&gt;
  1904. &lt;/ul&gt;
  1905.  
  1906. &lt;p&gt;Now each helper doesn’t know how many real secret numbers there are or even whether each number they receive corresponds to a real or fake secret number. And yet when all of these numbers are added together we get &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-68 + (-65) + 26 + 34 + 51 + 39 = 17&lt;/code&gt;. That is, we are still able to recover the &lt;em&gt;sum&lt;/em&gt; of all the true secret numbers!&lt;/p&gt;
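&lt;p&gt;The arithmetic in this detour is easy to check with a short Python sketch. The helper functions below are hypothetical names used only for illustration, and the random mask generator is just one simple way to obtain integers with a given sum, not the scheme prescribed by the proposal.&lt;/p&gt;

```python
import random


def make_masks(total, count):
    """Return `count` random integers that add up to `total`."""
    masks = [random.randint(-10, 10) for _ in range(count - 1)]
    masks.append(total - sum(masks))
    return masks


def split(value, masks):
    """Give each helper the value multiplied by its mask."""
    return [value * mask for mask in masks]


# The numbers from the text: 17 is the real secret, 13 is the fake one.
real_shares = split(17, [-4, 2, 3])  # masks sum to 1
fake_shares = split(13, [-5, 2, 3])  # masks sum to 0

print(real_shares)                          # [-68, 34, 51]
print(fake_shares)                          # [-65, 26, 39]
print(sum(real_shares) + sum(fake_shares))  # 17

# The same holds for freshly generated masks.
print(sum(split(17, make_masks(1, 3))))  # 17
print(sum(split(13, make_masks(0, 3))))  # 0
```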
  1907.  
  1908. &lt;h1 id=&quot;back-to-maskedlark&quot;&gt;Back to MaskedLARK&lt;/h1&gt;
  1909.  
  1910. &lt;p&gt;With the idea of masks in mind, let’s return to MaskedLARK. After step 2, the browser has the true training data &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(x_i, y_i)&lt;/code&gt; and to update our model we need to compute&lt;/p&gt;
  1911.  
  1912. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;nb&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x_i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y_i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x_i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y_i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;true_data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1913.  
  1914. &lt;p&gt;where G is the gradient of the loss function for our model (cf. &lt;a href=&quot;https://en.wikipedia.org/wiki/Gradient_descent&quot;&gt;gradient descent&lt;/a&gt;).&lt;/p&gt;
  1915.  
  1916. &lt;p&gt;For each true data point &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(x_i, y_i)&lt;/code&gt;, the browser concocts a number of &lt;em&gt;fake&lt;/em&gt; data points. Then for each true data point, it computes a collection of masks that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sum_to_1&lt;/code&gt;, while for each fake data point, it computes a collection of masks that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sum_to_0&lt;/code&gt;. It then prepares for each helper a shuffled collection of data &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[(x_i, y_i, mask)]&lt;/code&gt; which is a mix of true and fake data points.&lt;/p&gt;
  1917.  
  1918. &lt;p&gt;Each helper &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;h&lt;/code&gt; receives its share of data and computes&lt;/p&gt;
  1919.  
  1920. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;nb&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mask&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x_i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y_i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x_i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y_i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mask&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  1921.  &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mixed_data_for_helper_h&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  1922.  
  1923. &lt;p&gt;Finally, we (the Ad Server) receive each of these &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;H&lt;/code&gt; computations and sum them to recover the full model update! This works because the fake data masked with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sum_to_0&lt;/code&gt; masks drops out while the true data masked with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sum_to_1&lt;/code&gt; masks combines back into the original data. In mathematical terms,&lt;/p&gt;
  1924.  
  1925. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;nb&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mask&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x_i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y_i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x_i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y_i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mask&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  1926.    &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mixed_data_for_helper_h&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;h&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;helpers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  1927.  &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x_i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y_i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x_i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y_i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;true_data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  1928.  &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;desired&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;gradient&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;computation&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
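&lt;p&gt;The identity above can be checked end to end with a toy example. The sketch below is only an illustration under stated assumptions: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;G&lt;/code&gt; is the gradient of a one-parameter squared-error model, the masks are small random integers so that the arithmetic is exact, and every function name and data point is hypothetical.&lt;/p&gt;

```python
import random

W = 1  # current model weight (toy one-parameter model, an assumption)


def G(x, y):
    """Toy gradient of the squared error (W * x - y) ** 2 with respect to W."""
    return 2 * (W * x - y) * x


def masks_summing_to(total, count):
    """Return `count` random integers that add up to `total`."""
    masks = [random.randint(-5, 5) for _ in range(count - 1)]
    return masks + [total - sum(masks)]


def prepare_helper_data(true_data, fake_data, helper_count):
    """Browser side: build one shuffled list of (x, y, mask) per helper."""
    per_helper = [[] for _ in range(helper_count)]
    # True points get masks that sum to 1, fake points masks that sum to 0.
    tagged = [(point, 1) for point in true_data] + [(point, 0) for point in fake_data]
    for (x, y), total in tagged:
        for rows, mask in zip(per_helper, masks_summing_to(total, helper_count)):
            rows.append((x, y, mask))
    for rows in per_helper:
        random.shuffle(rows)
    return per_helper


def helper_update(rows):
    """Helper side: masked sum of gradients over the rows it received."""
    return sum(mask * G(x, y) for x, y, mask in rows)


true_data = [(1, 0), (2, 1), (3, 1)]
fake_data = [(4, 0), (5, 1)]

shares = [helper_update(rows) for rows in prepare_helper_data(true_data, fake_data, 3)]
expected = sum(G(x, y) for x, y in true_data)
print(sum(shares) == expected)  # True
```

&lt;p&gt;Each helper sees the same points with different masks, so no single helper can tell which rows are real, yet the sum of the three partial results equals the gradient over the true data alone.&lt;/p&gt;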
  1929.  
  1930. &lt;p&gt;As the raw history of the cookie is only contained in the browser, and each helper only gets masked information (including fake data), neither the helpers nor the Ad Server (NextRoll) get access to the user’s private browsing history. And yet, we still get the full power of a machine learning model trained on individual browsing history to allocate future advertising spend.&lt;/p&gt;
  1931.  
  1932. &lt;h1 id=&quot;extensions&quot;&gt;Extensions&lt;/h1&gt;
  1933.  
  1934. &lt;p&gt;The preceding sections explain the core idea of MaskedLARK in the context of computing a gradient update for a machine learning model. But the same framework could be used to calculate any aggregated quantity over individual user browsing histories. For example, we could compute “the total predicted number of conversions in California” and compare that to “the actual number of conversions in California” to see if our machine learning models are well-calibrated for California. The only limitation is that the computed quantity must exhibit the desired interaction with the masks: fake data drops out, and the true data is recovered (this is expressed as a bilinearity condition in &lt;a href=&quot;https://github.com/WICG/privacy-preserving-ads/blob/main/MaskedLARK.md#framework&quot;&gt;this section&lt;/a&gt; of the official README).&lt;/p&gt;
  1935.  
  1936. &lt;p&gt;The MaskedLARK framework can also be augmented with additional privacy-preserving measures such as adding noise to the helpers’ aggregated quantities or using &lt;a href=&quot;https://en.wikipedia.org/wiki/K-anonymity&quot;&gt;k-anonymity&lt;/a&gt;.&lt;/p&gt;
  1937.  
  1938. &lt;h1 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h1&gt;
  1939.  
  1940. &lt;p&gt;NextRoll Engineering has been an active participant in the privacy sandbox meetings, which will transform the ad-tech ecosystem in the near future. By keeping at the forefront, we ensure that our machine learning models and systems will continue to be first in class, no matter which proposals are ultimately adopted. If working with large-scale real-time machine learning systems in an evolving environment catches your interest, be sure to check out our &lt;a href=&quot;https://www.nextroll.com/careers&quot;&gt;careers page&lt;/a&gt;.&lt;/p&gt;
  1941. </description>
  1942.    </item>
  1943.    
  1944.    
  1945.    
  1946.    <item>
  1947.      <title>UI smoke-testing with Cypress</title>
  1948.      <link>https://tech.nextroll.com/blog/dev/2021/05/11/frontend-smoke-testing-with-cypress.html</link>
  1949.      <pubDate>Tue, 11 May 2021 00:00:00 -0700</pubDate>
  1950.      <author></author>
  1951.      <guid isPermaLink="false">https://tech.nextroll.com/blog/dev/2021/05/11/frontend-smoke-testing-with-cypress</guid>
  1952.      <description>&lt;p&gt;At NextRoll, many teams are continuously working on different micro-frontend applications, the many little bricks that make up our dashboards and products.&lt;/p&gt;
  1953.  
  1954. &lt;p&gt;To help these teams build homogeneous [and awesome] interfaces, we maintain a library of UI components. You can read about it in this &lt;a href=&quot;/blog/frontend/2015/11/05/rollup-shared-ui-components.html&quot;&gt;blog post&lt;/a&gt;.&lt;/p&gt;
  1955.  
  1956. &lt;p&gt;On the wave of this philosophy of DRY and shared tools, we decided to build a tool to simplify smoke testing of our UIs.&lt;/p&gt;
  1957.  
  1958. &lt;p&gt;We wanted the tool to be portable, capable of running both on the developer machine and on our CI infrastructure (&lt;a href=&quot;https://buildkite.com/&quot;&gt;BuildKite&lt;/a&gt; and &lt;a href=&quot;https://www.jenkins.io/&quot;&gt;Jenkins2&lt;/a&gt;), and integrated with our incident escalation and monitoring systems.&lt;/p&gt;
  1959.  
  1960. &lt;h2 id=&quot;smoke-testing&quot;&gt;Smoke testing&lt;/h2&gt;
  1961.  
  1962. &lt;p&gt;According to Wikipedia:&lt;/p&gt;
  1963. &lt;blockquote&gt;
  1964.  &lt;p&gt;Smoke testing is preliminary testing to reveal simple failures severe enough to, for example, reject a prospective software release.&lt;/p&gt;
  1965. &lt;/blockquote&gt;
  1966.  
  1967. &lt;p&gt;First, let’s define the type of tests we wanted to run with this tool.&lt;/p&gt;
  1968.  
  1969. &lt;p&gt;Our goal was to run simple tests on our UI to detect significant issues that would disrupt the user experience on our platform.
  1970. Examples of these major issues can be an application unable to load or a business-critical flow not working (e.g., creating a shiny new campaign).&lt;/p&gt;
  1971.  
  1972. &lt;p&gt;We already have many different ways to ensure the stability of our services:&lt;/p&gt;
  1973. &lt;ul&gt;
  1974.  &lt;li&gt;Unit and functional tests of our code, even some complex end-to-end tests that we run on pull requests.&lt;/li&gt;
  1975.  &lt;li&gt;HTTP API testing to guarantee backward compatibility of our interfaces and monitor their availability.&lt;/li&gt;
  1976.  &lt;li&gt;Monitoring of different layers of our infrastructure, from the databases to the rate of errors of our Production API.&lt;/li&gt;
  1977. &lt;/ul&gt;
  1978.  
  1979. &lt;p&gt;All of these methods have always been fundamental for detecting production issues and preventing bad deploys. Still, most of them are responsible for testing or monitoring specific parts of the system without looking at the broader picture.&lt;/p&gt;
  1980.  
  1981. &lt;p&gt;One of the common issues with UIs is that they may be powered by many HTTP APIs, and this set of dependencies changes over time. Also, some specific sections of the same dashboard may rely on different APIs, or someone may enable A/B testing that introduces new dependencies.&lt;/p&gt;
  1982.  
  1983. &lt;p&gt;At a certain point, keeping track of the impact of your back-end systems’ deployments becomes impossible.&lt;/p&gt;
  1984.  
  1985. &lt;p&gt;Yeah, this should never happen due to the “agreements” that an HTTP API should maintain (no breaking changes), so your API testing should cover you. But having redundancy even on testing is still a good practice because you may not be able to catch all the issues with a single type of testing.&lt;/p&gt;
  1986.  
  1987. &lt;p&gt;Ultimately, fast and straightforward tests that detect both a bad deployment and a major outage of the UI would achieve two goals at once.&lt;/p&gt;
  1988.  
  1989. &lt;h2 id=&quot;finding-the-best-framework-for-our-needs&quot;&gt;Finding the best framework for our needs&lt;/h2&gt;
  1990.  
  1991. &lt;p&gt;To achieve our goal, we started by studying the best UI testing frameworks on the market to choose the perfect one for our needs.&lt;/p&gt;
  1992.  
  1993. &lt;p&gt;We wanted a framework that didn’t require learning new languages or a complex API, possibly based on technologies/libraries already known to our teams, to reduce the learning curve of adopting our tool.&lt;/p&gt;
  1994.  
  1995. &lt;p&gt;We also wanted a reliable framework that would let us implement stable tests, with the ability to visualize the test execution.&lt;/p&gt;
  1996.  
  1997. &lt;p&gt;Many of us already had experience with &lt;a href=&quot;https://www.selenium.dev/&quot;&gt;Selenium&lt;/a&gt; and knew the effort it takes to maintain it and keep tests stable. This time, we wanted stable tests!&lt;/p&gt;
  1998.  
  1999. &lt;p&gt;We also used services like &lt;a href=&quot;https://newrelic.com/lp/synthetic-monitoring&quot;&gt;New Relic Synthetics&lt;/a&gt;, and the DataDog alternative, for a while.
  2000. They seemed more stable than Selenium and had some excellent features, but, eventually, we were not happy writing our tests in their UIs without the ability to store them in our repositories.
  2001. This also required an extra effort during the deployments because we weren’t able to automatically update the tests, and we needed to jump on their UIs to address any change on our applications. After this experience, we decided that having the ability to adopt versioning of the code and automatically update our tests after each deployment were two key features for us to simplify the operations around the release of a new version of our product.&lt;/p&gt;
  2002.  
  2003. &lt;p&gt;After some experiments, we found our best candidate: Cypress!&lt;/p&gt;
  2004.  
  2005. &lt;h2 id=&quot;cypressio&quot;&gt;Cypress.io&lt;/h2&gt;
  2006.  
  2007. &lt;p&gt;Cypress is an open-source framework for end-to-end testing that you can find on &lt;a href=&quot;https://www.cypress.io/&quot;&gt;https://cypress.io&lt;/a&gt;.&lt;/p&gt;
  2008.  
  2009. &lt;p&gt;It has a brilliant architecture that does not rely on Selenium. Instead, Cypress is executed in the same run loop as your application, which gives the framework native access to the DOM, the window, and your application. This approach also lets you intercept and modify HTTP requests efficiently and connect Cypress directly to your Redux store.&lt;/p&gt;
  2010.  
  2011. &lt;p&gt;You can read more about how Cypress works &lt;a href=&quot;https://www.cypress.io/how-it-works&quot;&gt;on their website&lt;/a&gt;.&lt;/p&gt;
  2012.  
  2013. &lt;p&gt;Users can write tests in JavaScript or TypeScript, and Cypress already provides well-known testing libraries like &lt;a href=&quot;https://mochajs.org/&quot;&gt;Mocha&lt;/a&gt;, &lt;a href=&quot;https://www.chaijs.com/&quot;&gt;Chai&lt;/a&gt;, and &lt;a href=&quot;https://sinonjs.org/&quot;&gt;Sinon&lt;/a&gt;.&lt;/p&gt;
  2014.  
  2015. &lt;p&gt;It supports Mocha reporters, like &lt;a href=&quot;https://github.com/adamgruber/mochawesome&quot;&gt;Mochawesome&lt;/a&gt;:&lt;/p&gt;
  2016.  
  2017. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-json&quot; data-lang=&quot;json&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  2018.    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;reporter&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;mochawesome&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  2019.    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;reporterOptions&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  2020.        &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;reportDir&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;output/reports&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  2021.        &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;overwrite&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  2022.        &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;html&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  2023.        &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;json&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  2024.    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  2025. &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  2026.  
  2027. &lt;p&gt;&lt;em&gt;Part of our cypress.json&lt;/em&gt;&lt;/p&gt;
  2028.  
  2029. &lt;p&gt;It also records each test to a different mp4 file, so you can watch the test execution and spot any UI issue.&lt;/p&gt;
  2030.  
  2031. &lt;h2 id=&quot;writing-a-test&quot;&gt;Writing a test&lt;/h2&gt;
  2032.  
  2033. &lt;p&gt;The Cypress API is very simple; you don’t even need to look at the documentation to understand this test:&lt;/p&gt;
  2034.  
  2035. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-typescript&quot; data-lang=&quot;typescript&quot;&gt;&lt;span class=&quot;nx&quot;&gt;describe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;An example&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  2036.  &lt;span class=&quot;nx&quot;&gt;before&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  2037.    &lt;span class=&quot;c1&quot;&gt;// Here you can set up your tests.&lt;/span&gt;
  2038.    &lt;span class=&quot;c1&quot;&gt;// As an example, you could log in to your application.&lt;/span&gt;
  2039.    &lt;span class=&quot;nx&quot;&gt;prepareYourTest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  2040.  &lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;
  2041.  
  2042.  &lt;span class=&quot;nx&quot;&gt;it&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;Should load&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  2043.    &lt;span class=&quot;nx&quot;&gt;cy&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;visit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  2044.      &lt;span class=&quot;s2&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Cypress&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;HOST&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;/test-url`&lt;/span&gt;
  2045.    &lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  2046.    &lt;span class=&quot;c1&quot;&gt;// After .visit(), we want to check if the next page H1&lt;/span&gt;
  2047.    &lt;span class=&quot;c1&quot;&gt;// contains &quot;Hi!&quot;.&lt;/span&gt;
  2048.    &lt;span class=&quot;c1&quot;&gt;// As you can see, we don&apos;t need to wait for&lt;/span&gt;
  2049.    &lt;span class=&quot;c1&quot;&gt;// the page to be ready, this is on Cypress which&lt;/span&gt;
  2050.    &lt;span class=&quot;c1&quot;&gt;// will automatically wait for your H1 to be visible&lt;/span&gt;
  2051.    &lt;span class=&quot;c1&quot;&gt;// (or,  if your H1 doesn&apos;t appear, it will fail after a timeout).&lt;/span&gt;
  2052.    &lt;span class=&quot;nx&quot;&gt;cy&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;kd&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;h1&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;should&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;contain&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;Hi!&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  2053.    &lt;span class=&quot;c1&quot;&gt;// Let&apos;s also confirm that we are on the right URL.&lt;/span&gt;
  2054.    &lt;span class=&quot;nx&quot;&gt;cy&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;url&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;should&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;include&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;test-url&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  2055.  
  2056.    &lt;span class=&quot;c1&quot;&gt;// Now, let&apos;s find an entry on our navbar,&lt;/span&gt;
  2057.    &lt;span class=&quot;c1&quot;&gt;// and let&apos;s click on it.&lt;/span&gt;
  2058.    &lt;span class=&quot;nx&quot;&gt;cy&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;kd&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;div.main-navbar&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  2059.      &lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;contains&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;Section 1&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  2060.      &lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;click&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  2061.    &lt;span class=&quot;c1&quot;&gt;// Here too, we don&apos;t need to write code to&lt;/span&gt;
  2062.    &lt;span class=&quot;c1&quot;&gt;// wait for our application to be&lt;/span&gt;
  2063.    &lt;span class=&quot;c1&quot;&gt;// ready: Cypress will take care of it.&lt;/span&gt;
  2064.    &lt;span class=&quot;nx&quot;&gt;cy&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;kd&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;div.main-page&amp;gt;h2&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;should&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  2065.      &lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;contain&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  2066.      &lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;It works!&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;
  2067.    &lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  2068.  &lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;
  2069. &lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  2070.  
  2071. &lt;p&gt;See? No “wait until visible” commands!&lt;/p&gt;
  2072.  
  2073. &lt;h2 id=&quot;distribute-to-our-teams&quot;&gt;Distribute to our teams&lt;/h2&gt;
  2074.  
  2075. &lt;p&gt;Finding the framework was not the end of our journey. We needed to make the setup as easy as possible and abstract some Cypress complexities.&lt;/p&gt;
  2076.  
  2077. &lt;h3 id=&quot;one-docker-image-to-rule-them-all&quot;&gt;One Docker image to rule them all&lt;/h3&gt;
  2078.  
  2079. &lt;p&gt;We decided that distributing our tool as a Docker image would simplify the maintenance and the adoption of the tool.&lt;/p&gt;
  2080.  
  2081. &lt;p&gt;Also, Cypress requires some setup and tuning, and we wanted to abstract this away from other teams.&lt;/p&gt;
  2082.  
  2083. &lt;p&gt;Shipping our tool as a Docker image also allows us to solve this complexity by encapsulating the Cypress instrumentation and post-processing logic. Engineers only need to provide the required config files and run the Docker command they find in the documentation.&lt;/p&gt;
  2084.  
  2085. &lt;p&gt;Our base image is built from one of the official Docker images for Cypress, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cypress/browsers:node12.19.0-chrome86-ff82&lt;/code&gt;. You can find the complete list of images they prepare here: &lt;a href=&quot;https://github.com/cypress-io/cypress-docker-images&quot;&gt;https://github.com/cypress-io/cypress-docker-images&lt;/a&gt;.&lt;/p&gt;
  2086.  
  2087. &lt;p&gt;Let’s have a look at the lifecycle of a test execution:&lt;/p&gt;
  2088. &lt;center&gt;
  2089.  &lt;img alt=&quot;Flow of the smoke testing tool&quot; src=&quot;/images/post_images/smoketester/smoketester_flow.png&quot; /&gt;
  2090. &lt;/center&gt;
  2091.  
  2092. &lt;p&gt;This approach requires teams to add a small amount of boilerplate to their application:&lt;/p&gt;
  2093. &lt;ul&gt;
  2094.  &lt;li&gt;A Dockerfile, to download our Docker image and load their tests into it;&lt;/li&gt;
  2095.  &lt;li&gt;Two docker-compose.yml files, one for the developer machine and another for the CI;
  2096.    &lt;ul&gt;
  2097.      &lt;li&gt;The main difference between these two files is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ipc: host&lt;/code&gt; on the developer configuration (and a volume so they can see the output of their tests).&lt;/li&gt;
  2098.    &lt;/ul&gt;
  2099.  &lt;/li&gt;
  2100.  &lt;li&gt;A JSON file representing the main configuration, which could be different on each environment to allow configuring different hosts and incident escalation policies (well, I don’t think that a failure on Staging would be worth paging an engineer during the night);&lt;/li&gt;
  2101.  &lt;li&gt;The test files.&lt;/li&gt;
  2102. &lt;/ul&gt;
  2103.  
  2104. &lt;p&gt;This allows us to store our tests on the repository and edit them using the editor we use every day. Our CI can automatically use the correct version of the tests without requiring us to perform manual adjustments after the deployment.&lt;/p&gt;
  2105.  
  2106. &lt;p&gt;Another benefit is that our Docker image is fully extensible, just by overriding the files contained in the base Docker image. This allows other teams to build their own Cypress tasks and fixtures, and to install more plugins.&lt;/p&gt;
  2107.  
  2108. &lt;h3 id=&quot;test-utilities&quot;&gt;Test utilities&lt;/h3&gt;
  2109.  
  2110. &lt;p&gt;Most of our UIs are accessible only after signing in to our application, so we built a simple utility that can be run from the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;before&lt;/code&gt; hook to set up the session by signing in with the user that is available in the Cypress environment.&lt;/p&gt;
  2111.  
  2112. &lt;p&gt;For teams that want to use TypeScript, we also published a little NPM package to enable type hinting for Cypress and these utilities in their editor.&lt;/p&gt;
  2113.  
  2114. &lt;h2 id=&quot;monitoring-and-incident-escalation&quot;&gt;Monitoring and incident escalation&lt;/h2&gt;
  2115.  
  2116. &lt;p&gt;A smoke testing tool that is not integrated with the services we use every day would have been pretty useless.&lt;/p&gt;
  2117.  
  2118. &lt;p&gt;For this reason, we enabled Mochawesome reporter support in Cypress, which creates a JSON file per test file.&lt;/p&gt;
  2119.  
  2120. &lt;p&gt;Then, we built a simple Python script to post-process these reports:&lt;/p&gt;
  2121. &lt;ol&gt;
  2122.  &lt;li&gt;It uploads all the artifacts (reports, MP4 videos, and screenshots) to secure storage;&lt;/li&gt;
  2123.  &lt;li&gt;It emits key metrics (test duration and result) to DataDog;&lt;/li&gt;
  2124.  &lt;li&gt;In case of failures on tests integrated with PagerDuty, it triggers an incident to escalate the issue quickly;&lt;/li&gt;
  2125.  &lt;li&gt;Optionally, it posts a message on Slack with a summary of the results:&lt;/li&gt;
  2126. &lt;/ol&gt;
  2127.  
  2128. &lt;center&gt;
  2129.  &lt;img alt=&quot;Example of a Slack notification&quot; src=&quot;/images/post_images/smoketester/smoketester_slack.png&quot; /&gt;
  2130. &lt;/center&gt;
  2131.  
  2132. &lt;h3 id=&quot;who-is-monitoring-the-smoke-tester&quot;&gt;Who is monitoring the smoke tester?&lt;/h3&gt;
  2133.  
  2134. &lt;p&gt;Having a smoke-tester running on our infrastructure means that we should monitor its stability, too, to ensure it is appropriately testing our applications.&lt;/p&gt;
  2135.  
  2136. &lt;p&gt;This was as easy as setting a &lt;a href=&quot;https://www.datadoghq.com/&quot;&gt;DataDog&lt;/a&gt; monitor that would alert if the smoke testing tool doesn’t emit metrics for a while.&lt;/p&gt;
  2137.  
  2138. &lt;h3 id=&quot;reducing-the-noise&quot;&gt;Reducing the noise&lt;/h3&gt;
  2139.  
  2140. &lt;p&gt;Nobody wants to wake up in the middle of the night because they have been paged for a false alarm. It’s a pain, which reduces the confidence in the smoke testing. &lt;br /&gt;
  2141. We mitigate this by applying retry logic: if the suite fails, we rerun it in full. In the future, we plan to retry just the tests that failed to reduce the notification delay in case of serious issues.&lt;/p&gt;
  2142.  
  2143. &lt;p&gt;The tool also takes care not to repeatedly send the same notifications if a specific test keeps failing. It’s capable of resolving the PagerDuty incident once the test succeeds again.&lt;/p&gt;
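&lt;p&gt;A minimal sketch of this de-duplication rule, assuming the tool only remembers whether each test was failing on the previous run. The function and action names are hypothetical and are not taken from the actual tool; a real integration would call PagerDuty where the sketch only returns a label.&lt;/p&gt;

```typescript
// Hypothetical action names; a real integration would call PagerDuty here.
type Action = "trigger" | "resolve" | "none";

// Compare the previous and current outcome of a single test.
function nextAction(wasFailing: boolean, isFailing: boolean): Action {
  if (wasFailing === isFailing) {
    return "none"; // nothing changed, so no repeated notification
  }
  return isFailing ? "trigger" : "resolve";
}

// Replay a sequence of runs: pass, fail, fail, pass.
const runs = [false, true, true, false]; // true means the test failed
let wasFailing = false;
const actions: Action[] = [];
for (const isFailing of runs) {
  actions.push(nextAction(wasFailing, isFailing));
  wasFailing = isFailing;
}
console.log(actions.join(", ")); // none, trigger, none, resolve
```

&lt;p&gt;Only the transitions produce an action: the second consecutive failure stays silent, and the first success after a failure resolves the incident.&lt;/p&gt;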
  2144.  
  2145. &lt;p&gt;Also, it’s not very nice when you’ve been paged for an issue that is not under your domain, and you just need to point to the right team. &lt;br /&gt;This can easily happen on UIs that have multiple ownerships.
  2146. In our case, a team is usually in charge of the application, and various teams own sub-sections or specific features. &lt;br /&gt;
  2147. Mitigating this was a little bit harder than implementing retry logic. &lt;br /&gt;
  2148. We chose to resolve this by implementing a hierarchy model that prevents incident escalation to the branches of a failed node.&lt;/p&gt;
  2149.  
  2150. &lt;h3 id=&quot;hierarchy-of-tests&quot;&gt;Hierarchy of tests&lt;/h3&gt;
  2151.  
  2152. &lt;p&gt;Let’s see an example:&lt;/p&gt;
  2153. &lt;center&gt;
  2154.  &lt;img alt=&quot;Hierarchy of the smoke testing tests&quot; src=&quot;/images/post_images/smoketester/smoketester_hierarchy_1.svg&quot; /&gt;
  2155. &lt;/center&gt;
  2156.  
  2157. &lt;p&gt;We have three teams:&lt;/p&gt;
  2158. &lt;ul&gt;
  2159.  &lt;li&gt;Team A is the team owning the application. They are responsible for any significant outage that causes the UI not to load at all due to, for instance, infrastructure issues.&lt;/li&gt;
  2160.  &lt;li&gt;Teams B and C own sub-sections of the UI, either critical (P0) or non-critical (P1 or lower) sections.&lt;/li&gt;
  2161. &lt;/ul&gt;
  2162.  
  2163. &lt;p&gt;Let’s assume the whole UI doesn’t load at all because the web service is down.
  2164. If the root fails, our tool doesn’t notify the other teams:&lt;/p&gt;
  2165.  
  2166. &lt;center&gt;
  2167.  &lt;img alt=&quot;Hierarchy of the smoke testing tests - failed test&quot; src=&quot;/images/post_images/smoketester/smoketester_hierarchy_2.svg&quot; /&gt;
  2168. &lt;/center&gt;
  2169.  
  2170. &lt;p&gt;Team A will be alerted of the issue, while Teams B and C will sleep peacefully because the failures on their tests were caused by the application outage.&lt;/p&gt;
  2171.  
  2172. &lt;p&gt;This concept of hierarchy allows us to define custom notification rules for each different team.&lt;/p&gt;
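&lt;p&gt;The suppression rule can be sketched as a short tree walk. This is an illustration under stated assumptions, not the tool’s real data model: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TestNode&lt;/code&gt; shape and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ownersToNotify&lt;/code&gt; function are hypothetical names.&lt;/p&gt;

```typescript
// Hypothetical shape: none of these names come from the actual tool.
interface TestNode {
  name: string;
  owner: string;    // team to notify when this node fails
  passed: boolean;  // outcome of this node's smoke test
  children: TestNode[];
}

// Walk the tree top-down and collect the owners to notify.
// Once a node fails, its whole subtree is skipped, so teams owning
// the branches below it are not alerted for a failure they did not cause.
function ownersToNotify(node: TestNode): string[] {
  if (!node.passed) {
    return [node.owner];
  }
  let owners: string[] = [];
  for (const child of node.children) {
    owners = owners.concat(ownersToNotify(child));
  }
  return owners;
}

// The outage scenario from the example: everything fails.
const root: TestNode = {
  name: "application loads",
  owner: "Team A",
  passed: false,
  children: [
    { name: "section B", owner: "Team B", passed: false, children: [] },
    { name: "section C", owner: "Team C", passed: false, children: [] },
  ],
};

console.log(ownersToNotify(root)); // only Team A is listed
```

&lt;p&gt;If the root passes and only a sub-section fails, the walk reaches that node and returns its owner instead, which is what lets each team keep its own notification rules.&lt;/p&gt;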
  2173.  
  2174. &lt;h2 id=&quot;case-study-smoke-testing-the-email-verification&quot;&gt;Case study: smoke testing the email verification&lt;/h2&gt;
  2175.  
  2176. &lt;p&gt;As we recently started to roll out an email verification process during the signup flow, we wanted to ensure its stability over time.&lt;/p&gt;
  2177.  
  2178. &lt;p&gt;To properly test this flow, we set up a smoke testing instance that’s able to connect to the email server to grab the verification token and complete the process. This was as easy as declaring a new &lt;a href=&quot;https://docs.cypress.io/api/commands/task&quot;&gt;Cypress task&lt;/a&gt; that relies on the NPM package &lt;a href=&quot;https://github.com/levz0r/gmail-tester&quot;&gt;gmail-tester&lt;/a&gt; to return the email messages from our Gmail-powered account.&lt;/p&gt;
  2179.  
  2180. &lt;p&gt;As this new flow has been implemented across various back-ends and UIs, this smoke testing tool is a nice solution for us to ensure that the interactions between these systems will continue to be stable over time.&lt;/p&gt;
  2181.  
  2182. &lt;h2 id=&quot;conclusions&quot;&gt;Conclusions&lt;/h2&gt;
  2183.  
  2184. &lt;p&gt;Almost one year has passed since the first release of this tool, and various teams are currently using it for their own applications.&lt;/p&gt;
  2185.  
  2186. &lt;h3 id=&quot;was-cypress-a-good-choice&quot;&gt;Was Cypress a good choice?&lt;/h3&gt;
  2187.  
  2188. &lt;p&gt;Yes, for many reasons:&lt;/p&gt;
  2189.  
  2190. &lt;ul&gt;
  2191.  &lt;li&gt;We managed to easily integrate it into our workflow.&lt;/li&gt;
  2192.  &lt;li&gt;Teams are adopting our tool with low friction, and some of them also made some contributions to add new Cypress plugins or features to it.&lt;/li&gt;
  2193.  &lt;li&gt;Overall, our simple tests look stable.&lt;/li&gt;
  2194. &lt;/ul&gt;
  2195.  
  2196. &lt;p&gt;Also, the ability to easily see the test executions by looking at the recordings is a key feature for us, and it actually helped us during nightly incidents.&lt;/p&gt;
  2197.  
  2198. &lt;h3 id=&quot;would-you-expand-the-scope-of-the-tool&quot;&gt;Would you expand the scope of the tool?&lt;/h3&gt;
  2199.  
  2200. &lt;p&gt;Right now, we are only using it for smoke testing, and we don’t plan to support other types of tests like E2E or acceptance testing.&lt;/p&gt;
  2201.  
  2202. &lt;p&gt;Even though Cypress is far above Selenium in terms of stability, it’s not so easy to define new tests for complex scenarios. So, following the &lt;a href=&quot;https://martinfowler.com/articles/practical-test-pyramid.html&quot;&gt;test pyramid&lt;/a&gt; and creating tests only for the critical UI paths is, in our experience, the best approach.&lt;/p&gt;
  2203.  
  2204. &lt;h2 id=&quot;next-steps&quot;&gt;Next steps&lt;/h2&gt;
  2205.  
  2206. &lt;p&gt;We always want to improve our tools, so we already have some ideas:&lt;/p&gt;
  2207.  
  2208. &lt;ul&gt;
  2209.  &lt;li&gt;Rework the retry logic (as mentioned before) to retry only the failed tests; right now, we retry the entire suite.&lt;/li&gt;
  2210.  &lt;li&gt;Play with the &lt;a href=&quot;https://developer.mozilla.org/en-US/docs/Web/API/Performance_API&quot;&gt;Performance API&lt;/a&gt; to extract more metrics regarding our UIs, like the Resource Timing, which allows us to retrieve network timing data regarding the loading of the application’s resources.&lt;/li&gt;
  2211. &lt;/ul&gt;
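&lt;p&gt;As a rough idea of what the Resource Timing data could feed into, here is a small sketch that ranks resources by load duration. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ResourceEntry&lt;/code&gt; interface and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;slowestResources&lt;/code&gt; helper are hypothetical, and the sample values are made up for illustration; in a browser the entries would come from the Performance API itself.&lt;/p&gt;

```typescript
// Minimal shape of the fields read below; browsers expose them on
// PerformanceResourceTiming entries.
interface ResourceEntry {
  name: string;      // URL of the resource
  startTime: number; // milliseconds since the start of navigation
  duration: number;  // milliseconds from start to the end of the response
}

// Return the slowest resources first, e.g. to emit them as metrics.
function slowestResources(entries: ResourceEntry[], top: number): ResourceEntry[] {
  return entries
    .slice() // copy, so the caller's array is not reordered
    .sort((a, b) => b.duration - a.duration)
    .slice(0, top);
}

// In a browser, the input would be:
//   performance.getEntriesByType("resource")
// The entries below are made-up sample data.
const sample: ResourceEntry[] = [
  { name: "/static/app.js", startTime: 12, duration: 340 },
  { name: "/static/app.css", startTime: 12, duration: 95 },
  { name: "/api/campaigns", startTime: 400, duration: 620 },
];

for (const entry of slowestResources(sample, 2)) {
  console.log(entry.name, entry.duration);
}
```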
  2212. </description>
  2213.    </item>
  2214.    
  2215.    
  2216.    
  2217.    <item>
  2218.      <title>The Winner&apos;s Curse</title>
  2219.      <link>https://tech.nextroll.com/blog/data-science/2021/03/30/winners-curse.html</link>
  2220.      <pubDate>Tue, 30 Mar 2021 00:00:00 -0700</pubDate>
  2221.      <author></author>
  2222.      <guid isPermaLink="false">https://tech.nextroll.com/blog/data-science/2021/03/30/winners-curse</guid>
  2223.      <description>&lt;h1 id=&quot;introduction&quot;&gt;Introduction&lt;/h1&gt;
  2224.  
  2225. &lt;p&gt;Anytime you have an auction, you have the potential for the
  2226. &lt;a href=&quot;https://en.wikipedia.org/wiki/Winner%27s_curse&quot;&gt;winner’s curse&lt;/a&gt;,
  2227. a simple, yet surprising, statistical phenomenon. Despite its
  2228. simplicity, the effect can have significant financial implications
  2229. for anyone participating in auctions.&lt;/p&gt;
  2230.  
  2231. &lt;p&gt;We care about the winner’s curse at NextRoll because we
  2232. participate in about 100,000 ad auctions per second. In this
  2233. post, we explain the curse and explore the way it manifests
  2234. itself in our ad-buying systems.&lt;/p&gt;
  2235.  
  2236. &lt;h1 id=&quot;understanding-the-curse&quot;&gt;Understanding the Curse&lt;/h1&gt;
  2237.  
  2238. &lt;p&gt;Let’s pretend Alice is auctioning off a jar containing exactly
  2239. $25 of spare change she has collected. \(N\) of Alice’s friends
  2240. participate in the auction, hoping to purchase Alice’s jar of
  2241. change. No one can open the jar before the auction, but each of
  2242. Alice’s friends can inspect the jar to estimate the total worth
  2243. of the coins contained within. After completing their valuations,
  2244. each of Alice’s friends submits a bid.&lt;/p&gt;
  2245.  
  2246. &lt;p&gt;Alice’s friends assume one of them can make a few dollars here.
  2247. However, it may be the case that one of them overestimates the
  2248. value of the coins in the jar. As a result, that person may bid
  2249. too much into Alice’s auction. If this happens, that person may,
  2250. depending on the auction dynamics and the other bids, end up
  2251. paying Alice more for the jar than the value of the coins
  2252. contained within.&lt;/p&gt;
  2253.  
  2254. &lt;p&gt;It turns out that the winner of Alice’s auction will end up
  2255. overpaying for the jar of change much more frequently than we
  2256. would naively expect; this is the winner’s curse. To see how
  2257. this works, let’s denote Alice’s \(i\)th friend’s valuation as
  2258. follows:
  2259. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  2260. %&amp;lt;![CDATA[
  2261. \begin{equation}
  2262. V_i = 25 + \epsilon_i
  2263. \end{equation}
  2264. %]]&amp;gt;
  2265. &lt;/script&gt;
  2266. where \(\epsilon_i\) is the error each friend makes in their
  2267. valuation. The highest valuation of the jar of coins among
  2268. Alice’s \(N\) friends will be:
  2269. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  2270. %&amp;lt;![CDATA[
  2271. \begin{equation}
  2272. \max(V_1, ... ,V_N) = 25 + \max(\epsilon_1, ... , \epsilon_N)
  2273. \end{equation}
  2274. %]]&amp;gt;
  2275. &lt;/script&gt;&lt;/p&gt;
  2276.  
  2277. &lt;p&gt;This highest valuation is very likely to be larger than $25. To
  2278. see why, let’s assume Alice’s friends make independent errors and are
  2279. equally likely to overestimate the value of the coins as to underestimate
  2280. it. In this case, for the highest valuation to be less than $25,
  2281. all \(N\) of Alice’s friends have to underestimate the value
  2282. of the jar of coins. This happens only \(\frac{1}{2^N}\) of the
  2283. time.&lt;/p&gt;
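&lt;p&gt;As a quick worked example, and assuming the friends’ errors are independent, the chance that the highest valuation exceeds $25 is the complement of the probability above:
&lt;script type=&quot;math/tex; mode=display&quot;&gt;
%&amp;lt;![CDATA[
\begin{equation}
P\left(\max(V_1, ... ,V_N) > 25\right) = 1 - \frac{1}{2^N}
\end{equation}
%]]&amp;gt;
&lt;/script&gt;
With \(N = 4\) friends this is \(\frac{15}{16} \approx 94\%\), and with \(N = 10\) it is \(\frac{1023}{1024} \approx 99.9\%\).&lt;/p&gt;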
  2284.  
  2285. &lt;p&gt;This discussion of the highest valuation is important because,
  2286. typically, the person with the highest valuation will win the
  2287. auction. Let’s call this person with the highest valuation, whom
  2288. we expect to win the auction, Bob. If Bob wins the jar of change
  2289. and tallies up the money, he is probably in for a nasty surprise.
  2290. Almost every time, the jar will contain less money than he
  2291. anticipated. Often, this effect is so pronounced that not only
  2292. will Bob have made less profit than he had anticipated, he
  2293. will have lost money! Just like that, Bob has fallen victim to
  2294. the curse!&lt;/p&gt;
  2295.  
  2296. &lt;h2 id=&quot;simulating-the-curse&quot;&gt;Simulating the Curse&lt;/h2&gt;
  2297.  
  2298. &lt;p&gt;To further illustrate this effect, we can make some concrete
  2299. assumptions and run a simulation. Let’s pretend the error each
  2300. of Alice’s friends makes when estimating the value of the jar of coins
  2301. is distributed normally with a mean of $0 and a standard
  2302. deviation of $3. That is, \(\epsilon_i \sim \mathcal{N}(0, 3^2)\).
  2303. Let’s pretend Alice is running a second-price auction, in which
  2304. the bidders naively assume bidding their noisy valuation is
  2305. optimal. Below we see what happens with 4, 7, and 10 of Alice’s
  2306. friends bidding into her auction.&lt;/p&gt;
  2307.  
  2308. &lt;p&gt;&lt;img src=&quot;/images/post_images/winners_curse/winners_curse_simulation.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
  2309.  
  2310. &lt;p&gt;In this simulation, Bob usually paid more for the jar of coins
  2311. than it was worth. This happened almost every time with just ten
  2312. of Alice’s friends participating in the auction. Also notice that
  2313. the more participants the auction has, the worse the winner’s
  2314. curse becomes.&lt;/p&gt;
  2315.  
  2316. &lt;p&gt;This shows that when you’ve estimated the value of an item in an
  2317. auction, you should reduce your bid, not just to account for
  2318. auction dynamics, but also to account for uncertainty in your
  2319. valuation. In particular, despite conventional wisdom saying to
  2320. bid your true valuation in a second-price auction, you should
  2321. actually reduce your bid a bit in most real-world situations.&lt;/p&gt;
  2322.  
  2323. &lt;h1 id=&quot;observing-of-the-curse&quot;&gt;Observing the Curse&lt;/h1&gt;
  2324.  
  2325. &lt;p&gt;Now that we understand the winner’s curse, we can see what it
  2326. looks like in some real data. There are two places we observe
  2327. the winner’s curse at NextRoll. When an ad exchange notifies
  2328. us of an ad opportunity, we have to select an ad for which to
  2329. bid. This is done via an internal auction and is the first place
  2330. this effect manifests. Next, we submit the highest internal bid
  2331. to the ad exchange, and an auction is run amongst bidders like
  2332. NextRoll. This external auction is the second place we observe
  2333. the winner’s curse.&lt;/p&gt;
  2334.  
  2335. &lt;p&gt;To be clear, I highlight these instances of the winner’s curse
  2336. because the effect is interesting, not because these cases are
  2337. particularly problematic. Unlike in the example of Bob losing
  2338. money by purchasing Alice’s jar of coins, the impact of the
  2339. winner’s curse on NextRoll’s bidding systems is minor overall.&lt;/p&gt;
  2340.  
  2341. &lt;h2 id=&quot;the-internal-auction&quot;&gt;The Internal Auction&lt;/h2&gt;
  2342.  
  2343. &lt;p&gt;The internal auction is where we decide which ad from our
  2344. thousands of advertisers will be submitted to the external
  2345. auction. Some of this decision is made for us by simply applying
  2346. the advertiser’s targeting criteria. Selecting among the remaining
  2347. advertisers’ eligible ads is done with an auction that has two steps.&lt;/p&gt;
  2348.  
  2349. &lt;p&gt;First, we select the most valuable eligible ad for each advertiser.
  2350. As you can see below, the more ads from which we have to select for
  2351. a given advertiser, the more overvalued the selected ad is. This is
  2352. analogous to the winner of Alice’s auction paying more for the jar
  2353. of coins when there are more bidders. This is the winner’s curse in
  2354. action!&lt;/p&gt;
  2355.  
  2356. &lt;p&gt;&lt;img src=&quot;/images/post_images/winners_curse/ad_selection.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
  2357.  
  2358. &lt;p&gt;The second step in the internal auction is to choose the most
  2359. valuable ad from among the advertisers’ best candidates. As in the
  2360. case above, the more advertisers targeting the opportunity, the
  2361. more overvalued the selected ad is. Again, this is the winner’s
  2362. curse.&lt;/p&gt;
  2363.  
  2364. &lt;p&gt;&lt;img src=&quot;/images/post_images/winners_curse/advertiser_selection.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
  2365.  
  2366. &lt;h2 id=&quot;the-external-auction&quot;&gt;The External Auction&lt;/h2&gt;
  2367.  
  2368. &lt;p&gt;Once we have decided which ad to select and what to bid for it, we
  2369. send that information to the ad exchange. The exchange runs another
  2370. auction, and based on the results, makes the final decision on which
  2371. ad to show. Unlike in our internal auction, we are not the
  2372. auctioneer in the external auction, which means we don’t have
  2373. complete information about the auction’s bids. This makes confidently
  2374. observing the winner’s curse more challenging.&lt;/p&gt;
  2375.  
  2376. &lt;p&gt;The first sign of the winner’s curse in the external auction is that
  2377. we consistently predict ad impressions to be slightly more valuable
  2378. than they turn out to be. To have confidence that these
  2379. overvaluations really are the winner’s curse, rather than an
  2380. undiscovered bug, we will need to understand a bit about the
  2381. interaction between our machine learning and the external auction.&lt;/p&gt;
  2382.  
  2383. &lt;p&gt;The key thing to understand is that the valuations predicted by our
  2384. machine learning system are very closely related to when we win the
  2385. external auction. In particular, when our models overvalue a
  2386. potential impression, we tend to bid higher, which causes us to win
  2387. the auction more often. In other words, we win a disproportionate
  2388. share of the auctions for impressions whose value we overestimate. This selection bias
  2389. is visualized below.&lt;/p&gt;
  2390.  
  2391. &lt;p&gt;&lt;img src=&quot;/images/post_images/winners_curse/selection_bias.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
  2392.  
  2393. &lt;p&gt;Fortunately, when we win the external auction and show an ad, we
  2394. receive feedback on the ad’s correct value. This, in turn, feeds
  2395. into our machine learning models as training data. As a result,
  2396. mistakes made by previous iterations of the model are corrected.
  2397. Unfortunately, no model is perfect, and our newer model iteration
  2398. will make new mistakes.&lt;/p&gt;
  2399.  
  2400. &lt;p&gt;The crucial insight for our investigation is that by comparing the
  2401. predictions (and mistakes) made by older and newer model iterations
  2402. over the same slice of backtesting data, we can learn whether the
  2403. overvaluations we observe are, in fact, caused by the winner’s
  2404. curse, or more menacingly, an undiscovered bug.&lt;/p&gt;
  2405.  
  2406. &lt;p&gt;To make this concrete, let’s say that we trained model A ten days
  2407. ago, and that we trained model B yesterday. Consider what we would
  2408. expect to happen if we used both models A and B to predict over
  2409. yesterday’s ad impression data that model B bid on and purchased.
  2410. If the winner’s curse were the cause of the overvaluations in the
  2411. external auction, we’d expect model B to overvalue these impressions
  2412. that it purchased because of the selection bias discussed above.
  2413. We’d also expect model A to value these impressions accurately.
  2414. This is because this selection bias affecting model B does not
  2415. apply to model A, since it was not used to bid on and purchase
  2416. these impressions.&lt;/p&gt;
  2417.  
  2418. &lt;p&gt;&lt;img src=&quot;/images/post_images/winners_curse/external_auction.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
  2419.  
  2420. &lt;p&gt;The plot above shows this experiment run over real data on 11
  2421. different days. As you can see, model A does not overvalue the ad
  2422. impressions purchased by model B. This means we really are
  2423. observing the winner’s curse in the external auction!&lt;/p&gt;
  2424.  
  2425. &lt;h1 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h1&gt;
  2426.  
  2427. &lt;p&gt;You’re more likely to win an auction when you bid too high. That is
  2428. the essence of the winner’s curse, a surprisingly simple, yet
  2429. counter-intuitive, statistical effect. In this post, we learned how
  2430. the curse works in theory, and we observed it in some real data.&lt;/p&gt;
  2431.  
  2432. &lt;p&gt;Often, those of us interested in auctions ignore the fact that
  2433. bidders estimate the value of an item, assuming instead that they
  2434. know it exactly. However, bidders having uncertain valuations is all
  2435. that’s needed for the winner’s curse to appear. When the curse strikes,
  2436. buyers realize less profit than expected from the auctions in which
  2437. they participate. In extreme cases, this effect can even result in
  2438. the buyer losing money, as we saw with Bob and Alice. Beware of the
  2439. curse!&lt;/p&gt;
  2440.  
  2441. &lt;hr /&gt;
  2442. </description>
  2443.    </item>
  2444.    
  2445.    
  2446.    
  2447.    <item>
  2448.      <title>Ship it! with ecs-ship</title>
  2449.      <link>https://tech.nextroll.com/blog/dev/2021/02/23/ship-it-with-ecs-ship.html</link>
  2450.      <pubDate>Tue, 23 Feb 2021 00:00:00 -0800</pubDate>
  2451.      <author></author>
  2452.      <guid isPermaLink="false">https://tech.nextroll.com/blog/dev/2021/02/23/ship-it-with-ecs-ship</guid>
  2453.      <description>&lt;p&gt;Having a fast and reliable way to update your deployment environments is a game changer in
  2454. terms of team velocity and confidence. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ecs-ship&lt;/code&gt; is a simple yet powerful tool that can
  2455. help your team ship faster.&lt;/p&gt;
  2456.  
  2457. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;10 minute read&lt;/code&gt;&lt;/p&gt;
  2458.  
  2459. &lt;hr /&gt;
  2460.  
  2461. &lt;h2 id=&quot;introduction&quot;&gt;Introduction&lt;/h2&gt;
  2462.  
  2463. &lt;p&gt;Most of our teams at NextRoll leverage the power of &lt;a href=&quot;https://aws.amazon.com/ecs/&quot; title=&quot;Elastic Container Service&quot;&gt;AWS ECS&lt;/a&gt; for container
  2464. orchestration. Each team manages its own &lt;a href=&quot;https://aws.amazon.com/&quot; title=&quot;Amazon Web Services&quot;&gt;AWS&lt;/a&gt; resources, which gives it
  2465. the freedom to choose its technologies and to decide what and when to deploy. In this article,
  2466. we want to share &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ecs-ship&lt;/code&gt;, a small tool in the vein of &lt;a href=&quot;https://github.com/fabfuel/ecs-deploy&quot; title=&quot;ECS Deploy&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ecs-deploy&lt;/code&gt;&lt;/a&gt; that
  2467. we wrote to streamline some of our common workflows and that will soon be open source (update: it’s now open source, yay!).&lt;/p&gt;
  2468.  
  2469. &lt;h2 id=&quot;some-typical-workflows&quot;&gt;Some typical workflows&lt;/h2&gt;
  2470.  
  2471. &lt;p&gt;Let’s see by example which workflows you might want to use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ecs-ship&lt;/code&gt; for by looking at
  2472. two typical tasks we can accomplish with it:&lt;/p&gt;
  2473.  
  2474. &lt;ul&gt;
  2475.  &lt;li&gt;Deploying a new version of an application.&lt;/li&gt;
  2476.  &lt;li&gt;Changing the resource allocation of an application (aka right-sizing).&lt;/li&gt;
  2477. &lt;/ul&gt;
  2478.  
  2479. &lt;h3 id=&quot;deploying-with-ecs-ship&quot;&gt;Deploying with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ecs-ship&lt;/code&gt;&lt;/h3&gt;
  2480.  
  2481. &lt;p&gt;When deploying a new version of a containerized application to &lt;a href=&quot;https://aws.amazon.com/ecs/&quot; title=&quot;Elastic Container Service&quot;&gt;AWS ECS&lt;/a&gt;, there
  2482. are three resources you need to account for: container images, ECS task definitions, and
  2483. ECS services. &lt;a href=&quot;https://www.docker.com/resources/what-container&quot; title=&quot;What&apos;s a container&quot;&gt;&lt;em&gt;Container images&lt;/em&gt;&lt;/a&gt; are a pretty common artifact these
  2484. days. They represent a “unit” containing your application and all the
  2485. dependencies it might need to run correctly. &lt;a href=&quot;https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_definitions.html&quot; title=&quot;Task definitions&quot;&gt;&lt;em&gt;ECS Task definitions&lt;/em&gt;&lt;/a&gt;
  2486. represent a containerized workload; essentially, they let ECS know what’s required to run
  2487. your tasks. For example, &lt;em&gt;I need three containers running my application image and one running
  2488. a database image&lt;/em&gt;. You express these requirements via container definitions inside the task
  2489. definition. Finally, &lt;a href=&quot;https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs_services.html&quot; title=&quot;ECS Services&quot;&gt;&lt;em&gt;ECS services&lt;/em&gt;&lt;/a&gt; reference a version of a task
  2490. definition and are responsible for letting AWS know “we do need these containers running
  2491. now”.&lt;/p&gt;
  2492.  
  2493. &lt;p&gt;In this framework, to deploy a new version of your application, you need to follow these
  2494. steps:&lt;/p&gt;
  2495.  
  2496. &lt;ul&gt;
  2497.  &lt;li&gt;Create and push a new version of your application’s image to a registry.&lt;/li&gt;
  2498.  &lt;li&gt;Create a new version of your task definition that references the freshly uploaded image.&lt;/li&gt;
  2499.  &lt;li&gt;Finally, update your service to the newly created task definition.&lt;/li&gt;
  2500. &lt;/ul&gt;
  2501.  
  2502. &lt;p&gt;Let’s go through them step by step.&lt;/p&gt;
  2503.  
  2504. &lt;p&gt;Building and pushing your images to a registry would usually look like this:&lt;/p&gt;
  2505.  
  2506. &lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nb&quot;&gt;export &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;VERSION&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;sb&quot;&gt;`&lt;/span&gt;git rev-parse HEAD&lt;span class=&quot;sb&quot;&gt;`&lt;/span&gt;
  2507. docker build &lt;span class=&quot;nt&quot;&gt;-t&lt;/span&gt; your.registry.io/yourusername/yourapp:&lt;span class=&quot;nv&quot;&gt;$VERSION&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;.&lt;/span&gt;
  2508. docker push your.registry.io/yourusername/yourapp:&lt;span class=&quot;nv&quot;&gt;$VERSION&lt;/span&gt;
  2509. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  2510.  
  2511. &lt;p&gt;Once you have your new image (or images) up on the registry and ready to use, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ecs-ship&lt;/code&gt;
  2512. comes in to help you with the rest:&lt;/p&gt;
  2513.  
  2514. &lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nb&quot;&gt;cat&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;EOF&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt; | ecs-ship your-cluster yourapp-service
  2515. containerDefinitions:
  2516.  yourapp-definition:
  2517.    image: your.registry.io/yourusername/yourapp:&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$VERSION&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;
  2518.    environment:
  2519.      VERSION: &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$VERSION&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;
  2520.      SHIPPED_BY: ecs-ship
  2521. &lt;/span&gt;&lt;span class=&quot;no&quot;&gt;EOF
  2522. &lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  2523.  
  2524. &lt;p&gt;As you can see, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ecs-ship&lt;/code&gt; takes the cluster you want to update and the service you want
  2525. to update in that cluster. Furthermore, it takes a configuration file to
  2526. non-destructively modify some aspects of the chosen service, including the container
  2527. definitions. Here inside the container definition, we have decided to update the image to
  2528. refer to the new &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;VERSION&lt;/code&gt; and upserted a couple of environment variables to later check on
  2529. the task definition. You can check the &lt;a href=&quot;https://github.com/AdRoll/ecs-ship#usage&quot; title=&quot;ecs-ship usage&quot;&gt;README on ecs-ship’s repo&lt;/a&gt; to get
  2530. familiar with which aspects of ECS services you can change.&lt;/p&gt;
  2531.  
  2532. &lt;p&gt;Running this command will perform these steps for you:&lt;/p&gt;
  2533.  
  2534. &lt;ul&gt;
  2535.  &lt;li&gt;Pull the task definition of the specified &lt;em&gt;cluster&lt;/em&gt; and &lt;em&gt;service&lt;/em&gt;.&lt;/li&gt;
  2536.  &lt;li&gt;Create a copy of this task definition with the updates you requested in the input
  2537. file applied (only if anything changed).&lt;/li&gt;
  2538.  &lt;li&gt;Update the specified &lt;em&gt;service&lt;/em&gt; to point to the newly created version of the task
  2539. definition.&lt;/li&gt;
  2540.  &lt;li&gt;Wait for your service to reflect the changes you requested correctly.&lt;/li&gt;
  2541. &lt;/ul&gt;
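&lt;p&gt;The second step above, copying the task definition and applying the requested updates without touching anything else, can be sketched in Python. This is an illustrative sketch only, not &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ecs-ship&lt;/code&gt;’s actual implementation: the function name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;apply_overrides&lt;/code&gt; is hypothetical, the input shape mirrors the YAML shown earlier, and the task definition is assumed to follow the ECS JSON layout, where each container carries a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;name&lt;/code&gt; and a list of name/value environment pairs.&lt;/p&gt;

```python
import copy


def apply_overrides(task_definition, overrides):
    """Return an updated copy of task_definition; the input is left untouched."""
    updated = copy.deepcopy(task_definition)
    wanted = overrides.get("containerDefinitions", {})
    for container in updated.get("containerDefinitions", []):
        changes = wanted.get(container["name"])
        if changes is None:
            continue  # containers not named in the input stay as they are
        for key, value in changes.items():
            if key == "environment":
                # Upsert: overwrite matching variables, keep the rest, add new ones.
                env = {e["name"]: e["value"] for e in container.get("environment", [])}
                env.update({name: str(val) for name, val in value.items()})
                container["environment"] = [
                    {"name": name, "value": val} for name, val in env.items()
                ]
            else:
                # Plain settings such as image, cpu, or memoryReservation.
                container[key] = value
    return updated


if __name__ == "__main__":
    current = {
        "containerDefinitions": [
            {
                "name": "yourapp-definition",
                "image": "your.registry.io/yourusername/yourapp:old",
                "cpu": 256,
                "environment": [{"name": "VERSION", "value": "old"}],
            }
        ]
    }
    requested = {
        "containerDefinitions": {
            "yourapp-definition": {
                "image": "your.registry.io/yourusername/yourapp:new",
                "environment": {"VERSION": "new", "SHIPPED_BY": "ecs-ship"},
            }
        }
    }
    new_definition = apply_overrides(current, requested)
    # A new revision is only worth registering when something changed.
    print("changed:", new_definition != current)
```

&lt;p&gt;Working on a deep copy keeps the current definition intact, so the two versions can be compared to decide whether a new revision is needed at all.&lt;/p&gt;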
  2542.  
  2543. &lt;p&gt;If, for some reason, your service doesn’t reach stability with your new configuration,
  2544. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ecs-ship&lt;/code&gt; will automatically:&lt;/p&gt;
  2545.  
  2546. &lt;ul&gt;
  2547.  &lt;li&gt;Revert to the last task definition (if it was stable).&lt;/li&gt;
  2548.  &lt;li&gt;Deregister the new, faulty task definition.&lt;/li&gt;
  2549. &lt;/ul&gt;
  2550.  
  2551. &lt;h3 id=&quot;right-sizing-your-services&quot;&gt;Right-sizing your services&lt;/h3&gt;
  2552.  
  2553. &lt;p&gt;We just saw how &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ecs-ship&lt;/code&gt; helps you deploy new versions of ECS services. Now
  2554. let’s explore another simple yet powerful &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ecs-ship&lt;/code&gt; workflow: changing the
  2555. amount of resources a particular service requests from your ECS cluster.&lt;/p&gt;
  2556.  
  2557. &lt;p&gt;One essential part of your ECS service configuration is the amount of resources a given
  2558. container or a given service requires. These resources are controlled by the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cpu&lt;/code&gt;,
  2559. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;memory&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;memoryReservation&lt;/code&gt; settings. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cpu&lt;/code&gt; represents the number of “shares” of a
  2560. virtual CPU your service needs. Meanwhile, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;memory&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;memoryReservation&lt;/code&gt; are the hard and
  2561. soft limits on the amount of memory your service can use, respectively. These settings
  2562. cost you money in the following sense: if the sum of either the memory or the CPU
  2563. required by the services running in your cluster exceeds the actual
  2564. resources in the cluster, you need to spin up a new instance, incurring
  2565. new costs.&lt;/p&gt;
  2566.  
  2567. &lt;p&gt;However, your services don’t always use as many resources as you reserve for them, and
  2568. since you’re charged for the resources you request, it is a good idea to
  2569. keep those as close as possible to the actual usage values. To work out the
  2570. amount of resources your ECS services need, you can enable and monitor &lt;a href=&quot;https://docs.aws.amazon.com/AmazonECS/latest/developerguide/cloudwatch-metrics.html&quot; title=&quot;CloudWatch metrics&quot;&gt;CloudWatch
  2571. metrics&lt;/a&gt;, and once you have worked out the amount of resources you need,
  2572. you can update the configuration using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ecs-ship&lt;/code&gt;:&lt;/p&gt;
  2573.  
  2574. &lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nb&quot;&gt;cat&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;EOF&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt; | ecs-ship your-cluster yourapp-service
  2575. containerDefinitions:
  2576.  yourapp-definition:
  2577.    cpu: 123
  2578.    memoryReservation: 123
  2579. &lt;/span&gt;&lt;span class=&quot;no&quot;&gt;EOF
  2580. &lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  2581.  
  2582. &lt;p&gt;As a side note, if you specify &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;memory&lt;/code&gt;, you need to make sure the value is
  2583. comfortably higher than &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;memoryReservation&lt;/code&gt;.&lt;/p&gt;
  2584.  
  2585. &lt;h2 id=&quot;how-ecs-ship-came-to-be&quot;&gt;How &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ecs-ship&lt;/code&gt; came to be&lt;/h2&gt;
  2586.  
  2587. &lt;p&gt;Those familiar with &lt;a href=&quot;https://github.com/fabfuel/ecs-deploy&quot; title=&quot;ECS Deploy&quot;&gt;fabfuel’s ecs-deploy&lt;/a&gt; will notice that there’s not much
  2588. difference between it and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ecs-ship&lt;/code&gt;. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ecs-ship&lt;/code&gt; can be seen as a new take on the
  2589. already popular project, with just a few different opinions. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ecs-ship&lt;/code&gt; focuses only on
  2590. updating a given service with a new version of its task definition, leaving out other
  2591. add-on features like Slack reporting or New Relic metrics for deployments.&lt;/p&gt;
  2592.  
  2593. &lt;p&gt;In terms of new features, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ecs-ship&lt;/code&gt; can change resource allocations and
  2594. could be extended to change other aspects of a task definition. Furthermore, as we will
  2595. see in a future post, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ecs-ship&lt;/code&gt;’s input file-based API makes it
  2596. easier to interoperate with other third-party or homebrewed tools.&lt;/p&gt;
  2597.  
  2598. &lt;p&gt;Besides these minor differences, one of the primary motivations for building &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ecs-ship&lt;/code&gt;
  2599. was our bi-annual &lt;a href=&quot;https://tech.nextroll.com/blog/culture/2019/11/26/hackweek-at-nextroll.html&quot; title=&quot;Hack Week&quot;&gt;HackWeek&lt;/a&gt; events. These events give us some time to explore
  2600. technologies, work on a little side-project or solve issues in our workflow that are not
  2601. directly related to our backlog. In our case, with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ecs-ship&lt;/code&gt;, we experimented with
  2602. &lt;a href=&quot;https://golang.org/&quot; title=&quot;GO Programming Language&quot;&gt;golang&lt;/a&gt; (our team works mostly with Python and JavaScript) and solved the
  2603. need to update resource allocations for our services.&lt;/p&gt;
  2604.  
  2605. &lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
  2606.  
  2607. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ecs-ship&lt;/code&gt; is a little tool you can use either as a statically linked binary or as a
  2608. containerized tool to manage your &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ECS&lt;/code&gt; deployments from your local machine or your
  2609. continuous delivery pipelines.&lt;/p&gt;
  2610.  
  2611. &lt;h2 id=&quot;update-2021-04-08&quot;&gt;Update (2021-04-08)&lt;/h2&gt;
  2612.  
  2613. &lt;p&gt;As of yesterday, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ecs-ship&lt;/code&gt; is open source! You can grab the source code, binaries
  2614. and docker images following the instructions at &lt;a href=&quot;https://github.com/AdRoll/ecs-ship#usage&quot; title=&quot;ecs-ship usage&quot;&gt;https://github.com/AdRoll/ecs-ship&lt;/a&gt;.&lt;/p&gt;
  2615.  
  2616. </description>
  2617.    </item>
  2618.    
  2619.    
  2620.    
  2621.    <item>
  2622.      <title>rebar3_hank: The Erlang Dead Code Cleaner</title>
  2623.      <link>https://tech.nextroll.com/blog/dev/2021/01/06/erlang-rebar3-hank.html</link>
  2624.      <pubDate>Wed, 06 Jan 2021 00:00:00 -0800</pubDate>
  2625.      <author></author>
  2626.      <guid isPermaLink="false">https://tech.nextroll.com/blog/dev/2021/01/06/erlang-rebar3-hank</guid>
  2627.      <description>&lt;p&gt;From the creators of &lt;a href=&quot;https://tech.nextroll.com/blog/dev/2020/02/25/erlang-rebar3-format.html&quot;&gt;rebar3_format&lt;/a&gt;, here comes… &lt;a href=&quot;https://github.com/AdRoll/rebar3_hank&quot;&gt;rebar3_hank&lt;/a&gt;, a powerful but simple tool to detect dead code around your Erlang codebase (and kill it with fire!).&lt;/p&gt;
  2628.  
  2629. &lt;p&gt;Developers can use this rebar plugin in addition to a linter (&lt;a href=&quot;https://github.com/inaka/elvis_core&quot;&gt;Elvis&lt;/a&gt;), &lt;a href=&quot;https://erlang.org/doc/apps/tools/xref_chapter.html&quot;&gt;Xref&lt;/a&gt;, and &lt;a href=&quot;https://erlang.org/doc/man/dialyzer.html&quot;&gt;Dialyzer&lt;/a&gt;; they complement each other perfectly.&lt;/p&gt;
  2630.  
  2631. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;10 minute read&lt;/code&gt;&lt;/p&gt;
  2632.  
  2633. &lt;hr /&gt;
  2634.  
  2635. &lt;center&gt;
  2636.    &lt;img alt=&quot;Hank Scorpio - Kill it with fire!&quot; src=&quot;/images/post_heroes/hank_scorpio.jpg&quot; /&gt;&lt;br /&gt;
  2637.    &lt;i&gt;Hank Scorpio - Kill it with fire!&lt;/i&gt;
  2638. &lt;/center&gt;
  2639.  
  2640. &lt;h2 id=&quot;introduction&quot;&gt;Introduction&lt;/h2&gt;
  2641. &lt;p&gt;In NextRoll’s RTB Team, we have two passions while we maintain our codebases: killing dead code and automating things! That’s why we thought that a tool for automating this process would be handy for the community and us. So, we started thinking more seriously about &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Hank&lt;/code&gt;, and we decided to spend our Winter HackWeek time to make it possible!&lt;/p&gt;
  2642.  
  2643. &lt;p&gt;Nobody wants to maintain dead code. In fact, most of us are huge fans of negative PRs. Hank can help you with that by traversing your project, analyzing every .erl and .hrl file in it (optionally skipping some folders/files if you want to), applying the rules, and producing a list of all the code that you can effectively delete and/or refactor.&lt;/p&gt;
  2644.  
  2645. &lt;p&gt;The best thing is that you can be sure that the dead code is, in fact, &lt;em&gt;dead&lt;/em&gt; since Hank is built with Dialyzer levels of certainty™️.&lt;/p&gt;
  2646.  
  2647. &lt;h2 id=&quot;differences-between-hank-and-other-existing-tools&quot;&gt;Differences between Hank and other existing tools&lt;/h2&gt;
  2648.  
  2649. &lt;p&gt;You might be thinking: Why Hank? That’s a job for my linter!&lt;/p&gt;
  2650.  
  2651. &lt;p&gt;The answer is: No.&lt;/p&gt;
  2652.  
  2653. &lt;p&gt;We use &lt;a href=&quot;https://github.com/inaka/elvis_core&quot;&gt;Elvis&lt;/a&gt; for code linting; it reviews our Erlang code style like function naming, nesting level, line length, variable naming convention, etc.&lt;/p&gt;
  2654.  
  2655. &lt;p&gt;Hank doesn’t do that.&lt;/p&gt;
  2656.  
  2657. &lt;p&gt;&lt;a href=&quot;https://erlang.org/doc/apps/tools/xref_chapter.html&quot;&gt;Xref&lt;/a&gt; is a cross-reference tool that can be used for finding dependencies between functions, modules, applications, and releases. It does so by analyzing the defined functions and function calls.
  2658. So it will warn us about a defined function that is never used around our source code.&lt;/p&gt;
  2659.  
  2660. &lt;p&gt;Hank doesn’t do that either.&lt;/p&gt;
  2661.  
  2662. &lt;p&gt;And &lt;a href=&quot;https://erlang.org/doc/man/dialyzer.html&quot;&gt;Dialyzer&lt;/a&gt;?
  2663. Dialyzer is a static analysis tool that identifies software discrepancies, such as success type errors, code that has become dead or unreachable because of a programming error, and unnecessary tests, among other things.
  2664. It bases its analysis on the concept of success typings.&lt;/p&gt;
  2665.  
  2666. &lt;p&gt;Hank neither relies on specs nor evaluates the “semantics” of function parameters and return values.&lt;/p&gt;
  2667.  
  2668. &lt;h3 id=&quot;so-what-exactly-does-hank-do&quot;&gt;So what exactly does Hank do?&lt;/h3&gt;
  2669.  
  2670. &lt;p&gt;Hank will detect and warn you about &lt;em&gt;valid&lt;/em&gt; parts of your code that could &lt;em&gt;potentially&lt;/em&gt; be deleted or at least refactored based on rules.&lt;/p&gt;
  2671.  
  2672. &lt;p&gt;It works on entire projects (as opposed to Elvis, which works on individual files), on source code (as opposed to Xref, which works on compiled code), and on individual projects (as opposed to Dialyzer, which analyzes entire systems - including OTP and your dependencies).&lt;/p&gt;
  2673.  
  2674. &lt;p&gt;The current version at the time of writing is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0.2.1&lt;/code&gt;. It’s a &lt;em&gt;minor&lt;/em&gt; version, but we’re already using it in our systems, and it’s practically ready for production usage.&lt;/p&gt;
  2675.  
  2676. &lt;hr /&gt;
  2677.  
  2678. &lt;h2 id=&quot;how-to-use-rebar3_hank&quot;&gt;How to use rebar3_hank&lt;/h2&gt;
  2679.  
  2680. &lt;p&gt;Just add this to your &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar.config&lt;/code&gt; (either in your project or globally in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;~/.config/rebar3/rebar.config&lt;/code&gt;):&lt;/p&gt;
  2681.  
  2682. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;plugins&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rebar3_hank&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]}.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  2683.  
  2684. &lt;p&gt;Then run…&lt;/p&gt;
  2685.  
  2686. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;rebar3 hank&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  2687.  
  2688. &lt;p&gt;…and kill it with fire!&lt;/p&gt;
  2689.  
  2690. &lt;h3 id=&quot;ignoring-rules&quot;&gt;Ignoring rules&lt;/h3&gt;
  2691.  
  2692. &lt;p&gt;There are cases where you need to ignore some rules, for example when developing libraries that define hrls or modules meant to be consumed by other projects. In those cases, you’ll probably need to ignore some rules (like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;single_use_hrl_attrs&lt;/code&gt;). Something similar happens when using Xref.&lt;/p&gt;
  2693.  
  2694. &lt;p&gt;For this purpose, you can tell Hank to ignore rules at the module level:&lt;/p&gt;
  2695.  
  2696. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;c&quot;&gt;% ignoring all the rules for this module
  2697. &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;hank&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ignore&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;
  2698.  
  2699. &lt;span class=&quot;c&quot;&gt;% or ignoring specific rules
  2700. &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;hank&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;single_use_hrl_attrs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  2701.  
  2702. &lt;p&gt;Or add this configuration in your &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar.config&lt;/code&gt;:&lt;/p&gt;
  2703.  
  2704. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;hank&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ignore&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  2705.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;test/*.erl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;unused_ignored_function_params&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  2706. &lt;span class=&quot;p&quot;&gt;]}]}.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  2707.  
  2708. &lt;hr /&gt;
  2709.  
  2710. &lt;h2 id=&quot;the-rules&quot;&gt;The Rules&lt;/h2&gt;
  2711.  
  2712. &lt;p&gt;Here are the rules we’ve created so far; you can use them with Hank directly.&lt;/p&gt;
  2713.  
  2714. &lt;h3 id=&quot;unused_ignored_function_params&quot;&gt;unused_ignored_function_params&lt;/h3&gt;
  2715. &lt;p&gt;Functions evolve, and some parameters that were used before may no longer be needed. A typical easy solution could be just ignoring them and forgetting about the issue.&lt;/p&gt;
  2716.  
  2717. &lt;p&gt;Hank detects ignored parameters in the same position for all function clauses and lets you know that you can delete those parameters and refactor the places where the function is invoked, thus making your code cleaner. 😉&lt;/p&gt;
  2718.  
  2719. &lt;p&gt;For instance, when analyzing this module…&lt;/p&gt;
  2720.  
  2721. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;my_module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  2722.  
  2723. &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;export&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;external_fun&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]).&lt;/span&gt;
  2724.  
  2725. &lt;span class=&quot;nf&quot;&gt;external_fun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  2726.    &lt;span class=&quot;nf&quot;&gt;multi_fun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;rand&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;uniform&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;undefined&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  2727.  
  2728. &lt;span class=&quot;c&quot;&gt;%% A multi-clause function with unused 3rd param
  2729. &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;multi_fun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;undefined&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;_,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;_)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  2730.    &lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  2731. &lt;span class=&quot;nf&quot;&gt;multi_fun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Arg1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Arg2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Arg3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;when&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;is_binary&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Arg1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  2732.    &lt;span class=&quot;nv&quot;&gt;Arg2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  2733. &lt;span class=&quot;nf&quot;&gt;multi_fun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Arg1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;_,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;_)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  2734.    &lt;span class=&quot;nv&quot;&gt;Arg1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  2735.  
  2736. &lt;p&gt;Hank will output…&lt;/p&gt;
  2737.  
  2738. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;rebar3 hank
  2739. &lt;span class=&quot;o&quot;&gt;===&amp;gt;&lt;/span&gt; Looking &lt;span class=&quot;k&quot;&gt;for &lt;/span&gt;code to &lt;span class=&quot;nb&quot;&gt;kill &lt;/span&gt;with fire...
  2740. &lt;span class=&quot;o&quot;&gt;===&amp;gt;&lt;/span&gt; The following pieces of code are dead and should be removed:
  2741. src/my_module.erl:9: Param &lt;span class=&quot;c&quot;&gt;#3 is not used at &apos;multi_fun/3&apos;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  2742.  
  2743. &lt;p&gt;To avoid this warning, remove the unused parameter(s).&lt;/p&gt;
  2744.  
  2745. &lt;h3 id=&quot;single_use_hrls&quot;&gt;single_use_hrls&lt;/h3&gt;
  2746. &lt;p&gt;Sometimes you put some code in a header file that’s supposed to be shared among multiple modules, but you end up writing just one module that uses it. In this case, it would be better to directly put the header file’s contents in the module itself. And Hank has a rule for that!&lt;/p&gt;
  2747.  
  2748. &lt;p&gt;Assuming header.hrl:&lt;/p&gt;
  2749.  
  2750. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;define&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;APP_HEADER&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;this is a header from an app that will be used in just one module&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  2751. &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;define&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;SOME_MACRO&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  2752.  
  2753. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;app_include_lib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  2754.  
  2755. &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;include&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;header.hrl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  2756.  
  2757. &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;export&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;my_function&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]).&lt;/span&gt;
  2758.  
  2759. &lt;span class=&quot;nf&quot;&gt;my_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  2760.  &lt;span class=&quot;c&quot;&gt;% those are only used here!
  2761. &lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;?&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;SOME_MACRO&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;?&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;APP_HEADER&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  2762.  
  2763. &lt;p&gt;It will output:&lt;/p&gt;
  2764.  
  2765. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;rebar3 hank
  2766. &lt;span class=&quot;o&quot;&gt;===&amp;gt;&lt;/span&gt; Looking &lt;span class=&quot;k&quot;&gt;for &lt;/span&gt;code to &lt;span class=&quot;nb&quot;&gt;kill &lt;/span&gt;with fire...
  2767. &lt;span class=&quot;o&quot;&gt;===&amp;gt;&lt;/span&gt; The following pieces of code are dead and should be removed:
  2768. header.hrl:0: This header file is only included at: src/app_include_lib.erl&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  2769.  
  2770. &lt;p&gt;Move the hrl file’s contents directly to the module that uses them, and you’ll not see this warning again.&lt;/p&gt;
  2771.  
  2772. &lt;p&gt;See a complete example &lt;a href=&quot;https://github.com/AdRoll/rebar3_hank/tree/main/priv/test_files/single_use_hrls&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
  2773.  
  2774. &lt;h3 id=&quot;single_use_hrl_attrs&quot;&gt;single_use_hrl_attrs&lt;/h3&gt;
  2775. &lt;p&gt;Sometimes it’s more subtle, though. It’s not that the whole file is used in just one module; it is shared among many modules. But some attributes (like macros or records) are not. They are defined in the header file but only used in a single module. Hank has a rule that will suggest that you place those attributes inside the module to limit the amount of stuff that’s shared unnecessarily.&lt;/p&gt;
  2776.  
  2777. &lt;p&gt;Given the previous files and including the hrl in another file:&lt;/p&gt;
  2778.  
  2779. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;app_include_lib_2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  2780.  
  2781. &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;include&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;header.hrl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  2782.  
  2783. &lt;p&gt;It will output:&lt;/p&gt;
  2784.  
  2785. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;rebar3 hank
  2786. &lt;span class=&quot;o&quot;&gt;===&amp;gt;&lt;/span&gt; Looking &lt;span class=&quot;k&quot;&gt;for &lt;/span&gt;code to &lt;span class=&quot;nb&quot;&gt;kill &lt;/span&gt;with fire...
  2787. &lt;span class=&quot;o&quot;&gt;===&amp;gt;&lt;/span&gt; The following pieces of code are dead and should be removed:
  2788. include/header.hrl:2: ?SOME_MACRO/1 is used only at src/app_include_lib.erl&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  2789.  
  2790. &lt;p&gt;See a complete example &lt;a href=&quot;https://github.com/AdRoll/rebar3_hank/tree/main/priv/test_files/single_use_hrl_attrs/lib/app&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
  2791.  
  2792. &lt;h3 id=&quot;unused_hrls&quot;&gt;unused_hrls&lt;/h3&gt;
  2793. &lt;p&gt;Sometimes the situation is even worse: You might have hrl files that are not included in any module. Hank will detect those and let you know that you can remove them entirely since they’re virtually useless.&lt;/p&gt;
  2794.  
  2795. &lt;p&gt;Adding a header_2.hrl file which is not included, the output will be:&lt;/p&gt;
  2796.  
  2797. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;rebar3 hank
  2798. &lt;span class=&quot;o&quot;&gt;===&amp;gt;&lt;/span&gt; Looking &lt;span class=&quot;k&quot;&gt;for &lt;/span&gt;code to &lt;span class=&quot;nb&quot;&gt;kill &lt;/span&gt;with fire...
  2799. &lt;span class=&quot;o&quot;&gt;===&amp;gt;&lt;/span&gt; The following pieces of code are dead and should be removed:
  2800. include/header_2.hrl:0: This file is unused&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  2801.  
  2802. &lt;p&gt;See an example &lt;a href=&quot;https://github.com/AdRoll/rebar3_hank/tree/main/priv/test_files/unused_hrls/lib&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
  2803.  
  2804. &lt;p&gt;It’s worth mentioning that &lt;a href=&quot;https://erlang-ls.github.io/&quot;&gt;erlang-ls&lt;/a&gt; already provides similar functionality.&lt;/p&gt;
  2805. &lt;h3 id=&quot;unused_macros&quot;&gt;unused_macros&lt;/h3&gt;
  2806. &lt;p&gt;Hank also has a rule that will detect unused macros around the project. Those macros could be defined in any file within the source code but used in none of them. Therefore, they are effectively unnecessary and can be deleted.&lt;/p&gt;
  2807.  
  2808. &lt;p&gt;See an example &lt;a href=&quot;https://github.com/AdRoll/rebar3_hank/tree/main/priv/test_files/unused_macros&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
  2809.  
  2810. &lt;h3 id=&quot;unused_record_fields&quot;&gt;unused_record_fields&lt;/h3&gt;
  2811. &lt;p&gt;A fascinating one! With this rule, Hank will spot record declarations with fields that are defined (even giving them default values) but never used. Hank considers that you are &lt;em&gt;using&lt;/em&gt; a record field when it is accessed or written.&lt;/p&gt;
  2812.  
  2813. &lt;p&gt;You can use this warning to reduce your records’ size by removing the unused fields from them.&lt;/p&gt;
  2814.  
  2815. &lt;p&gt;See an example &lt;a href=&quot;https://github.com/AdRoll/rebar3_hank/tree/main/priv/test_files/unused_record_fields&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
  2816.  
  2817. &lt;hr /&gt;
  2818.  
  2819. &lt;h2 id=&quot;extensibility&quot;&gt;Extensibility&lt;/h2&gt;
  2820.  
  2821. &lt;p&gt;Following the lead of Elvis and rebar3_format, we built this project with extensibility in mind. Anybody can write their own rules for their projects by just implementing the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;hank_rule&lt;/code&gt; behavior.&lt;/p&gt;
  2822.  
  2823. &lt;p&gt;But if you feel like sharing your new rules with the world, we are eager to get community contributions in the &lt;a href=&quot;https://github.com/AdRoll/rebar3_hank&quot;&gt;rebar3_hank GitHub&lt;/a&gt;! Check out the &lt;a href=&quot;https://github.com/AdRoll/rebar3_hank/issues&quot;&gt;open issues&lt;/a&gt;, and feel free to open new ones!
  2824. You can also use the &lt;a href=&quot;https://github.com/AdRoll/rebar3_hank/discussions&quot;&gt;discussions page&lt;/a&gt; to get in touch with us.&lt;/p&gt;
  2825.  
  2826. &lt;hr /&gt;
  2827.  
  2828. &lt;h2 id=&quot;testing-hanks-power&quot;&gt;Testing Hank’s Power&lt;/h2&gt;
  2829.  
  2830. &lt;p&gt;To see how powerful Hank was, we decided to test it in a very large codebase.&lt;/p&gt;
  2831.  
  2832. &lt;p&gt;We decided to try with &lt;a href=&quot;https://github.com/erlang/otp&quot;&gt;Erlang/OTP&lt;/a&gt; itself. Since it’s mainly composed of libraries, we had to limit the set of rules applied to avoid some bogus results. We used this configuration:&lt;/p&gt;
  2833.  
  2834. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;hank&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  2835.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ignore&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;**/test/**&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]},&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;%% Just &quot;production&quot; code, no tests
  2836. &lt;/span&gt;    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rules&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  2837.        &lt;span class=&quot;n&quot;&gt;unused_ignored_function_params&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  2838.        &lt;span class=&quot;n&quot;&gt;unused_hrls&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  2839.        &lt;span class=&quot;n&quot;&gt;unused_macros&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  2840.        &lt;span class=&quot;n&quot;&gt;unused_record_fields&lt;/span&gt;
  2841.    &lt;span class=&quot;p&quot;&gt;]}&lt;/span&gt;
  2842. &lt;span class=&quot;p&quot;&gt;]}.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  2843.  
  2844. &lt;p&gt;We expected to find a large number of warnings, but nothing as large as what we actually found. Hank found &lt;strong&gt;more than 4000 pieces&lt;/strong&gt; of dead code in OTP’s &lt;strong&gt;production code&lt;/strong&gt; (i.e., we didn’t check the tests).&lt;/p&gt;
  2845.  
  2846. &lt;p&gt;Admittedly, not all of them should be removed, but to give you a taste of what Hank found, check out the following warnings…&lt;/p&gt;
  2847.  
  2848. &lt;h3 id=&quot;unused-fields-in-records&quot;&gt;Unused Fields in Records&lt;/h3&gt;
  2849.  
  2850. &lt;p&gt;Hank found 130 unused fields in records, like &lt;a href=&quot;https://github.com/erlang/otp/blob/6378a0c825db64df91e01ee39e3a268f4ba050b7/lib/syntax_tools/src/erl_tidy.erl#L954&quot;&gt;this one in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;erl_tidy&lt;/code&gt;&lt;/a&gt; or &lt;a href=&quot;https://github.com/erlang/otp/blob/6378a0c825db64df91e01ee39e3a268f4ba050b7/lib/kernel/src/logger_server.erl#L45&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;remote_logger&lt;/code&gt; here&lt;/a&gt;.&lt;/p&gt;
  2851.  
  2852. &lt;h3 id=&quot;unused-macros&quot;&gt;Unused Macros&lt;/h3&gt;
  2853.  
  2854. &lt;p&gt;Hank found more than 1000 unused macros in OTP, most of them in large modules of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;megaco&lt;/code&gt; application and others like &lt;a href=&quot;https://github.com/erlang/otp/blob/6378a0c825db64df91e01ee39e3a268f4ba050b7/lib/xmerl/src/xmerl_uri.erl#L349&quot;&gt;this one in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;xmerl_uri&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
  2855.  
  2856. &lt;h3 id=&quot;unused-parameters&quot;&gt;Unused Parameters&lt;/h3&gt;
  2857.  
  2858. &lt;p&gt;Hank also found more than 2000 functions with unused params. Some of them are not actually errors, like &lt;a href=&quot;https://github.com/erlang/otp/blob/6378a0c825db64df91e01ee39e3a268f4ba050b7/erts/preloaded/src/erlang.erl#L370-L371&quot;&gt;this one&lt;/a&gt; that’s masking a NIF function (&lt;a href=&quot;https://github.com/AdRoll/rebar3_hank/issues/65&quot;&gt;which will be fixed soon&lt;/a&gt;). But others are worth checking, like &lt;a href=&quot;https://github.com/erlang/otp/blob/6378a0c825db64df91e01ee39e3a268f4ba050b7/lib/inets/src/http_lib/http_uri.erl#L257-L266&quot;&gt;this non-exported function&lt;/a&gt; that never uses its first argument.&lt;/p&gt;
  2859. </description>
  2860.    </item>
  2861.    
  2862.    
  2863.    
  2864.    <item>
  2865.      <title>Chasing a Performance Regression with Erlang/OTP 22</title>
  2866.      <link>https://tech.nextroll.com/blog/dev/2020/11/03/chasing-a-perf-regression-erlang.html</link>
  2867.      <pubDate>Tue, 03 Nov 2020 00:00:00 -0800</pubDate>
  2868.      <author></author>
  2869.      <guid isPermaLink="false">https://tech.nextroll.com/blog/dev/2020/11/03/chasing-a-perf-regression-erlang</guid>
  2870.      <description>&lt;p&gt;Updating the underlying systems that our service depends on (be they the operating system, VMs, core libraries, databases, or other components) is a regular part of our systems’ lifecycle. In this post, we’ll discuss how we found a performance regression when updating to a newer Erlang/OTP release, the steps we took to investigate it, and how we worked around the specific issue at hand.&lt;/p&gt;
  2871.  
  2872. &lt;h2 id=&quot;otp-update-cadence&quot;&gt;OTP Update Cadence&lt;/h2&gt;
  2873.  
  2874. &lt;p&gt;For a big part of our core tech running on Erlang, years ago we settled on a policy of running on the second-to-last OTP release. The reasoning behind this was to avoid being subject to bugs or performance issues that might still exist on the freshest release. The idea was also to wait a bit until new features settle down before rushing to adopt them. It is not set in stone, but it is a rule we try to respect.&lt;/p&gt;
  2875.  
  2876. &lt;p&gt;Following this workflow, we first tried to migrate from OTP21 to OTP22 some months ago, as OTP23 was released. But much to our dismay, this wasn’t as smooth as we would have liked.&lt;/p&gt;
  2877.  
  2878. &lt;h2 id=&quot;performance-drop&quot;&gt;Performance Drop&lt;/h2&gt;
  2879. &lt;p&gt;As part of our release process, we ran the new release in parallel with the previous one and looked for any anomaly or performance degradation that could impact us. In this case, it was immediately noticeable that we were hitting some problems when running on OTP22, as the service’s production throughput was lower. &lt;em&gt;Noticeably&lt;/em&gt; lower: our system was running at ~0.75x of its previous throughput.&lt;/p&gt;
  2880.  
  2881. &lt;center&gt;
  2882.    &lt;img alt=&quot;madvise usage&quot; src=&quot;/images/post_images/otp22-throughput.png&quot; /&gt;&lt;br /&gt;
  2883.    &lt;i&gt;Throughput when switching to OTP22 (yellow) vs. throughput on our production system running OTP21 (blue) and a control group (purple)&lt;/i&gt;
  2884.    &lt;p&gt;&lt;/p&gt;
  2885. &lt;/center&gt;
  2886.  
  2887. &lt;p&gt;The first thing we tried was to look at the &lt;a href=&quot;http://erlang.org/download/otp_src_22.0.readme&quot;&gt;release notes&lt;/a&gt;. Was there any change that could have this kind of unexpected impact on our system? We were looking for things like…&lt;/p&gt;
  2888.  
  2889. &lt;ul&gt;
  2890.  &lt;li&gt;Changes to BEAM startup parameters and default values&lt;/li&gt;
  2891.  &lt;li&gt;Changes to socket/networking implementation&lt;/li&gt;
  2892.  &lt;li&gt;Changes related to literal terms (we have tons of data that is preprocessed and stored in &lt;a href=&quot;https://erlang.org/doc/man/persistent_term.html&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;persistent_term&lt;/code&gt;&lt;/a&gt; at startup time)&lt;/li&gt;
  2893. &lt;/ul&gt;
  2894.  
  2895. &lt;p&gt;Initially, nothing stood out as suspicious.&lt;/p&gt;
  2896.  
  2897. &lt;h2 id=&quot;the-plot-thickens&quot;&gt;The Plot Thickens&lt;/h2&gt;
  2898.  
  2899. &lt;p&gt;Then we looked at what the system was actually doing, and it was evident that it was spending much more time on system-cpu than before:&lt;/p&gt;
  2900.  
  2901. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;ec2-user@ip-172-26-77-44 ~]&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;mpstat &lt;span class=&quot;nt&quot;&gt;-P&lt;/span&gt; ALL 5
  2902. Linux 4.14.177-107.254.amzn1.x86_64 &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;ip-172-26-77-44&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;   06/08/2020  _x86_64_    &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;16 CPU&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
  2903. 07:44:51 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
  2904. 07:44:56 PM  all   62.28    0.00   16.24    0.00    0.00    0.61    0.41    0.00   20.46
  2905. 07:44:56 PM    0   21.22    0.00    4.69    0.00    0.00    0.00    0.61    0.00   73.47
  2906. 07:44:56 PM    1   62.37    0.00   17.71    0.00    0.00    4.63    0.40    0.00   14.89
  2907. 07:44:56 PM    2   65.09    0.00   17.25    0.00    0.00    0.00    0.41    0.00   17.25
  2908. 07:44:56 PM    3   65.58    0.00   16.50    0.00    0.00    0.00    0.41    0.00   17.52
  2909. 07:44:56 PM    4   66.94    0.00   15.70    0.00    0.00    0.00    0.41    0.00   16.94
  2910. 07:44:56 PM    5   66.67    0.00   16.36    0.00    0.00    0.00    0.41    0.00   16.56
  2911. 07:44:56 PM    6   65.79    0.00   16.70    0.00    0.00    0.00    0.40    0.00   17.10
  2912. 07:44:56 PM    7   68.78    0.00   14.69    0.00    0.00    0.00    0.20    0.00   16.33
  2913. 07:44:56 PM    8   64.62    0.00   15.95    0.00    0.00    0.00    0.61    0.00   18.81
  2914. 07:44:56 PM    9   67.69    0.00   14.93    0.00    0.00    0.00    0.41    0.00   16.97
  2915. 07:44:56 PM   10   53.54    0.00   24.65    0.00    0.00    4.85    0.40    0.00   16.57
  2916. 07:44:56 PM   11   61.89    0.00   19.47    0.00    0.00    0.00    0.41    0.00   18.24
  2917. 07:44:56 PM   12   62.68    0.00   19.38    0.00    0.00    0.00    0.62    0.00   17.32
  2918. 07:44:56 PM   13   64.21    0.00   17.79    0.00    0.00    0.00    0.61    0.00   17.38
  2919. 07:44:56 PM   14   74.49    0.00   10.93    0.00    0.00    0.00    0.40    0.00   14.17
  2920. 07:44:56 PM   15   64.24    0.00   17.67    0.00    0.00    0.00    0.42    0.00   17.67&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  2921.  
  2922. &lt;p&gt;It’s clear from the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;%sys&lt;/code&gt; column that more than 15% of CPU time, averaged across all cores, is spent in kernel tasks (core 0 is irrelevant to our discussion here since our Erlang system doesn’t use it; it’s reserved for other short-lived time-sensitive tasks). The system also had a high % of idle time, so it was clear that it was waiting for &lt;em&gt;something&lt;/em&gt;.&lt;/p&gt;
  2923.  
  2924. &lt;p&gt;Since kernel tasks were up, we compared vmstat output between the two systems. The number of interrupts and context switches was almost double on OTP22…&lt;/p&gt;
  2925.  
  2926. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;ec2-user@ip-172-26-77-44 ~]&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;vmstat 5
  2927. procs &lt;span class=&quot;nt&quot;&gt;-----------memory----------&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;---swap--&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-----io----&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;--system--&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-----cpu-----&lt;/span&gt;
  2928. r  b   swpd   free   buff  cache   si   so    bi    bo   &lt;span class=&quot;k&quot;&gt;in   &lt;/span&gt;cs us sy &lt;span class=&quot;nb&quot;&gt;id &lt;/span&gt;wa st
  2929. 24  0      0 24724364  63472 1242540    0    0    12    57 1466  981 59  9 31  0  1
  2930. 19  0      0 24773600  63480 1244352    0    0     0    98 271077 62547 74 12 13  0  0
  2931. 15  0      0 24930996  63488 1246924    0    0     0   359 248553 61832 74 12 13  0  0
  2932. 15  0      0 24840760  63496 1247752    0    0     0  1405 250937 48011 74 12 13  0  0
  2933. 13  0      0 24858236  63504 1249876    0    0     0   664 247081 48183 75 12 12  0  0
  2934. 17  0      0 24855240  63512 1251484    0    0     0    35 252037 47864 76 12 12  0  0
  2935. 19  0      0 24911248  63520 1253176    0    0     0    18 243274 48386 75 12 13  0  0
  2936. 14  0      0 24803668  63528 1253780    0    0     0    22 236502 47642 76 11 12  0  0&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  2937.  
  2938. &lt;p&gt;…than on OTP21:&lt;/p&gt;
  2939.  
  2940. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;ec2-user@ip-172-26-77-50 ~]&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;vmstat 5
  2941. procs &lt;span class=&quot;nt&quot;&gt;-----------memory----------&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;---swap--&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-----io----&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;--system--&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-----cpu-----&lt;/span&gt;
  2942. r  b   swpd   free   buff  cache   si   so    bi    bo   &lt;span class=&quot;k&quot;&gt;in   &lt;/span&gt;cs us sy &lt;span class=&quot;nb&quot;&gt;id &lt;/span&gt;wa st
  2943. 16  0      0 22068060  63404 1410956    0    0     9    54 1218  513 70  8 21  0  2
  2944. 17  0      0 22078024  63412 1413436    0    0     0   674 124876 27720 84  7  9  0  0
  2945. 15  0      0 22120716  63412 1414376    0    0     0   267 119251 26972 84  7 10  0  0
  2946. 6  0      0 22107168  63420 1415960    0    0     0   602 117711 25838 85  7  8  0  0
  2947. 17  0      0 22124960  63428 1418760    0    0     0  1528 121784 26970 83  7 10  0  0
  2948. 12  0      0 22149468  63436 1420380    0    0     0    18 124217 30018 82  7 11  0  0
  2949. 15  0      0 22135184  63444 1422208    0    0     0   115 121552 27490 86  7  6  0  0&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  2950.  
  2951. &lt;p&gt;Next, we looked at network activity: could it be that the pattern of socket writes/reads was different?
  2952. Even if the data sent and received was the same, were we using more IP packets than before? Were those interrupts network interrupts? The answer to all of those was “NO”. The problem was not on the IO side.&lt;/p&gt;
  2953.  
  2954. &lt;p&gt;Instead, as &lt;a href=&quot;https://linux.die.net/man/1/sar&quot;&gt;System Activity Report&lt;/a&gt; told us, we had a &lt;em&gt;huge&lt;/em&gt; number of page faults:&lt;/p&gt;
  2955.  
  2956. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;sar &lt;span class=&quot;nt&quot;&gt;-B&lt;/span&gt; 10 10
  2957. Linux 4.14.177-107.254.amzn1.x86_64 &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;ip-172-26-68-64&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;   08/03/2020  _x86_64_    &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;16 CPU&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
  2958. 09:21:55 PM  pgpgin/s pgpgout/s   fault/s  majflt/s  pgfree/s pgscank/s pgscand/s pgsteal/s    %vmeff
  2959. 09:22:05 PM      0.00    243.64 140891.62      0.00 128650.61      0.00      0.00      0.00      0.00
  2960. 09:22:15 PM      0.00    103.24 135143.62      0.00 129283.20      0.00      0.00      0.00      0.00&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  2961.  
  2962. &lt;p&gt;Note that these were &lt;em&gt;minor&lt;/em&gt; page faults. We weren’t thrashing on disk, and we weren’t short on memory.&lt;/p&gt;
  2963.  
  2964. &lt;p&gt;With this new info, we went back to look at changelogs and release notes, this time knowing a bit more about what we were looking for.
  2965. We were looking for any change in allocator settings/workings that could lead to what we were seeing.
  2966. The “Memory optimization” section at the very bottom of &lt;a href=&quot;http://blog.erlang.org/OTP-22-Highlights/&quot;&gt;this article&lt;/a&gt; caught our attention, and we thought we had a &lt;a href=&quot;https://github.com/erlang/otp/pull/2046&quot;&gt;winner&lt;/a&gt;. It seemed like a strong possibility that a change having to do with freeing memory could be related to our increasing page faults.&lt;/p&gt;
  2967.  
  2968. &lt;p&gt;At this point, we &lt;em&gt;rushed&lt;/em&gt; and made a quick and dirty patch disabling that new behavior (or that’s what we thought). We retried the test to confirm this was the cause before digging deeper into how to solve it in a more civilized way.  Adding to our frustration, this didn’t make a dent in the performance drop, so we decided this particular patch probably wasn’t our issue and continued digging elsewhere.&lt;/p&gt;
  2969.  
  2970. &lt;h2 id=&quot;git-bisect&quot;&gt;Git Bisect&lt;/h2&gt;
  2971. &lt;p&gt;Early in the investigation, one problem we had was that, although we had good, fast ways to generate new releases of our systems based on a given (fixed) OTP release, generating releases using different OTP versions was a painful and tedious process. This was simply something that our dev infra hadn’t been optimized for. We had to change several configuration and packaging files, as well as redeploy our CI pipelines.&lt;/p&gt;
  2972.  
  2973. &lt;p&gt;This hadn’t been a problem so far, as switching OTP releases is usually a once-a-year thing, but now it was slowing us down as we wanted to try different OTP versions (we tried other minor releases of the OTP22 family, for instance) as well as BEAM patches. So we went ahead and solved this issue, making it possible for us to build and test-deploy releases with different underlying OTP versions quickly and easily, even in parallel.&lt;/p&gt;
  2974.  
  2975. &lt;p&gt;Armed with this new tool, we went with the brute-force approach: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;git-bisect&lt;/code&gt; the changes between OTP21 and OTP23 until we found the commit where our problem was introduced. The bisection went quickly, and after one afternoon, we saw what looked like the smoking gun. Surprisingly, it brought us back to &lt;a href=&quot;https://github.com/erlang/otp/pull/2046&quot;&gt;OTP’s PR #2046&lt;/a&gt;. It was a bit confusing; hadn’t we ruled that out already?&lt;/p&gt;
  2976.  
  2977. &lt;p&gt;Looking at the kind of memory-related syscalls we were doing on the newer BEAM, we saw:&lt;/p&gt;
  2978.  
  2979. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;strace &lt;span class=&quot;nt&quot;&gt;-f&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-e&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;trace&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;memory &lt;span class=&quot;nt&quot;&gt;-c&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-p&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$pid&lt;/span&gt;
  2980. % &lt;span class=&quot;nb&quot;&gt;time     &lt;/span&gt;seconds  usecs/call     calls    errors sys-call
  2981. &lt;span class=&quot;nt&quot;&gt;------&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-----------&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-----------&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;---------&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;---------&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;----------------&lt;/span&gt;
  2982. 86.91    0.622994          74      8371           madvise
  2983.  7.39    0.053005          71       746           munmap
  2984.  5.70    0.040851          62       656           mmap&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  2985.  
  2986. &lt;p&gt;Comparing that with the same tracing on OTP21, the difference was that there were &lt;em&gt;no&lt;/em&gt; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;madvise&lt;/code&gt; calls on OTP21, while the number of memory allocations was about the same.&lt;/p&gt;
  2987.  
  2988. &lt;p&gt;Where were we calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;madvise&lt;/code&gt;? We used &lt;a href=&quot;http://www.brendangregg.com/perf.html&quot;&gt;perf&lt;/a&gt; to find that out.&lt;/p&gt;
  2989.  
  2990. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;perf record &lt;span class=&quot;nt&quot;&gt;-e&lt;/span&gt; syscalls:sys_enter_madvise &lt;span class=&quot;nt&quot;&gt;-a&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-g&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;--&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;sleep &lt;/span&gt;10&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  2991.  
  2992. &lt;center&gt;
  2993.    &lt;img alt=&quot;madvise usage&quot; src=&quot;/images/post_images/madvise.png&quot; /&gt;&lt;br /&gt;
  2994.    &lt;i&gt;Snapshot of madvise calls, zooming in on just one scheduler.&lt;/i&gt;
  2995.    &lt;p&gt;&lt;/p&gt;
  2996. &lt;/center&gt;
  2997.  
  2998. &lt;h2 id=&quot;the-madvise-patch&quot;&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;madvise&lt;/code&gt; Patch&lt;/h2&gt;
  2999.  
  3000. &lt;p&gt;At this point, we were pretty sure &lt;a href=&quot;https://github.com/erlang/otp/pull/2046&quot;&gt;#2046&lt;/a&gt; was related to our problem, and so we did what we should have done the first time we suspected it: We tried to understand what the patch was doing and what &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;madvise&lt;/code&gt; actually means.&lt;/p&gt;
  3001.  
  3002. &lt;p&gt;The description on the pull request states its intention pretty clearly:&lt;/p&gt;
  3003.  
  3004. &lt;blockquote&gt;
  3005.  &lt;p&gt;This PR lets the OS reclaim the physical memory associated with free blocks in pooled carriers, reducing the impact of long-lived awkward allocations. A small allocated block will still keep a huge carrier alive, but the carrier’s unused parts will now be available to the OS.&lt;/p&gt;
  3006. &lt;/blockquote&gt;
  3007.  
  3008. &lt;p&gt;We won’t explain here how the memory allocation is structured on the BEAM. For that, refer to the &lt;a href=&quot;https://erlang.org/doc/man/erts_alloc.html&quot;&gt;alloc framework docs&lt;/a&gt; and &lt;a href=&quot;https://erlang.org/doc/apps/erts/CarrierMigration.html&quot;&gt;carrier migration docs&lt;/a&gt;. There you’ll find an in-depth explanation of pooled carriers.&lt;/p&gt;
  3009.  
  3010. &lt;p&gt;&lt;a href=&quot;https://www.man7.org/linux/man-pages/man2/madvise.2.html&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;madvise&lt;/code&gt;&lt;/a&gt;, the way it’s used here, is a &lt;em&gt;hint&lt;/em&gt; to the OS about the usage of a given block of memory. In particular, the BEAM (since OTP22) calls that function to tell the OS that some memory blocks are likely not going to be needed in the near future.&lt;/p&gt;
  3011.  
  3012. &lt;p&gt;Now that we understood what we were looking at, our performance measurements made sense. Our system was returning carriers to the pool too often and then picking them again shortly after. When returning the carriers to the pool, the new behavior on OTP22 was supposedly &lt;em&gt;hinting&lt;/em&gt; to the OS that the free memory there was not expected to be needed again soon.&lt;/p&gt;
  3013.  
  3014. &lt;p&gt;The question is, why did &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;madvise (MADV_FREE)&lt;/code&gt; usage lead to page faults? It is supposed to be a &lt;em&gt;hint&lt;/em&gt; to the OS, so it should choose to unmap that memory over others in case of memory pressure. But we never were under memory pressure. As usual, &lt;a href=&quot;http://erlang.org/pipermail/erlang-questions/2020-August/099911.html&quot;&gt;Lukas was very helpful&lt;/a&gt;, suggesting that we check if &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MADV_FREE&lt;/code&gt; was actually available on our system. We run on AWS, and we were still using an older version of Amazon Linux which doesn’t support &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MADV_FREE&lt;/code&gt;. So the BEAM falls back to using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MADV_DONTNEED&lt;/code&gt;, and in that case, the OS was &lt;em&gt;very&lt;/em&gt; eager to unmap these memory blocks from our processes, even when memory wasn’t needed elsewhere.&lt;/p&gt;
  3015.  
  3016. &lt;p&gt;If you want to check if your environment supports &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MADV_FREE&lt;/code&gt;, you can grep for it in the .h headers provided by your OS. Or, in a perhaps more civilized way, you can compile and run a test program like:&lt;/p&gt;
  3017.  
  3018. &lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;    &lt;span class=&quot;cp&quot;&gt;#include&lt;/span&gt; &lt;span class=&quot;cpf&quot;&gt;&amp;lt;sys/mman.h&amp;gt;&lt;/span&gt;&lt;span class=&quot;cp&quot;&gt;
  3019. &lt;/span&gt;    &lt;span class=&quot;cp&quot;&gt;#include&lt;/span&gt; &lt;span class=&quot;cpf&quot;&gt;&amp;lt;stdio.h&amp;gt;&lt;/span&gt;&lt;span class=&quot;cp&quot;&gt;
  3020. &lt;/span&gt;    &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  3021.       &lt;span class=&quot;cp&quot;&gt;#ifdef MADV_FREE
  3022. &lt;/span&gt;           &lt;span class=&quot;n&quot;&gt;printf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Congrats, MADV_FREE flag is available&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  3023.       &lt;span class=&quot;cp&quot;&gt;#else
  3024. &lt;/span&gt;           &lt;span class=&quot;n&quot;&gt;printf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;:(&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  3025.       &lt;span class=&quot;cp&quot;&gt;#endif
  3026. &lt;/span&gt;       &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  3027.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  3028. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  3029.  
  3030. &lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; the newer Amazon Linux 2 &lt;em&gt;does&lt;/em&gt; support MADV_FREE.&lt;/p&gt;
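&lt;p&gt;The header grep mentioned above could look roughly like the sketch below. The function name &lt;code&gt;has_madv_free&lt;/code&gt; and the default directory &lt;code&gt;/usr/include&lt;/code&gt; are assumptions made for illustration; headers may live elsewhere on your system. Like the C program, it only tells you whether the flag is defined in the installed headers.&lt;/p&gt;

```shell
#!/bin/sh
# Hypothetical helper: succeeds if any header under the given directory
# defines MADV_FREE. The default of /usr/include is an assumption.
has_madv_free() {
    grep -rqsE "define[[:space:]]+MADV_FREE" "${1:-/usr/include}"
}

if has_madv_free "$@"; then
    echo "MADV_FREE is defined in the headers"
else
    echo "MADV_FREE was not found in the headers"
fi
```

&lt;p&gt;Pass a different directory as the first argument to search another header location.&lt;/p&gt;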
  3031.  
  3032. &lt;h2 id=&quot;how-we-solved-it&quot;&gt;How we solved it&lt;/h2&gt;
  3033.  
  3034. &lt;p&gt;Why do we have such a repeated pattern of leaving and picking carriers constantly?
  3035. This is an interesting question, and it is, in fact, the root of the problem. It means our allocation settings weren’t appropriate. The settings had not changed in years, while the system had evolved and grown quite a bit during that time. This &lt;em&gt;misuse&lt;/em&gt; of carrier pools was present when running on OTP21 as well, of course. It was just less costly there.&lt;/p&gt;
  3036.  
  3037. &lt;p&gt;The solution was to improve those settings, which is much easier said than done :) (if in doubt, a good starting point is to use &lt;a href=&quot;https://erlang.org/doc/man/erts_alloc_config.html&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;erts_alloc_config&lt;/code&gt;&lt;/a&gt; to get an initial config and iterate from there).&lt;/p&gt;
  3038.  
  3039. &lt;p&gt;We did several iterations adjusting our settings and improved our memory handling and performance quite a bit, but couldn’t completely eliminate the overhead as carriers were still abandoned too often. This is something we need to investigate further. Carriers aren’t supposed to be pushed in and out of the pool at such a pace. As a temporary workaround that would let us update to OTP22, we &lt;em&gt;disabled&lt;/em&gt; carrier migration entirely (this is done by setting the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;+M&amp;lt;S&amp;gt;acul&lt;/code&gt; system flag to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0&lt;/code&gt;, replacing &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;lt;S&amp;gt;&lt;/code&gt; with the identifier of each allocator you want to tweak).&lt;/p&gt;
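&lt;p&gt;As a rough illustration of that workaround, the flags might look like this in a &lt;code&gt;vm.args&lt;/code&gt; file (they can also be passed on the &lt;code&gt;erl&lt;/code&gt; command line). The allocators shown, &lt;code&gt;B&lt;/code&gt; for binary_alloc and &lt;code&gt;H&lt;/code&gt; for eheap_alloc, are picked only as an example; which allocators to change depends on your own system.&lt;/p&gt;

```text
## Illustrative only: set the abandon carrier utilization limit to 0,
## which disables carrier abandonment for the chosen allocators.
## B = binary_alloc, H = eheap_alloc (example choices, not a recommendation).
+MBacul 0
+MHacul 0
```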
  3040.  
  3041. &lt;h2 id=&quot;a-note-about-the-process&quot;&gt;A note about the process&lt;/h2&gt;
  3042.  
  3043. &lt;p&gt;As mentioned earlier, we did consider the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;madvise&lt;/code&gt; change as a possible cause of our issue when we started to investigate it. So why did we rule it out when we were initially suspicious about it?&lt;/p&gt;
  3044.  
  3045. &lt;p&gt;As it turns out, that &lt;em&gt;quick-and-dirty little patch&lt;/em&gt; to disable it discussed above was &lt;em&gt;wrong&lt;/em&gt;, and it didn’t really disable the thing. This was a regrettable incident and revealed two errors on our side, both caused by rushing to advance the investigation without taking time to ask &lt;em&gt;why&lt;/em&gt; things were the way they were:&lt;/p&gt;
  3046.  
  3047. &lt;ul&gt;
  3048.  &lt;li&gt;We rushed to do a quick patch to test the hypothesis without fully understanding the code at hand. We disabled &lt;em&gt;one&lt;/em&gt; code path calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;madvise&lt;/code&gt; but missed another.&lt;/li&gt;
  3049.  &lt;li&gt;When the hypothesis failed, we again rushed to move on to other work. We didn’t take the time to analyze &lt;em&gt;why&lt;/em&gt; our theory was wrong. We didn’t question the experiment’s validity, which would have led us to realize our patch was wrong early on.&lt;/li&gt;
  3050. &lt;/ul&gt;
  3051.  
  3052. &lt;p&gt;There is a broader lesson here: When investigating any complex problem, it’s easy to get overwhelmed by the number of moving parts to consider, but it is crucial not to rush to premature conclusions. You should take your time to understand the problem at hand and how each step moves you a bit closer to the resolution. Knowing what is &lt;em&gt;not&lt;/em&gt; the cause of the problem is good progress!&lt;/p&gt;
  3053. </description>
  3054.    </item>
  3055.    
  3056.    
  3057.    
  3058.    <item>
  3059.      <title>How NextRoll is Supporting Our Rollers During the COVID-19 Pandemic</title>
  3060.      <link>https://tech.nextroll.com/blog/culture/2020/07/28/supporting-rollers-during-covid.html</link>
  3061.      <pubDate>Tue, 28 Jul 2020 00:00:00 -0700</pubDate>
  3062.      <author></author>
  3063.      <guid isPermaLink="false">https://tech.nextroll.com/blog/culture/2020/07/28/supporting-rollers-during-covid</guid>
  3064.      <description>&lt;p&gt;Our Rollers (employees) are the most essential “ingredient” to our success and culture. This pandemic is like nothing any of us have been through before. So, as we navigate this time with our Rollers, we want to ensure they are being supported… but we can’t read their minds! The best way to understand their needs and concerns is to ask them via a survey, interpret the results, and then take action. That’s just what we did and will continue to do.&lt;/p&gt;
  3065.  
  3066. &lt;h1 id=&quot;ask-the-right-questions&quot;&gt;Ask the right questions&lt;/h1&gt;
  3067.  
  3068. &lt;p&gt;It’s important to ask the right questions to get Rollers’ sentiment and feedback. Since 2013, we have used &lt;a href=&quot;https://www.cultureamp.com/&quot;&gt;Culture Amp&lt;/a&gt;, an employee engagement platform, for almost all of our internal surveys. They have a library of research-backed surveys designed by organizational psychologists and data scientists. Think onboarding, exit, engagement, and diversity &amp;amp; inclusion. Recently, they have added surveys to support our current challenging environment, such as a COVID-19 response survey and return to the workplace survey. We use their off-the-shelf survey templates as a starting point and tailor them to include questions we want to ask. Our recent sentiment survey focused on how to best support Rollers through the changing environment, their sentiment on the future of remote work, and their return to workplace readiness.&lt;/p&gt;
  3069.  
  3070. &lt;p&gt;We always include a few open-ended questions too. They are important because they lead to longer, more insightful responses and not just one-word answers. Open-ended questions are great to use to gain understanding of the sentiments of employees as they tend to give more context and information on their thoughts and concerns.&lt;/p&gt;
  3071.  
  3072. &lt;h1 id=&quot;understand-the-results&quot;&gt;Understand the results&lt;/h1&gt;
  3073. &lt;p&gt;For our COVID-19 Sentiment Survey, we had a participation rate of 83% with almost 1,800 comments to read. Culture Amp’s platform helps to easily review and navigate the quantitative data. We were able to drill down by specific demographics like location and department. This was especially helpful when reviewing the return to workplace readiness questions. We found our US offices had a 50% or higher sentiment of not wanting to return to the office until a vaccine or therapeutic becomes available.&lt;/p&gt;
  3074.  
  3075. &lt;p&gt;It’s important to understand that in reviewing results, the data and bubbled up themes are from a point in time (and we know things will change yet again). Therefore we took action on the most immediate short-term areas, and will spend more time gathering insights externally and internally on the longer-term policies and programs. We care a great deal about the employee experience for all of our Rollers and will continue to bring employee sentiment into the equation.&lt;/p&gt;
  3076.  
  3077. &lt;h1 id=&quot;themes-from-roller-feedback&quot;&gt;Themes from Roller Feedback&lt;/h1&gt;
  3078. &lt;p&gt;This is a summary of the Themes we gathered from our survey:&lt;/p&gt;
  3079.  
  3080. &lt;ul&gt;
  3081.  &lt;li&gt;Continued employee support by:
  3082.    &lt;ul&gt;
  3083.      &lt;li&gt;Sending timely communications about company updates&lt;/li&gt;
  3084.      &lt;li&gt;Leaders being visible/accessible&lt;/li&gt;
  3085.    &lt;/ul&gt;
  3086.  &lt;/li&gt;
  3087.  &lt;li&gt;Opportunities to enhance support to employees by:
  3088.    &lt;ul&gt;
  3089.      &lt;li&gt;Providing a work from home stipend&lt;/li&gt;
  3090.      &lt;li&gt;Helping manage/provide resources for the increased cognitive load due to this challenging time, and overall mental health and wellness&lt;/li&gt;
  3091.    &lt;/ul&gt;
  3092.  &lt;/li&gt;
  3093.  &lt;li&gt;Return to workplace insights for consideration:
  3094.    &lt;ul&gt;
  3095.      &lt;li&gt;International locations have larger percentages of employees who are feeling more ready to return to the workplace in 2020 than other locations&lt;/li&gt;
  3096.      &lt;li&gt;All US offices have 50% or higher sentiment of not returning to the workplace until a vaccine/therapeutic is available&lt;/li&gt;
  3097.      &lt;li&gt;Do not pressure employees to return&lt;/li&gt;
  3098.    &lt;/ul&gt;
  3099.  &lt;/li&gt;
  3100. &lt;/ul&gt;
  3101.  
  3102. &lt;h1 id=&quot;take-action&quot;&gt;Take action&lt;/h1&gt;
  3103. &lt;p&gt;Based on the survey results and the themes that bubbled up, we were able to take action on the following areas right away:&lt;/p&gt;
  3104.  
  3105. &lt;ul&gt;
  3106.  &lt;li&gt;&lt;strong&gt;Communication&lt;/strong&gt;: Live stream was the most popular communication method being used to share company updates. Based on Rollers’ feedback, we will continue with our weekly Town Hall meetings. We are also committed to providing regular updates through email, video, and Slack as we navigate through these challenging times.&lt;/li&gt;
  3107.  &lt;li&gt;&lt;strong&gt;Work From Home (“WFH”) Stipend&lt;/strong&gt;: Rollers will receive a monthly stipend to offset WFH expenses through the end of 2020.&lt;/li&gt;
  3108.  &lt;li&gt;&lt;strong&gt;Summer Recharge Days&lt;/strong&gt;: We heard from many Rollers that they are experiencing an increased cognitive load as they navigate extended WFH conditions, health and safety concerns, and feel they need further time off to recover and/or avoid burnout, and boost productivity. Given the uniqueness of this time, we decided to provide one Friday off during the months of June - September.&lt;/li&gt;
  3109. &lt;/ul&gt;
  3110.  
  3111. &lt;p&gt;These are just some of the immediate actions we were able to take. The more feedback we collect, the more perspectives we will receive to help us better understand the varied experiences Rollers are having, and come up with inclusive solutions during this time. It’s a top priority for us to help everyone navigate this situation with the information, resources, tools, and flexibility that we’re able to offer.&lt;/p&gt;
  3112.  
  3113. &lt;p&gt;Surveys are the fastest way to see where our current resources and programs are effective, and where we might need to lean in more. It’s important that we support our Rollers through these unprecedented times, so we will continue to survey them and ask for continuous feedback.&lt;/p&gt;
  3114.  
  3115. </description>
  3116.    </item>
  3117.    
  3118.    
  3119.    
  3120.    <item>
  3121.      <title>Coordinated Cost Savings</title>
  3122.      <link>https://tech.nextroll.com/blog/costs/2020/07/01/coordinated-cost-savings.html</link>
  3123.      <pubDate>Wed, 01 Jul 2020 00:00:00 -0700</pubDate>
  3124.      <author></author>
  3125.      <guid isPermaLink="false">https://tech.nextroll.com/blog/costs/2020/07/01/coordinated-cost-savings</guid>
  3126.      <description>
  3127. &lt;p&gt;Reducing costs on infrastructure can often feel like a chore when done in isolation. In this post, we’ll
  3128. discuss the coordinated process that NextRoll is using to make the effort feel like a full-team
  3129. project with the satisfaction of having a job well done.&lt;/p&gt;
  3130.  
  3131. &lt;p&gt;Costs have a nasty habit of accumulating over time, usually because of a wide variety of decisions
  3132. made by disparate teams that were made in the past and never revisited. It’s an ancient problem.
  3133. Aristotle writes:&lt;/p&gt;
  3134.  
  3135. &lt;blockquote&gt;
  3136.  &lt;p&gt;For that which is common to the greatest number has the least care bestowed upon it. Every one thinks chiefly
  3137. of his own, hardly at all of the common interest; and only when he is himself concerned as an individual.
  3138. For besides other considerations, everybody is more inclined to neglect the duty which he expects another
  3139. to fulfill…&lt;/p&gt;
  3140.  
  3141.  &lt;p&gt;–Aristotle&lt;/p&gt;
  3142. &lt;/blockquote&gt;
  3143.  
  3144. &lt;p&gt;Of course, the difficulty is that many small problems demand many small solutions. As such, significantly
  3145. reducing your costs can seem like an exceptionally burdensome task.&lt;/p&gt;
  3146.  
  3147. &lt;p&gt;Here are our tips if you’re looking to save a chunk out of your budget.&lt;/p&gt;
  3148.  
  3149. &lt;h4 id=&quot;1-local-vs-global-optimization&quot;&gt;1. Local vs. Global Optimization&lt;/h4&gt;
  3150.  
  3151. &lt;p&gt;Often, teams are asked to track costs on their own and apply some common sense rules to keep
  3152. them down. This works ok in practice, but it is far from optimal. One of the main issues is that
  3153. it’s very easy for a team to look at a relatively small cost-saving opportunity and push it down to
  3154. the bottom of their priority list. I mean, in the grand scheme of things, does $1000 per month
  3155. seem like &lt;em&gt;that&lt;/em&gt; big of a deal?&lt;/p&gt;
  3156.  
  3157. &lt;center&gt;
  3158. &lt;img alt=&quot;Metaphor for a non-right-sized instance.&quot; src=&quot;/images/post_images/leak.jpg&quot; /&gt;&lt;br /&gt;
  3159. &lt;i&gt;Metaphor for a non-right-sized instance.&lt;/i&gt;
  3160. &lt;/center&gt;
  3161. &lt;p&gt;&lt;br /&gt;&lt;/p&gt;
  3162.  
  3163. &lt;p&gt;Well, probably every engineering team is making similar calls. NextRoll Engineering has dozens
  3164. of teams as a moderately-sized company. Forty-two teams each saving $1000 every month results
  3165. in an extra half million dollars a year in the company coffers. And the thing is, $1000 per month is
  3166. not an onerous amount to be saving when you deal with data at the scale that we do; more on that
  3167. later, but really, this represents a lower bound on what can be saved.&lt;/p&gt;
  3168.  
  3169. &lt;p&gt;Beyond this, operating costs affect the bottom line of any company. Ideally, this money would
  3170. be &lt;em&gt;invested&lt;/em&gt;, so that returns can be made: hiring more people, spinning up new products, etc.
  3171. that can meaningfully drive the top line revenue. Instead, in the best case, these costs are just
  3172. wasted, but in the worst case, actually compound over time because not investing the money elsewhere
  3173. is an opportunity cost. Also, as the company scales, these wasted costs will only increase.
  3174. Getting your unit economics in line is critical.&lt;/p&gt;
  3175.  
  3176. &lt;p&gt;To this end, we recommend creating a task force, with representatives across the engineering
  3177. organization, that is responsible for talking through all the cost-saving initiatives. This
  3178. team doesn’t need to meet frequently, but tracking and accountability are key. It also helps
  3179. everyone get on the same page about how systems are built and interoperate, revealing more
  3180. opportunities.&lt;/p&gt;
  3181.  
  3182. &lt;h4 id=&quot;2-solicit-ideas&quot;&gt;2. Solicit Ideas&lt;/h4&gt;
  3183.  
  3184. &lt;p&gt;No one knows the code better than the engineers who work on their systems every day. Some in
  3185. management may be tempted to set arbitrary cost-saving goals, but this is extremely
  3186. counterproductive. It’s also not enough to demand some vague sense of “cost savings” as a
  3187. project. In reality, each individual cost-saving initiative is unique in its own right and has
  3188. its own set of tradeoffs that the engineers are best equipped to evaluate.&lt;/p&gt;
  3189.  
  3190. &lt;p&gt;To this end, ask every engineering team to submit their ideas. Each team can meet amongst
  3191. themselves, spend an hour or two just brainstorming, and write everything down. At
  3192. NextRoll, we take an attitude of every idea being valid in this stage, no matter how difficult
  3193. to accomplish. In some ways, this is similar to a design sprint: the ideas must be on the table
  3194. before further discussions around feasibility and prioritization can happen. There are side
  3195. benefits as well: small ideas can be enhanced or compounded; new product ideas can emerge; wild
  3196. ideas can inspire other, more feasible ones.&lt;/p&gt;
  3197.  
  3198. &lt;p&gt;One of NextRoll’s core values is, “Do more with less.” We try to consistently uphold this,
  3199. but these cost-saving efforts provide a good opportunity for us to reflect on that value. It’s
  3200. important during brainstorming that engineers ask themselves not, “What do we want,” or “What
  3201. is convenient,” but rather, “What do we &lt;em&gt;need&lt;/em&gt;?”&lt;/p&gt;
  3202.  
  3203. &lt;p&gt;Everyone is contributing. This creates a shared sense of ownership and engineers
  3204. become invested in their own ideas. Finger-pointing at the teams with the largest cost centers
  3205. melts away, because, hey, we already know the little stuff adds up and we’re all in this
  3206. together. Every team is trying to reduce their footprint to only what’s required.
  3207. Every team lists out some potential objectives and it’s clear how these relate to
  3208. the overarching theme and goal of the project.&lt;/p&gt;
  3209.  
  3210. &lt;p&gt;Once all of the ideas are generated, they need to be submitted to the aforementioned task force.
  3211. It’s up to the task force to meet and decide which tasks will be tackled. Hopefully the task
  3212. force has enough representation across engineering that they know enough context to discuss the
  3213. proposals intelligently. But just in case, these ideas should contain documentation that includes:&lt;/p&gt;
  3214.  
  3215. &lt;ul&gt;
  3216.  &lt;li&gt;The estimated amount of savings, in a standard unit (such as dollars/month)&lt;/li&gt;
  3217.  &lt;li&gt;The estimated amount of engineering effort, in a standard unit (such as eng-weeks)&lt;/li&gt;
  3218.  &lt;li&gt;A high-level description of the proposal&lt;/li&gt;
  3219.  &lt;li&gt;A list of any open questions that would need to be resolved before tackling the project&lt;/li&gt;
  3220.  &lt;li&gt;A list of impacted teams&lt;/li&gt;
  3221. &lt;/ul&gt;
  3222.  
  3223. &lt;p&gt;Estimates do not need to be super accurate. The idea is to get a rough cut of what is possible.
  3224. If you have enough ideas submitted, it is likely that some estimates will be over and some will
  3225. be under. Statistically, over all submissions, you’ll probably hit a reasonable total
  3226. estimate of how much money is on the table for the taking.&lt;/p&gt;
  3227.  
  3228. &lt;p&gt;From here, the task force can compile the list of items in a centralized location, assess the
  3229. total opportunity on each team and across the organization, and be prepared for their first meeting.&lt;/p&gt;
  3230.  
  3231. &lt;p&gt;Another good tip for soliciting ideas is to set a reasonable minimum on how much a submission
  3232. can save. This will keep brainstorming meetings focused and reduce the number of items the
  3233. task force has to prioritize. What that minimum should be is going to depend on any number of
  3234. variables, such as revenue, current total costs, number of teams, desired savings goals, etc.
  3235. Pick something that fits your company’s situation.&lt;/p&gt;
  3236.  
  3237. &lt;h4 id=&quot;3-task-force-prioritization&quot;&gt;3. Task Force Prioritization&lt;/h4&gt;
  3238.  
  3239. &lt;p&gt;The task force finally needs to meet and discuss &lt;em&gt;all&lt;/em&gt; items. This will likely be a
  3240. time-consuming meeting, but it will be valuable. We’ll talk about side benefits later, but
  3241. for now, let’s talk about how to prioritize.&lt;/p&gt;
  3242.  
  3243. &lt;p&gt;One thing that’s critical to keep in mind is that individual teams still have product roadmaps
  3244. to execute on. Cost reduction doesn’t happen in a vacuum. Representatives should be able to
  3245. explain to the group what the current workload for each individual team is, and how much
  3246. engineering time is available to spare. But time isn’t free, so product management needs to
  3247. be on board with the overall effort, and recognize that some time is going to be spent on
  3248. getting unit costs down.&lt;/p&gt;
  3249.  
  3250. &lt;p&gt;At this point, it’s common sense to tackle the biggest bang-for-the-buck items. As mentioned
  3251. earlier, NextRoll is fundamentally a data company. A really easy place for costs to build up
  3252. is S3. My background is data science; my personal inclination is to keep plenty of data
  3253. around so I can look at historical models, their performance, and so on. But how often do we
  3254. really need to go back, say, 90 days? Is 45 enough? These are the questions we were asking
  3255. ourselves on my teams, and we realized there were a lot of five-minute tasks to set some TTLs
  3256. that slashed our budgets pretty significantly.&lt;/p&gt;
  3257.  
  3258. &lt;center&gt;
  3259. &lt;img alt=&quot;Metaphor for a typical S3 bucket.&quot; src=&quot;/images/post_images/junkyard.jpg&quot; /&gt;&lt;br /&gt;
  3260. &lt;i&gt;Metaphor for a typical S3 bucket.&lt;/i&gt;
  3261. &lt;/center&gt;
  3262. &lt;p&gt;&lt;br /&gt;&lt;/p&gt;
  3263.  
  3264. &lt;p&gt;Right-sizing instances is another big opportunity; as products develop over time, the needs
  3265. for their servers can shift. It’s also possible to identify less-used features in the product
  3266. that are supported by data and infrastructure and to just delete and shut off those services
  3267. entirely. For high-volume systems, even a 10% improvement in efficiency can meaningfully impact
  3268. your budget.&lt;/p&gt;
  3269.  
  3270. &lt;p&gt;A common question at NextRoll was, “What about AWS reservations? If we’ve already paid for
  3271. servers, why should we shut them down?” Sure, but this is something that should largely
  3272. factor into prioritization. The reservations buy you time to reduce the number of servers
  3273. you need. Try to avoid falling for a sunk cost fallacy: once the money is spent it’s gone.
  3274. Those reservations could be used for other services that need them more, or could be used to
  3275. spin up new product features. Also, reservations will eventually expire. Even though you’ve
  3276. bought some time, try to get ahead of the problem and don’t lose focus on the end goal. Chances
  3277. are, that expiration date will sneak up on you while you’re in a new product development
  3278. cycle and it’s hard to find the time. The point is to strike while the iron’s hot.&lt;/p&gt;
  3279.  
  3280. &lt;p&gt;Once the task force has balanced every team’s roadmaps and selected a set of appropriate items,
  3281. they organize their selections into a centralized document that all teams can reference. The
  3282. selected items get pushed down to the individual teams for implementation with some expectation
  3283. on when the items will be completed. At NextRoll, we’ve opted for a quarterly deadline. Like any
  3284. engineering project, deadlines can slip, but what’s important is to be on top of the tracking.&lt;/p&gt;
  3285.  
  3286. &lt;p&gt;Some of the side benefits of this task force meeting are that participants gain a better sense
  3287. of how systems outside their own responsibility work and interact with each other. They also
  3288. get a better sense of other teams’ product roadmaps. If they’re technical leaders for their
  3289. respective teams, they can bring that knowledge back. On top of all this, the task force
  3290. meeting is yet another opportunity to brainstorm and share notes. As the team goes through all
  3291. the items, new opportunities and solutions may present themselves that can be discussed with
  3292. the implementing teams.&lt;/p&gt;
  3293.  
  3294. &lt;h4 id=&quot;4-tracking&quot;&gt;4. Tracking&lt;/h4&gt;
  3295.  
  3296. &lt;p&gt;I’m a big fan of this quote, probably apocryphally attributed to Karl Pearson, and likely
  3297. expressed by others before him:&lt;/p&gt;
  3298.  
  3299. &lt;blockquote&gt;
  3300.  &lt;p&gt;“That which is measured improves. That which is measured and reported improves exponentially.”&lt;/p&gt;
  3301.  
  3302.  &lt;p&gt;–Karl Pearson&lt;/p&gt;
  3303. &lt;/blockquote&gt;
  3304.  
  3305. &lt;p&gt;After each team implements an item, they should spend some time to measure the actualized
  3306. savings and report back. The task force can log these data as they come in, and track how
  3307. well estimates line up with reality. This feedback loop will hopefully lead to better
  3308. estimates in the future.&lt;/p&gt;
  3309.  
  3310. &lt;p&gt;The most obvious benefit here is that these numbers can help FP&amp;amp;A teams with their jobs.
  3311. Understanding the overall impact on the company justifies the effort spent achieving the
  3312. results. Measuring the return on investment will also help in the future when it’s time
  3313. to tackle further initiatives.&lt;/p&gt;
  3314.  
  3315. &lt;p&gt;And, of course, measuring the results provides engineers with some actual satisfaction by
  3316. seeing the results of their labors. Since we approached this collectively, every engineer
  3317. contributed to the outcome. The flip side of “death by a thousand cuts” is that the
  3318. thousand bandages add up to something big and meaningful. The task force should take some
  3319. time to communicate the summarized results to not only the whole engineering team, but the
  3320. whole company. Whatever medium works for you is fine; at NextRoll we report on our results
  3321. at our All Hands meetings.&lt;/p&gt;
  3322.  
  3323. &lt;p&gt;At NextRoll, the task force meets monthly to quickly go over updates. We chat about progress
  3324. and blockers, and we celebrate successes. It’s not particularly different from a scrum of scrums.
  3325. With the distributed contributions of every engineer, the task force meeting is not particularly
  3326. onerous because its members have been provided with the information they need ahead of time.&lt;/p&gt;
  3327.  
  3328. &lt;h4 id=&quot;5-lather-rinse-repeat&quot;&gt;5. Lather, Rinse, Repeat&lt;/h4&gt;
  3329.  
  3330. &lt;p&gt;Once the original deadline has passed, the task force reviews unfinished items, reevaluates
  3331. items that weren’t prioritized in the first round, and continues to prioritize and push
  3332. things down to the teams for the next period. This ensures that harder, but perhaps bigger,
  3333. opportunities aren’t forgotten and dropped. Recall that the first pass was about the
  3334. best bang-for-the-buck projects. Things will get harder, but with an appropriate balance,
  3335. the payoffs should be higher in the next round.&lt;/p&gt;
  3336.  
  3337. &lt;p&gt;Momentum shouldn’t be lost. Cost saving should not be a one-off initiative because, as
  3338. stated, costs tend to accumulate over time. In some sense, it’s easier to
  3339. tackle things as they come up; on the other hand, it’s not easy to change culture.
  3340. Explicitly maintaining cost reduction as a priority sends a strong message that the value
  3341. of “Do more with less” is not empty.&lt;/p&gt;
  3342.  
  3343. &lt;p&gt;One thing that I’ve noticed at NextRoll that’s been great is that individual engineers
  3344. continue to ping me about new opportunities they’ve managed to uncover, even outside of
  3345. brainstorming sessions. We add these items to our overall tracking and make sure these
  3346. ideas don’t go unnoticed or unprioritized. Sometimes I even get pinged with cost reduction
  3347. measures that have already been completed, and I make sure even those things are
  3348. documented and tracked to recognize what these engineers have accomplished.&lt;/p&gt;
  3349.  
  3350. &lt;p&gt;And this is really the attitude that we’re looking for. Yes, it’s a decent amount of
  3351. upfront effort to document and coordinate everything. But once the process is in place
  3352. to solicit creative ideas from all sides, the tracking mechanisms are in place to measure
  3353. accomplishments, and recognition is provided, cost saving becomes less of a chore and,
  3354. hopefully, more of a rewarding process unto itself.&lt;/p&gt;
  3355.  
  3356. &lt;h4 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h4&gt;
  3357.  
  3358. &lt;p&gt;I hope this provides a glimpse into our cost saving process at NextRoll. Obviously, not
  3359. all of these processes are necessarily applicable to your own company or situation. Adapt
  3360. as you see fit, but whatever process you decide to implement, I strongly recommend
  3361. adhering to the following principles:&lt;/p&gt;
  3362.  
  3363. &lt;ul&gt;
  3364.  &lt;li&gt;Small stuff adds up. Think globally, act locally, as they say.&lt;/li&gt;
  3365.  &lt;li&gt;Put the idea generation in the hands of the engineers. They know what they’re doing.&lt;/li&gt;
  3366.  &lt;li&gt;Crazy ideas are fun and can spur further innovation.&lt;/li&gt;
  3367.  &lt;li&gt;Centralize coordination and prioritization.&lt;/li&gt;
  3368.  &lt;li&gt;Estimate. Measure. Report.&lt;/li&gt;
  3369.  &lt;li&gt;Recognize.&lt;/li&gt;
  3370.  &lt;li&gt;Continue.&lt;/li&gt;
  3371. &lt;/ul&gt;
  3372.  
  3373. &lt;p&gt;Thanks for reading!&lt;/p&gt;
  3374. </description>
  3375.    </item>
  3376.    
  3377.    
  3378.    
  3379.    <item>
  3380.      <title>How to save a lot of money with a Baker in the spot market?</title>
  3381.      <link>https://tech.nextroll.com/blog/dev/2020/06/16/how-we-saved-with-spot-market.html</link>
  3382.      <pubDate>Tue, 16 Jun 2020 00:00:00 -0700</pubDate>
  3383.      <author></author>
  3384.      <guid isPermaLink="false">https://tech.nextroll.com/blog/dev/2020/06/16/how-we-saved-with-spot-market</guid>
  3385.      <description>&lt;p&gt;OMFG (one-minute file generator) is a service that reads log lines from many Kinesis streams and saves them to S3, creating one compressed file every minute.
  3386.  Its first version was written long ago using &lt;a href=&quot;https://storm.apache.org/&quot;&gt;Apache Storm&lt;/a&gt;, but it became expensive and sometimes unstable over time.
  3387.  Our target was to reduce the service cost significantly and create a 0-downtime infrastructure.&lt;/p&gt;
  3388.  
  3389. &lt;p&gt;We decided to completely replace the software using Baker, our in-house data processing software (written in Go). We deployed it across all AWS regions by using spot instances only.&lt;/p&gt;
  3390.  
  3391. &lt;p&gt;The final result exceeded expectations, &lt;strong&gt;as we reduced the overall service cost by over 85%.&lt;/strong&gt;&lt;/p&gt;
  3392.  
  3393. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;15 minute read&lt;/code&gt;&lt;/p&gt;
  3394.  
  3395. &lt;hr /&gt;
  3396.  
  3397. &lt;h2 id=&quot;omfg-overview-and-improvement-ideas&quot;&gt;OMFG, overview and improvement ideas&lt;/h2&gt;
  3398.  
  3399. &lt;center&gt;
  3400.    &lt;img alt=&quot;The old OMFG version&quot; src=&quot;/images/post_images/omfg_old_infra.jpg&quot; /&gt;&lt;br /&gt;
  3401.    &lt;i&gt;The old OMFG version in the NextRoll infrastructure&lt;/i&gt;
  3402.    &lt;p&gt;&lt;/p&gt;
  3403. &lt;/center&gt;
  3404.  
  3405. &lt;p&gt;Bidders and AdServers produce many terabytes of log lines every day. Those logs are available for consumers through hundreds of Kinesis streams spread across 5 AWS regions. A portion of those log lines (around 8TB/day) is useful to other services and should be quickly available on S3. This is where OMFG enters the scene.&lt;/p&gt;
  3406.  
  3407. &lt;p&gt;Every minute OMFG takes what it has read from Kinesis during the previous minute and compresses it into a single &lt;a href=&quot;https://facebook.github.io/zstd/&quot;&gt;zstandard&lt;/a&gt; file per stream, which is immediately uploaded to S3.&lt;/p&gt;
  3408.  
  3409. &lt;p&gt;The &lt;a href=&quot;https://storm.apache.org/&quot;&gt;Storm&lt;/a&gt; version used 6 &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;m5ad.4xlarge&lt;/code&gt; EC2 reserved instances plus 2 additional &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;c5d.large&lt;/code&gt; machines. We expected that by using Baker, which is heavily optimized to process log lines, we would immediately be able to shut down some machines. We also decided to move to EC2 instances from the spot market, as &lt;a href=&quot;https://tech.nextroll.com/blog/dev/ops/2018/10/15/x-marks-the-spot.html&quot;&gt;previous experience&lt;/a&gt; told us it’s an excellent way to save money.&lt;/p&gt;
  3410.  
  3411. &lt;p&gt;A possible alternative to Baker could have been &lt;a href=&quot;https://aws.amazon.com/kinesis/data-firehose/&quot;&gt;Kinesis Data Firehose&lt;/a&gt;, whose job is exactly to “reliably load streaming data into data lakes, data stores and analytics tools”, including S3, our target. Unfortunately, it doesn’t support zstd compression and, moreover, its costs would have been too high: the data streams consumed by OMFG produce more than 1 petabyte of log lines each month, which means more than $30k/month (not including S3 and data transfer costs). That’s way more than what we ended up spending with our custom solution.&lt;/p&gt;
  3412.  
  3413. &lt;p&gt;Analyzing the service’s detailed costs we realized that only ~25% was spent on the instances, while the data transfer across regions cost around 70%.&lt;br /&gt;
  3414. That was somewhat surprising and it seemed clear that it should be the first area to focus on to make savings.&lt;/p&gt;
  3415.  
  3416. &lt;h2 id=&quot;baker-our-in-house-data-processing-software&quot;&gt;Baker, our in-house data processing software&lt;/h2&gt;
  3417.  
  3418. &lt;p&gt;Back in 2016 we decided to try to write a data processor pipeline using &lt;a href=&quot;https://golang.org/&quot;&gt;Go&lt;/a&gt;. The results were so good that the “experiment”, now called &lt;strong&gt;Baker&lt;/strong&gt;, has been highly successful here at NextRoll over the past years and it’s currently used by many teams to process data at petabyte scale.&lt;/p&gt;
  3419.  
  3420. &lt;p&gt;Baker works on a “topology”, the description of a pipeline including one input component (where to read the loglines from), zero or more filters, which are functions that can modify loglines (changing fields, dropping them, or even “splitting” them into multiple loglines), and one output component, defining where to send the resulting loglines to (and which columns), also optionally adding a sharding rule. Input/output components support the main AWS services (Kinesis, DynamoDB, S3, SQS) as well as relational databases and in-house products (like the open-sourced &lt;a href=&quot;http://traildb.io/&quot;&gt;TrailDB&lt;/a&gt;).&lt;br /&gt;
  3421. Baker is fully parallel and maximizes resource usage for both CPU-bound and I/O-bound pipelines. On an AWS &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;c3.4xlarge&lt;/code&gt; instance, it can run a simple pipeline (with little processing of log lines) achieving 30k writes per second to DynamoDB in 4 regions, using ~1GB of RAM in total, and only a portion of the available CPU (so with room for scaling even further if required).&lt;/p&gt;
  3422.  
  3423. &lt;p&gt;It is very stable and so optimized for memory and CPU consumption that replacing an existing pipeline with Baker often means greatly reducing the number of servers needed.&lt;br /&gt;
  3424. &lt;strong&gt;This has also been the case for OMFG:&lt;/strong&gt; with Baker we only use 6 &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;c5.xlarge&lt;/code&gt; instances. Compared to the old OMFG version it means 24 vCPUs and 48 GB of RAM instead of 100 vCPUs &lt;strong&gt;(-76%)&lt;/strong&gt; and 392 GB of RAM &lt;strong&gt;(-87%)&lt;/strong&gt;.&lt;/p&gt;
  3425.  
  3426. &lt;p&gt;We are actively working to release Baker as open source software. It will be publicly available in the coming months, so &lt;a href=&quot;https://github.com/adroll/&quot;&gt;stay tuned&lt;/a&gt;!&lt;/p&gt;
  3427.  
  3428. &lt;h2 id=&quot;aws-data-transfer-between-regions&quot;&gt;AWS data transfer between regions&lt;/h2&gt;
  3429.  
  3430. &lt;center&gt;
  3431.    &lt;img alt=&quot;AWS data centers are spread all around the globe&quot; src=&quot;/images/post_images/omfg_aws_regions.png&quot; /&gt;&lt;br /&gt;
  3432.    &lt;i&gt;AWS data centers are spread all around the globe&lt;/i&gt;
  3433.    &lt;p&gt;&lt;/p&gt;
  3434. &lt;/center&gt;
  3435.  
  3436. &lt;p&gt;As I mentioned previously, the data transfer to our main region &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;us-west-2&lt;/code&gt; from the other regions represented by far the most significant costs of OMFG.&lt;/p&gt;
  3437.  
  3438. &lt;p&gt;The original OMFG service was deployed entirely in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;us-west-2&lt;/code&gt; while the log lines were produced (and thus read by OMFG) in all the 5 regions used by NextRoll. The log lines, produced inside Kinesis, are uncompressed, and transferring all those bytes to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;us-west-2&lt;/code&gt; was very expensive. In fact, AWS charges for outbound or inter-region bandwidth, while it doesn’t charge for data transfer within the same availability zone or data transfer between some services in the same region (like Kinesis to EC2).&lt;br /&gt;
  3439. For this reason we decided to create a split infrastructure, deploying OMFG on multiple ECS clusters across the globe (one per region), reading uncompressed data from the Kinesis in the same region (with no costs) and transferring the files to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;us-west-2&lt;/code&gt; after compression.&lt;/p&gt;
  3440.  
  3441. &lt;p&gt;&lt;em&gt;Fun fact about &lt;a href=&quot;https://aws.amazon.com/ec2/pricing/on-demand/#Data_Transfer&quot;&gt;EC2 data transfer prices&lt;/a&gt; is that, while outbound bandwidth costs decrease when the volume increases, data transfer prices between AWS regions are constant. The latter is generally lower than the former, but it can be unexpectedly higher in some cases. For example, data transfer from EC2 to the internet in Tokyo costs $0.084 per GB when transferring more than 150TB/month, while transferring data to another region costs $0.09 per GB. In Singapore, it’s even more: $0.09 per GB between regions and $0.08 per GB to the internet.&lt;/em&gt;&lt;/p&gt;
  3442.  
  3443. &lt;center&gt;
  3444.    &lt;img alt=&quot;The new OMFG&quot; src=&quot;/images/post_images/omfg_new_infra.jpg&quot; /&gt;&lt;br /&gt;
  3445.    &lt;i&gt;The new OMFG, deployed in multiple regions&lt;/i&gt;
  3446.    &lt;p&gt;&lt;/p&gt;
  3447. &lt;/center&gt;
  3448.  
  3449. &lt;p&gt;The &lt;strong&gt;compression ratio&lt;/strong&gt; was the key to this idea of distributed deployment. With too low a ratio, the data transfer savings would not have offset the extra cost of a distributed deployment compared to a centralized solution.&lt;br /&gt;
  3450. The default zstd compression level (3) is good enough (~9x compression ratio), but after some testing we found out that it’s easy to obtain better results using the &lt;a href=&quot;https://github.com/facebook/zstd/releases/tag/v1.3.2&quot;&gt;long range mode&lt;/a&gt;.&lt;br /&gt;
  3451. In the chart below we can see an example of compression ratios on one of our log line types. Even without the long range mode (red line) it’s possible to improve on the default compression ratio by using higher levels, but the cost in lost speed is not negligible. Level 10 seemed a good compromise for us because it is almost as fast as lower levels but gives a good jump in the ratio (&amp;gt;10x).&lt;/p&gt;
  3452.  
  3453. &lt;p&gt;Maintaining the same compression level (10) but using the long range mode (enabled with the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--long&lt;/code&gt; option) with a window log of 27 (the default, corresponding to a 128MB window size), we see a big jump to a &amp;gt;13x ratio (blue line) with a small decrease in speed. Larger window sizes increase the compression ratio slightly further, but they have a significant drawback: the default zstd decompression parameters allow for using values of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--long&lt;/code&gt; option up to 27 without having to add any new option to the decompressor. Beyond that value, the decompressor must be given an equivalent parameter, which would require updating every reader; that was incompatible with our use case.&lt;/p&gt;
  3454.  
  3455. &lt;center&gt;
  3456.    &lt;img alt=&quot;Zstd --long option values&quot; src=&quot;/images/post_images/omfg_zstd_long_option.png&quot; /&gt;&lt;br /&gt;
  3457.    &lt;i&gt;How the zstandard long range mode impacts the compression ratio&lt;/i&gt;
  3458.    &lt;p&gt;&lt;/p&gt;
  3459. &lt;/center&gt;
  3460.  
  3461. &lt;p&gt;So, ultimately, we used compression level 10 with the default long range mode window size.&lt;br /&gt;
  3462. The resulting compression ratios range &lt;strong&gt;from 4x to 30x on average&lt;/strong&gt;, with peaks at 100x (performance depends on log line type and size).&lt;/p&gt;
  3463.  
  3464. &lt;h2 id=&quot;ecs-and-spot-market-a-not-so-smooth-relationship&quot;&gt;ECS and spot market, a not-so-smooth relationship&lt;/h2&gt;
  3465.  
  3466. &lt;center&gt;
  3467.    &lt;img alt=&quot;ECS and spot market, a not-so-smooth relationship&quot; src=&quot;/images/post_images/omfg_troubles.png&quot; /&gt;&lt;br /&gt;
  3468. &lt;/center&gt;
  3469.  
  3470. &lt;p&gt;The new OMFG runs on &lt;a href=&quot;https://aws.amazon.com/ecs/&quot;&gt;Elastic Container Service&lt;/a&gt; clusters as &lt;a href=&quot;https://www.docker.com/&quot;&gt;Docker containers&lt;/a&gt;. We decided to use &lt;a href=&quot;https://aws.amazon.com/ec2/spot/&quot;&gt;EC2 spot instances&lt;/a&gt; to provide computational power because they can ensure significant savings (&amp;gt;65% on the instance types we use). But by its very nature, a spot instance &lt;a href=&quot;https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-interruptions.html&quot;&gt;can be stopped&lt;/a&gt; by Amazon at any time, and the availability of the chosen instance types can change based on the fluctuation in the market demand.&lt;/p&gt;
  3471.  
  3472. &lt;p&gt;To address the change in availability we started introducing a long list of possible instance types, decreasing the chance of running out of instances.&lt;br /&gt;
  3473. &lt;strong&gt;But spot interruptions will still happen&lt;/strong&gt;, and OMFG must provide a near-real-time service, with a maximum delay in uploading the files to S3 of ~5 minutes. In small clusters such as ours (1-2 instances per region), an interruption has a vast impact, stopping 40-100% of the running Docker containers. The Auto Scaling Group (ASG) we use to manage the active instances in the cluster requires some time to perceive the interruption and replace the missing machine, &lt;strong&gt;with an average downtime of 8 minutes&lt;/strong&gt;, too high for us.&lt;/p&gt;
  3474.  
  3475. &lt;p&gt;When AWS decides to stop a spot instance, it grants 120 seconds before the operating system enters the shutdown phase. During these 2 minutes, it is necessary to perform a graceful shutdown, saving all data currently being processed by Baker. Otherwise, it will be lost (there is, in any case, a way to recover data if using non-“volatile” disks, a subject that we wrote about &lt;a href=&quot;https://tech.nextroll.com/blog/dev/ops/2018/10/15/x-marks-the-spot.html#spot-savior-a-log-data-recovery-system&quot;&gt;in a past article&lt;/a&gt;).&lt;br /&gt;
  3476. Since an official solution doesn’t exist to address the spot instance eviction, we had to craft our own.&lt;/p&gt;
  3477.  
  3478. &lt;h4 id=&quot;solution&quot;&gt;Solution&lt;/h4&gt;
  3479.  
  3480. &lt;p&gt;The solution we have found is to check whether a server eviction is happening and, where necessary, to perform a controlled shutdown of the containers. AWS provides notification of imminent shutdown through the &lt;a href=&quot;https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html&quot;&gt;Instance Metadata Service&lt;/a&gt; at &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;http://169.254.169.254/latest/meta-data/spot/instance-action&lt;/code&gt;, which returns a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;200&lt;/code&gt; HTTP code only when the instance is in the interruption phase. That’s the signal that triggers our cleanup.&lt;/p&gt;
  3481.  
  3482. &lt;p&gt;This excerpt from the script that we use is executed as a daemon by the operating system (we forge our AMIs with &lt;a href=&quot;https://www.packer.io/&quot;&gt;Packer&lt;/a&gt;). It checks the spot interruption endpoint every 5 seconds and, when the interruption is intercepted, it increases the number of desired instances in the ASG, triggering an immediate launch of a new server. Moreover, it sends a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SIGINT&lt;/code&gt; signal to the Docker containers, which is captured by Baker to perform a graceful shutdown, stop the Kinesis consumers and upload all the files before quitting (this generally requires less than 1 minute):&lt;/p&gt;
  3483.  
  3484. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;c&quot;&gt;#!/bin/bash&lt;/span&gt;
  3485.  
  3486. &lt;span class=&quot;nv&quot;&gt;REGION&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;curl &lt;span class=&quot;nt&quot;&gt;-s&lt;/span&gt; http://169.254.169.254/latest/dynamic/instance-identity/document|grep region|awk &lt;span class=&quot;nt&quot;&gt;-F&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\&quot;&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;{print $4}&apos;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;
  3487. &lt;span class=&quot;nv&quot;&gt;ASG&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&amp;lt;asg_name&amp;gt;&quot;&lt;/span&gt;
  3488.  
  3489. &lt;span class=&quot;k&quot;&gt;while &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do
  3490.    &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;CODE&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;curl &lt;span class=&quot;nt&quot;&gt;-LI&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-o&lt;/span&gt; /dev/null &lt;span class=&quot;nt&quot;&gt;-w&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;%{http_code}\n&apos;&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-s&lt;/span&gt; http://169.254.169.254/latest/meta-data/spot/instance-action&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;
  3491.    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;CODE&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;200&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;then&lt;/span&gt;
  3492.        &lt;span class=&quot;c&quot;&gt;# ...&lt;/span&gt;
  3493.  
  3494.        &lt;span class=&quot;c&quot;&gt;# Kill gracefully all OMFG containers&lt;/span&gt;
  3495.        &lt;span class=&quot;k&quot;&gt;for &lt;/span&gt;i &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;docker ps &lt;span class=&quot;nt&quot;&gt;--filter&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;name=omfg&apos;&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-q&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;
  3496.        &lt;span class=&quot;k&quot;&gt;do
  3497.            &lt;/span&gt;docker &lt;span class=&quot;nb&quot;&gt;kill&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;--signal&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;SIGINT &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
  3498.        &lt;span class=&quot;k&quot;&gt;done&lt;/span&gt;
  3499.  
  3500.        &lt;span class=&quot;c&quot;&gt;# Immediately increase the desired instances in the ASG&lt;/span&gt;
  3501.        &lt;span class=&quot;nv&quot;&gt;CURRENT_DESIRED&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;aws autoscaling describe-auto-scaling-groups &lt;span class=&quot;nt&quot;&gt;--region&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;REGION&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;--auto-scaling-group-names&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;ASG&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt; | &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  3502.            jq &lt;span class=&quot;s1&quot;&gt;&apos;.AutoScalingGroups | .[0] | .DesiredCapacity&apos;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;
  3503.        &lt;span class=&quot;nv&quot;&gt;NEW_DESIRED&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;$((&lt;/span&gt;CURRENT_DESIRED &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;))&lt;/span&gt;
  3504.        aws autoscaling set-desired-capacity &lt;span class=&quot;nt&quot;&gt;--region&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;REGION&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;--auto-scaling-group-name&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;ASG&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;--desired-capacity&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;NEW_DESIRED&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
  3505.  
  3506.        &lt;span class=&quot;c&quot;&gt;# ...&lt;/span&gt;
  3507.    &lt;span class=&quot;k&quot;&gt;fi
  3508.    &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;sleep &lt;/span&gt;5
  3509. &lt;span class=&quot;k&quot;&gt;done&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  3510.  
  3511. &lt;p&gt;Together with this script, the ECS agent in the instance is configured to set the instance status to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DRAINING&lt;/code&gt; when it receives a spot interruption notice (automatically done by ECS when &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ECS_ENABLE_SPOT_INSTANCE_DRAINING=true&lt;/code&gt; is set in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/etc/ecs/ecs.config&lt;/code&gt;, see the &lt;a href=&quot;https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-agent-config.html&quot;&gt;official documentation&lt;/a&gt; for details).&lt;/p&gt;
  3512.  
  3513. &lt;p&gt;Setting the instance status to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DRAINING&lt;/code&gt; means that stopped containers are not restarted on that server; instead, they are launched on the new instance once it is ready.&lt;/p&gt;
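&lt;p&gt;As a minimal sketch of the agent setting mentioned above, the flag can be appended to the agent configuration file, for example from the instance user data before the agent starts. The helper name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;enable_spot_draining&lt;/code&gt; and its optional path argument are illustrative assumptions, not part of the original setup:&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Hypothetical helper: appends the spot-draining flag to the ECS agent
# config file, but only if it is not already there.
# The path is a parameter only so the snippet can be tried on a scratch file.
enable_spot_draining() {
    local config=${1:-/etc/ecs/ecs.config}
    local flag="ECS_ENABLE_SPOT_INSTANCE_DRAINING=true"
    if ! grep -qxF "$flag" "$config" 2>/dev/null; then
        echo "$flag" >> "$config"
    fi
}
```

&lt;p&gt;Calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;enable_spot_draining&lt;/code&gt; with no argument targets &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/etc/ecs/ecs.config&lt;/code&gt;; running it twice leaves a single copy of the flag.&lt;/p&gt;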
  3514.  
  3515. &lt;center&gt;
  3516.    &lt;img alt=&quot;OMFG spot interception&quot; src=&quot;/images/post_images/omfg_spot_intercept.jpg&quot; /&gt;&lt;br /&gt;
  3517.    &lt;i&gt;How OMFG manages a spot interruption&lt;/i&gt;
  3518.    &lt;p&gt;&lt;/p&gt;
  3519. &lt;/center&gt;
  3520.  
  3521. &lt;p&gt;&lt;strong&gt;This solution permits a final downtime of less than a minute.&lt;/strong&gt;&lt;/p&gt;
  3522.  
  3523. &lt;p&gt;There’s only one small downside, shown in the picture above: adding a desired instance to the ASG during the eviction means that another instance is launched when the interrupted machine is stopped, leaving the cluster with one server more than the previous desired number of instances.&lt;br /&gt;
  3524. This additional machine will only live for a few minutes, though, until the ASG shuts it down because its usage is too low, bringing the ASG configuration back to its default values.&lt;br /&gt;
  3525. Not perfect, but still a good, working solution.&lt;/p&gt;
  3526.  
  3527. &lt;h2 id=&quot;containers-autobalance&quot;&gt;Containers autobalance&lt;/h2&gt;
  3528.  
  3529. &lt;p&gt;ECS lacks one important feature: balancing the Docker containers across the available EC2 instances.&lt;br /&gt;
  3530. When a cluster is freshly launched, the containers are created in a balanced manner (the same number on each server). When an instance interruption occurs, however, the tasks are stopped and started again on the machines that are still available (this happens in seconds, before the new one starts). The new machine will therefore be empty, or at most host the tasks left over from that operation (those that couldn’t find sufficient free resources on the existing servers).&lt;/p&gt;
  3531.  
  3532. &lt;p&gt;As ECS does not provide tools to balance the tasks in the cluster automatically, we need to do it ourselves.&lt;/p&gt;
  3533.  
  3534. &lt;h4 id=&quot;solution-1&quot;&gt;Solution&lt;/h4&gt;
  3535.  
  3536. &lt;p&gt;The solution we found is to run a scheduled task every 15 minutes that checks the number of tasks on each &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RUNNING&lt;/code&gt; instance and rebalances them if the difference is excessive.&lt;br /&gt;
  3537. The job counts the number of containers per server and moves them if the cluster is unbalanced (count difference &amp;gt;35%).&lt;/p&gt;
  3538.  
  3539. &lt;center&gt;
  3540.    &lt;img alt=&quot;OMFG autobalancing&quot; src=&quot;/images/post_images/omfg_balancing.jpg&quot; /&gt;&lt;br /&gt;
  3541.    &lt;i&gt;How the autobalancing scheduled task works&lt;/i&gt;
  3542.    &lt;p&gt;&lt;/p&gt;
  3543. &lt;/center&gt;
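&lt;p&gt;The imbalance check can be sketched as follows. The post does not say how the 35% difference is computed, so this sketch assumes it is the gap between the busiest and the idlest instance, relative to the busiest one; the function name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;is_unbalanced&lt;/code&gt; and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;THRESHOLD_PCT&lt;/code&gt; variable are hypothetical:&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Hypothetical helper: takes one task count per instance and succeeds
# (exit status 0) when the gap between the busiest and the idlest instance
# is larger than THRESHOLD_PCT percent of the busiest count.
THRESHOLD_PCT=35

is_unbalanced() {
    # No instances at all: nothing to balance.
    if [ "$#" -eq 0 ]; then return 1; fi
    local min=$1 max=$1 count
    for count in "$@"; do
        if [ "$count" -lt "$min" ]; then min=$count; fi
        if [ "$count" -gt "$max" ]; then max=$count; fi
    done
    # No running tasks anywhere: nothing to move.
    if [ "$max" -eq 0 ]; then return 1; fi
    # Integer comparison, equivalent to (max - min) / max > THRESHOLD_PCT / 100.
    [ $(( (max - min) * 100 )) -gt $(( THRESHOLD_PCT * max )) ]
}
```

&lt;p&gt;For example, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;is_unbalanced 10 10 2&lt;/code&gt; succeeds, while &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;is_unbalanced 10 9 8&lt;/code&gt; does not. In a real job the counts could come from the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;runningTasksCount&lt;/code&gt; field returned by &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;aws ecs describe-container-instances&lt;/code&gt;.&lt;/p&gt;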
  3544.  
  3545. &lt;h2 id=&quot;replacing-all-instances-without-downtime&quot;&gt;Replacing all instances without downtime&lt;/h2&gt;
  3546.  
  3547. &lt;p&gt;Sooner or later it will be necessary to replace the instances of the cluster with new versions, typically because the AMI has been updated.&lt;br /&gt;
  3548. In itself, the operation is not particularly complicated: you essentially only need to turn off the machines manually, and they will be replaced with new units that use the updated version of the AMI.&lt;br /&gt;
  3549. &lt;strong&gt;The problem is that this operation leads to a downtime of the service&lt;/strong&gt;, lasting from the moment the instances are turned off until, after the restart, all the tasks have been automatically launched again (&lt;strong&gt;more than 8 minutes&lt;/strong&gt; in our tests). We then found a solution that allows us to launch all the tasks on the new machines before turning off the tasks on the old ones and, then, the old instances themselves, taking advantage of how autoscaling groups work.&lt;/p&gt;
  3550.  
  3551. &lt;h4 id=&quot;solution-2&quot;&gt;Solution&lt;/h4&gt;
  3552.  
  3553. &lt;p&gt;The trick is to implement the following steps, which can be performed manually as well as by an automatic script that makes calls to AWS APIs (the latter is our solution):&lt;/p&gt;
  3554.  
  3555. &lt;ol&gt;
  3556.  &lt;li&gt;Set all instances of the cluster to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DRAINING&lt;/code&gt; state. Nothing will happen as there are no other &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ACTIVE&lt;/code&gt; instances on which to move tasks.&lt;/li&gt;
  3557.  &lt;li&gt;Double the number of desired instances in the autoscaling group. The new instances are launched in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ACTIVE&lt;/code&gt; state and all the tasks are started on them thanks to the previous step. As they are launched, the corresponding tasks are turned off on the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DRAINING&lt;/code&gt; units. Baker, as mentioned above, performs a graceful shutdown, saving all its current work.&lt;/li&gt;
  3558.  &lt;li&gt;Wait until all the tasks have been turned off on old instances, which means they are active on new ones.&lt;/li&gt;
  3559.  &lt;li&gt;Return the desired number of instances in the ASG to its previous value. The ASG shuts down the excess machines, bringing the situation back to its initial state.&lt;/li&gt;
  3560. &lt;/ol&gt;
  3561.  
  3562. &lt;p&gt;This last point deserves a closer look. Turning off the excess machines does not in itself guarantee that the ASG will not pick the machines that were just launched.&lt;br /&gt;
  3563. The key is to set the termination policy of the autoscaling group to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;OldestInstance&lt;/code&gt;. This tells the ASG that, if a unit has to be terminated, the choice must fall on the oldest one, which is precisely what we want.&lt;/p&gt;
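&lt;p&gt;The four steps and the termination policy can be sketched with the AWS CLI roughly as follows. This is an illustration under stated assumptions, not the script used in production: the function name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;replace_all_instances&lt;/code&gt; and its arguments are hypothetical, the 10-second polling interval is arbitrary, the ASG maximum size is assumed to allow doubling the desired capacity, and a large cluster may need the drain call split into batches:&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Hypothetical sketch: replace every instance of an ECS cluster without
# downtime. Arguments: ECS cluster name, ASG name, AWS region.
replace_all_instances() {
    local cluster=$1 asg=$2 region=$3
    local old desired

    # When the ASG scales back in, terminate the oldest instances first.
    aws autoscaling update-auto-scaling-group --region "$region" \
        --auto-scaling-group-name "$asg" --termination-policies OldestInstance

    # Step 1: set every instance currently in the cluster to DRAINING.
    old=$(aws ecs list-container-instances --region "$region" \
        --cluster "$cluster" --status ACTIVE \
        --query 'containerInstanceArns[]' --output text)
    # Word splitting of the ARN list is intended here.
    aws ecs update-container-instances-state --region "$region" \
        --cluster "$cluster" --container-instances $old --status DRAINING

    # Step 2: double the desired capacity so that new ACTIVE instances start.
    desired=$(aws autoscaling describe-auto-scaling-groups --region "$region" \
        --auto-scaling-group-names "$asg" \
        --query 'AutoScalingGroups[0].DesiredCapacity' --output text)
    aws autoscaling set-desired-capacity --region "$region" \
        --auto-scaling-group-name "$asg" --desired-capacity $(( desired * 2 ))

    # Step 3: wait until no task is left running on the old instances.
    while [ "$(aws ecs describe-container-instances --region "$region" \
        --cluster "$cluster" --container-instances $old \
        --query 'sum(containerInstances[].runningTasksCount)' \
        --output text)" != "0" ]; do
        sleep 10
    done

    # Step 4: restore the previous desired capacity; the oldest (drained)
    # instances are the ones the ASG terminates.
    aws autoscaling set-desired-capacity --region "$region" \
        --auto-scaling-group-name "$asg" --desired-capacity "$desired"
}
```

&lt;p&gt;A call would look like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;replace_all_instances my-cluster my-asg us-east-1&lt;/code&gt;, where all three values are placeholders.&lt;/p&gt;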
  3564.  
  3565. &lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
  3566.  
  3567. &lt;p&gt;What is the final outcome of our work?&lt;/p&gt;
  3568.  
  3569. &lt;ul&gt;
  3570.  &lt;li&gt;We saved 85% on EC2 costs by migrating to Baker (and thus reducing the number and size of instances) and moving to the spot market.&lt;/li&gt;
  3571.  &lt;li&gt;We also saved 90% on data transfer costs by compressing all records in their origin regions before sending them to S3.&lt;/li&gt;
  3572.  &lt;li&gt;We incurred some minor additional costs on DynamoDB due to &lt;a href=&quot;https://github.com/vmware/vmware-go-kcl/&quot;&gt;how Baker works with Kinesis&lt;/a&gt;.&lt;/li&gt;
  3573. &lt;/ul&gt;
  3574.  
  3575. &lt;p&gt;&lt;strong&gt;The final saving was 85% compared to the cost of the previous version&lt;/strong&gt;, a great result.&lt;br /&gt;
  3576. We have learned a lot from this experience and we are now replicating it on other similar services.&lt;/p&gt;
  3577. </description>
  3578.    </item>
  3579.    
  3580.    
  3581.    
  3582.    <item>
  3583.      <title>How I became the first Latina remote engineer</title>
  3584.      <link>https://tech.nextroll.com/blog/careers/2020/04/21/first-latina-remote-engineer.html</link>
  3585.      <pubDate>Tue, 21 Apr 2020 00:00:00 -0700</pubDate>
  3586.      <author></author>
  3587.      <guid isPermaLink="false">https://tech.nextroll.com/blog/careers/2020/04/21/first-latina-remote-engineer</guid>
  3588.      <description>&lt;p&gt;I’ve been working remotely as a Software Engineer at NextRoll, a tech company in Silicon Valley, for 2 years. When I started the journey to find a new job, I never aimed to work in the U.S. center of innovative technology companies. I didn’t think I could be the first Latina working remotely at one of these companies either. I was simply motivated to be open to new challenges.&lt;/p&gt;
  3589.  
  3590. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;25 minute read&lt;/code&gt;&lt;/p&gt;
  3591.  
  3592. &lt;hr /&gt;
  3593.  
  3594. &lt;h2 id=&quot;what-if&quot;&gt;“What if?”&lt;/h2&gt;
  3595.  
  3596. &lt;p&gt;Since the beginning of my career, I’ve always worked for people that knew me before, or I’ve applied to positions that a friend was recommending. I never said to myself “now it’s time to get the job you want”. Since then, the perfect job to me is one where I can consider the team members as my family, too. After all, we spend hours in the office. I had the luck to have many families in my professional life.&lt;/p&gt;
  3597.  
  3598. &lt;p&gt;Once in the past, a friend told me he would start to work from home for a company in the U.S., and that story sounded pretty crazy to me. We live in Brazil, so why would someone be interested in hiring people so far away? What about the language? You have to convince interviewers you are good enough for a position while speaking another language. That wasn’t for me. At that time I had barely traveled abroad to try out the English skills I had learned in a classroom. Despite my skepticism, I felt like that was the wonderland of jobs that I would never be part of, a completely new and wonderful world. That was my first “What if?” moment.&lt;/p&gt;
  3599.  
  3600. &lt;p&gt;Years later, a second friend asked me if I would like to go to Ireland. He and his wife had applied to jobs there and she was accepted. He would finish his last interviews after moving. I answered back immediately: “Why not?”. What if I tried the same? I had been working for the same company for 8 years, the contracts were ending, the new contracts were paying lower salaries and the remaining people would have to adapt to the way of working under the new contracts. As an Agilist, I knew that I wouldn’t fit that job anymore; all the agile culture our team had built before wouldn’t be a priority in the new era. From that moment on, I decided to look for the job I really wanted.&lt;/p&gt;
  3601.  
  3602. &lt;p&gt;When we start to ask questions, things begin to happen. Combined “What if?” and “Why not?” questions are the trigger to make us overcome periods of inertia in our lives and achieve our goals. You start to change the way you think and many possibilities present themselves when you keep your mind open.&lt;/p&gt;
  3603.  
  3604. &lt;p&gt;&lt;img src=&quot;/images/post_images/many_directions.jpg&quot; alt=&quot;open_sourced_projects&quot; /&gt;&lt;/p&gt;
  3605.  
  3606. &lt;h2 id=&quot;making-things-happen&quot;&gt;Making things happen&lt;/h2&gt;
  3607.  
  3608. &lt;p&gt;After the popularization of the Web, the way people look for a job has changed a lot. I remember when I needed an internship to get my data processing certificate. My father had to drive me to a place that used to concentrate all the open internship positions in town, a wall full of papers describing these positions. I was 16 years old, and all the options I saw on that wall required men. I’ve never forgotten my father’s face when he realized that his daughter wouldn’t have a chance in that place. Anyway, I never had to look at a wall like that again.&lt;/p&gt;
  3609.  
  3610. &lt;p&gt;The motivation to keep going on this journey came from a requirement list I made based on what I wanted for my future:&lt;/p&gt;
  3611. &lt;ol&gt;
  3612.  &lt;li&gt;I wanted to have an &lt;strong&gt;international experience&lt;/strong&gt;, to work with people from different countries and to live abroad for a while.&lt;/li&gt;
  3613.  &lt;li&gt;I had always worked with &lt;strong&gt;agile methodologies&lt;/strong&gt;, so a company with an agile culture would suit me well.&lt;/li&gt;
  3614.  &lt;li&gt;I couldn’t wait for &lt;strong&gt;new challenges&lt;/strong&gt;, I wanted to try new programming languages, cloud solutions and microservices.&lt;/li&gt;
  3615.  &lt;li&gt;I wanted to &lt;strong&gt;improve my English skills&lt;/strong&gt;.&lt;/li&gt;
  3616. &lt;/ol&gt;
  3617.  
  3618. &lt;p&gt;Virtual life came to make things much easier, but more competitive too. The first step was creating my online CV. I reviewed the description of my professional experiences countless times. It’s not forbidden to peek at other online profiles to get some ideas. I’ve learned that it’s important to be specific when describing your experiences; the description cannot be too short. I used to list only some topics and technologies I had used on my CV, and that is not enough if you want to find an international opportunity. The first call I answered was from a recruiter asking me to be more detailed, and that 5-minute call helped me a lot. These are some things I learned about how to use the Web to your advantage:&lt;/p&gt;
  3619.  
  3620. &lt;blockquote&gt;
  3621.  &lt;ul&gt;
  3622.    &lt;li&gt;Emphasize the aspects that will help you get the job you want and hide those that would attract what you don’t want.&lt;/li&gt;
  3623.    &lt;li&gt;There are many tools and social networks aimed at professional life, and it’s a good practice to keep your personal life out of them. Connect with recruiters on these networks.&lt;/li&gt;
  3624.    &lt;li&gt;Post and share the articles you find interesting; it helps others understand and know you better.&lt;/li&gt;
  3625.    &lt;li&gt;Try some online courses or study to get a certification.&lt;/li&gt;
  3626.    &lt;li&gt;Keep an online portfolio. It can be a simple code you’ve done as a test, a course exercise or an open-source contribution.&lt;/li&gt;
  3627.    &lt;li&gt;Practice your algorithm skills to be prepared for practical tests, there are tools online to do that too.&lt;/li&gt;
  3628.  &lt;/ul&gt;
  3629. &lt;/blockquote&gt;
  3630.  
  3631. &lt;h2 id=&quot;dealing-with-rejection&quot;&gt;Dealing with rejection&lt;/h2&gt;
  3632.  
  3633. &lt;p&gt;In the beginning, I was focusing on getting a job in Ireland, so I was applying only to positions there. Soon after, I realized that many companies don’t hire people from outside Europe because it’s expensive. Then I started to think that I could look anywhere in Europe to expand my chances. As I had done my online work right, recruiters began to find me. That’s why it is so important to give as much information as you can in your profile. I attended some interviews, and the process helped me improve my English skills.&lt;/p&gt;
  3634.  
  3635. &lt;p&gt;After repeating the story about your experiences, you feel more confident for a while. Each time you tell the story again, it’s a chance to make it better. However, after hearing some No’s, your confidence changes. You start to feel like a fraud. Your mind begins playing tricks on you, making you doubt your competence. I had repeated the same story so many times that I wasn’t sure what was real anymore. I found out then that this phenomenon has a name: impostor syndrome. Indeed, I was feeling like an impostor. Was I expecting too much of a job? Was I prepared for that?&lt;/p&gt;
  3636.  
  3637. &lt;p&gt;While I was in those circumstances, my contract ended. I was unemployed and I got pregnant. I felt like I was 18 years old and my future was compromised. I was not willing to remove any item from my requirement list. So I changed the perspective: I could keep the international experience without living abroad. I added a fifth item: &lt;strong&gt;to work remotely&lt;/strong&gt;. I even convinced another friend, another woman in tech, that she should start to search for remote jobs. Six months later, she was working from home. I was thrilled that I could plant that “What if?” in her mind.&lt;/p&gt;
  3638.  
  3639. &lt;p&gt;My situation had changed, and I needed a break. I was full of concerns about how to become a mom, how to start a new job while pregnant, and how to work with a newborn, besides some existential crises. During this period I saw dozens of colleagues go to live abroad after a few interviews, while I had to prepare myself for the most challenging task of my life. The job search was put on standby.&lt;/p&gt;
  3640.  
  3641. &lt;p&gt;When I returned to my job search and started to attend interviews again, I had a gap of 2 years in my professional life. I had worked for 15 years in software development, but the interviewers were worried about those last 2 years. Telling them that I had been studying during that time, keeping myself up to date and trying new technologies, wasn’t enough; I had become an undesirable candidate. I even heard the question “Who will take care of your baby while you are at work?”. It sounded like it would be impossible to work again after all I had done. That requirement list for the perfect job didn’t matter anymore; I had to concentrate my efforts on getting a job.&lt;/p&gt;
  3642.  
  3643. &lt;p&gt;I still don’t have a tip for people in the same situation of having an empty lapse of time on their CVs. I’ve always told the truth and I had to live with the consequences. Fortunately, some interviewers evaluate your capacity for growth in the company. In all of the following experiences I describe in the next section, I didn’t hear the question “What were you doing in the last 2 years?”.&lt;/p&gt;
  3644.  
  3645. &lt;h2 id=&quot;point-of-seeing-the-results&quot;&gt;Point of seeing the results&lt;/h2&gt;
  3646.  
  3647. &lt;p&gt;Looking for remote jobs, I found out that most offers come from the U.S. and most of them only accept applications from U.S. citizens. I was doing well in one selection process but couldn’t go ahead because of the bureaucracy of hiring a foreigner. Each country has its own labor laws, and dealing with them can be complicated for many companies.&lt;/p&gt;
  3648.  
  3649. &lt;p&gt;I applied to a position at a company in the south of Brazil, but it would require relocation to another city. It was an amazing opportunity, and after many steps in the selection process they offered me the job. What they were offering me, in terms of company culture and technologies, totally fit my ideas for my new job, except for not being an international experience. It was really hard to say “no” to them, but after many calls and text messages, I realized that my family wasn’t prepared for that relocation. I’m grateful for that offer; it made me stop feeling like an impostor.&lt;/p&gt;
  3650.  
  3651. &lt;p&gt;In the meantime, I was referred to a position in a DevOps team. The company didn’t have an agile culture, but it was starting to put in place some agile techniques. It was a huge challenge, they needed to apply automation in all main company systems and I had the required experience to work on that. I was happy to start working again, but I felt like I had failed with my initial goals. That was a really important opportunity in many ways, I had my confidence again.&lt;/p&gt;
  3652.  
  3653. &lt;p&gt;One month after I accepted that regular job, I answered an unexpected call. People say that we don’t find the best opportunities in job lists. Usually, they find you. After some interviews with the DC team at NextRoll, they offered me a remote job. I didn’t know much about the company at that time and I didn’t know what to expect, but this was the international experience I had been pursuing for so long. I was afraid to quit my regular job, I was afraid of an unknown field, I almost said “no”, but I didn’t. On my first day in the company, I found out that it met all 5 requirements on my list: NextRoll has an amazing agile and diverse culture and it’s committed to the best solutions.&lt;/p&gt;
  3654.  
  3655. &lt;p&gt;It took me some time to realize the importance of my achievements. I was in the center of technology in the U.S. and I was the first Latina working remotely as a Software Engineer at NextRoll, a Silicon Valley company. If I wanted new challenges, I was in the right place, not because the office was in San Francisco, but because the complexity of the solutions I was facing was impressive. So far, I have had the opportunity to write code in Java, Python, JavaScript, Ruby, Golang, and Erlang. I also have to read data from SQL and NoSQL databases. The systems deal with a high volume of data in the cloud, so I have been able to explore many AWS services. All of these systems are part of a service architecture that depends on different teams to be maintained. It means that we always have to be in contact with different people in the company, in different parts of the world and from different cultures. Every day is a day to learn something new, and that’s why I’ve chosen this profession: you never get bored.&lt;/p&gt;
  3656.  
  3657. &lt;h1&gt;&lt;img src=&quot;/images/post_images/desk.jpg&quot; alt=&quot;open_sourced_projects&quot; /&gt;&lt;/h1&gt;
  3658.  
  3659. &lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
  3660.  
  3661. &lt;p&gt;I wanted to share this experience to motivate people to chase their goals and to put into words some difficulties that stood in my way. Usually, we share the successful results we had, but we forget the bad moments and hard decisions that led us there. I believe it can help someone else who is facing setbacks and needs that extra push to keep going.&lt;/p&gt;
  3662.  
  3663. &lt;p&gt;Finally, I’d like to inspire other women in tech and to remind us again of the importance of our stories.&lt;/p&gt;
  3664. </description>
  3665.    </item>
  3666.    
  3667.    
  3668.    
  3669.    <item>
  3670.      <title>Squeezing the most out of the server: Erlang Profiling</title>
  3671.      <link>https://tech.nextroll.com/blog/dev/2020/04/07/erlang-profiling.html</link>
  3672.      <pubDate>Tue, 07 Apr 2020 00:00:00 -0700</pubDate>
  3673.      <author></author>
  3674.      <guid isPermaLink="false">https://tech.nextroll.com/blog/dev/2020/04/07/erlang-profiling</guid>
  3675.      <description>&lt;p&gt;NextRoll’s &lt;a href=&quot;http://tech.nextroll.com/blog/dev/2018/01/08/quaff-that-potion-saving-millions-with-elixir-and-erlang.html#real-time-bidding&quot;&gt;real-time bidding&lt;/a&gt; (RTB) platform has been featured several times in this tech blog: we run a fleet of Erlang applications (the bidders) that typically ranges between one and two
  3676. thousand nodes. As described in &lt;a href=&quot;http://tech.nextroll.com/blog/dev/ops/2018/10/15/x-marks-the-spot.html&quot;&gt;past articles&lt;/a&gt;, an ongoing goal of the RTB
  3677. team —as well as a source of interesting technical problems— is to minimize operational costs
  3678. as much as possible.&lt;/p&gt;
  3679.  
  3680. &lt;p&gt;An obvious way to reduce costs is to make the system more efficient and this
  3681. means entering the &lt;a href=&quot;http://wiki.c2.com/?RulesOfOptimization&quot;&gt;hazardous land of software optimization&lt;/a&gt;. Even for experienced programmers,
  3682. identifying bottlenecks is a hard enough problem when using the right tools;
  3683. trying to guess what could make the code run faster will not only waste time
  3684. but is likely to introduce unnecessary complexity that can cause problems down the line. The cousin of
  3685. &lt;em&gt;premature&lt;/em&gt; optimization is &lt;em&gt;necessary&lt;/em&gt; optimization &lt;a href=&quot;http://wiki.c2.com/?ProfileBeforeOptimizing&quot;&gt;without profiling first&lt;/a&gt;.&lt;/p&gt;
  3686.  
  3687. &lt;p&gt;While Erlang is famously known for its concurrency model and fault-tolerant design,
  3688. one of its biggest strengths is the level of live inspection and tuning it offers,
  3689. often with little or no setup and runtime cost. In this article, we outline
  3690. how we leverage those features to profile our system, driving the optimizations that
  3691. can lead to cost reductions.&lt;/p&gt;
  3692.  
  3693. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;10-15 minute read&lt;/code&gt;&lt;/p&gt;
  3694.  
  3695. &lt;hr /&gt;
  3696.  
  3697. &lt;h2 id=&quot;infrastructure&quot;&gt;Infrastructure&lt;/h2&gt;
  3698. &lt;p&gt;An interesting aspect of real-time bidding is that it is fairly low-risk to test in production. Even if the new code is slow or contains errors, the bidders are architected to just send a no-bid response whenever a request can’t be fulfilled.&lt;/p&gt;
  3699.  
  3700. &lt;p&gt;Taking advantage of this, we incorporate canary deploys into our day-to-day development workflow. In the context of optimizations, this means we can quickly test our performance hypotheses by updating the code and testing with live traffic. We have dashboards that give feedback on common metrics like timeouts, errors and number of requests processed, making it obvious when a change is beneficial.&lt;/p&gt;
  3701.  
  3702. &lt;h2 id=&quot;request-level-timers&quot;&gt;Request-level timers&lt;/h2&gt;
  3703.  
  3704. &lt;p&gt;Bid request processing is the fundamental operation of the bidder application. Any improvement in the amount of time it
  3705. takes us to send a response to an ad exchange means we can process more requests per server, requiring fewer servers to handle
  3706. our traffic and ultimately saving us money.&lt;/p&gt;
  3707.  
  3708. &lt;p&gt;The work involved in a bid request can be broadly divided into a series of tasks such as payload parsing, selection of matching ads, pricing a particular ad, and response encoding.
  3709. A common practice is to periodically measure the time invested in each of those phases to make sure they don’t degrade over time and to help provide a frame of
  3710. reference to use when we look for areas of the code that are worthy of our optimization efforts.&lt;/p&gt;
  3711.  
  3712. &lt;center&gt;
  3713.    &lt;img alt=&quot;A sample of bid request timings per phase&quot; src=&quot;/images/post_images/erlang_requests.png&quot; /&gt;&lt;br /&gt;
  3714.    &lt;i&gt;A sample of bid request timings per phase.&lt;/i&gt;
  3715.    &lt;p&gt;&lt;/p&gt;
  3716. &lt;/center&gt;
  3717.  
  3718. &lt;p&gt;The most basic way of profiling consists of timing the calls to
  3719. a given piece of the code (perhaps one that was deemed suspicious by one of the methods described in the next sections):&lt;/p&gt;
  3720.  
  3721. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;c&quot;&gt;%% Evaluates Fun() and reports its evaluation time as a histogram
  3722. %% metric.
  3723. &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;spec&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;time_call&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;fun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;
  3724. &lt;span class=&quot;nf&quot;&gt;time_call&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Metric&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Fun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  3725.    &lt;span class=&quot;nv&quot;&gt;Start&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;erlang&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;monotonic_time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;microsecond&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  3726.    &lt;span class=&quot;nv&quot;&gt;Result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Fun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt;
  3727.    &lt;span class=&quot;nv&quot;&gt;End&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;erlang&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;monotonic_time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;microsecond&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  3728.    &lt;span class=&quot;nv&quot;&gt;Diff&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;End&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  3729.    &lt;span class=&quot;nf&quot;&gt;update&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;histogram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Metric&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Diff&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  3730.    &lt;span class=&quot;nv&quot;&gt;Result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  3731.  
  3732. &lt;p&gt;The above helper is used to wrap calls to the function we want to measure.
  3733. A canary deploy of the timed code to production will generate average, median and percentile metrics that we can then compare to the overall request time to identify bottlenecks.&lt;/p&gt;
  3734.  
  3735. &lt;h2 id=&quot;recon&quot;&gt;recon&lt;/h2&gt;
  3736. &lt;p&gt;Timing individual operations is a useful way to understand the specifics of a request flow, but it gives us only a limited view of the entire system. Most of the bid request phases are handled in a single process, and some of them involve idle time waiting for external systems. There are many periodic tasks and long-lived support processes in the bidders, so we can benefit from system-wide profiling that looks beyond bid request processing. This is where the Erlang toolbox comes into play.&lt;/p&gt;
  3737.  
  3738. &lt;p&gt;The first valuable resource is not a piece of software but a little book by &lt;a href=&quot;https://twitter.com/mononcqc&quot;&gt;Fred Hebert&lt;/a&gt;: &lt;a href=&quot;https://www.erlang-in-anger.com/&quot;&gt;Erlang in Anger&lt;/a&gt;. This guide is the perfect reference because it describes methods for gaining insight into production systems and optimizing them, backed by real-world experience.
  3739. The book’s companion library, &lt;a href=&quot;http://ferd.github.io/recon/&quot;&gt;recon&lt;/a&gt;, provides safer, friendlier and more productive interfaces to
  3740. Erlang’s powerful inspection tools. What follows are some simple examples mostly derived from the book.&lt;/p&gt;
  3741.  
  3742. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;c&quot;&gt;%% get node general stats
  3743. &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bidder_dev&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;@&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;recon&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;node_stats_print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  3744. &lt;span class=&quot;p&quot;&gt;{[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;process_count&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3061&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3745.  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;run_queue&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;31&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3746.  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;error_logger_queue_len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3747.  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;memory_total&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;6141339952&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3748.  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;memory_procs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2623297976&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3749.  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;memory_atoms&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;967320&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3750.  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;memory_bin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;610903880&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3751.  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;memory_ets&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2078839480&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}],&lt;/span&gt;
  3752. &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bytes_in&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;729538&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3753.  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bytes_out&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2722704&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3754.  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;gc_count&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;9475&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3755.  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;gc_words_reclaimed&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;28611695&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3756.  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reductions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;699610880&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3757.  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;scheduler_usage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,[{&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;9840756659394524&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3758.                    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;9760797698171886&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3759.                    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;9891411385108696&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3760.                    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;9817415742856814&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3761.                    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;985099143110567&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3762.                    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;9862095293650041&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3763.                    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;7&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;9759060081154223&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3764.                    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;9926624828687856&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]}]}&lt;/span&gt;
  3765.  
  3766. &lt;span class=&quot;c&quot;&gt;%% get top 3 memory consumers
  3767. &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bidder_dev&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;@&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;recon&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;proc_count&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;memory&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  3768. &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1597&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;986767172&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  3769.  &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bidder_banker&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  3770.   &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;gen_server&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;loop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;7&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}},&lt;/span&gt;
  3771.   &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;initial_call&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;proc_lib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;init_p&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}}]},&lt;/span&gt;
  3772. &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;344&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;915466692&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  3773.  &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bid_guardian&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  3774.   &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;gen_server&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;loop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;7&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}},&lt;/span&gt;
  3775.   &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;initial_call&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;proc_lib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;init_p&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}}]},&lt;/span&gt;
  3776. &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;15784&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2618&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;41052056&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  3777.  &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;prim_inet&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;recv0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}},&lt;/span&gt;
  3778.   &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;initial_call&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;proc_lib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;init_p&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}}]}]&lt;/span&gt;
  3779.  
  3780. &lt;span class=&quot;c&quot;&gt;%% get top 3 cpu consumers
  3781. &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bidder_dev&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;@&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;recon&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;proc_count&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reductions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  3782. &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;7011&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2464069696&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  3783.  &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;exometer_probe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;loop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}},&lt;/span&gt;
  3784.   &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;initial_call&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;proc_lib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;init_p&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}}]},&lt;/span&gt;
  3785. &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;6730&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2422804659&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  3786.  &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;exometer_probe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;loop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}},&lt;/span&gt;
  3787.   &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;initial_call&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;proc_lib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;init_p&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}}]},&lt;/span&gt;
  3788. &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;6740&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2402511307&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  3789.  &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;exometer_probe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;loop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}},&lt;/span&gt;
  3790.   &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;initial_call&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;proc_lib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;init_p&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}}]}]&lt;/span&gt;
  3791.  
  3792. &lt;span class=&quot;c&quot;&gt;%% get top cpu consumers during a time window
  3793. &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bidder_dev&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;@&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;recon&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;proc_window&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reductions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  3794. &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;20046&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2631&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;688443&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  3795.  &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;views_bid&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;process&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}},&lt;/span&gt;
  3796.   &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;initial_call&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;proc_lib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;init_p&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}}]},&lt;/span&gt;
  3797. &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;29341&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2629&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;547335&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  3798.  &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;timer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sleep&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}},&lt;/span&gt;
  3799.   &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;initial_call&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;proc_lib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;init_p&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}}]},&lt;/span&gt;
  3800. &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;7687&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2632&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;386187&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  3801.  &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;prim_inet&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;recv0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}},&lt;/span&gt;
  3802.   &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;initial_call&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;proc_lib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;init_p&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}}]}]&lt;/span&gt;
  3803.  
  3804. &lt;span class=&quot;c&quot;&gt;%% get process stats, including stacktrace
  3805. &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bidder_dev&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;@&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;recon&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&amp;lt;0.1368.0&amp;gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  3806. &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;meta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;registered_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bidder_banker&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3807.        &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dictionary&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;&apos;$initial_call&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bidder_banker&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;init&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}},&lt;/span&gt;
  3808.                     &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;&apos;$ancestors&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bidder_sup&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;340&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]}]},&lt;/span&gt;
  3809.        &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;group_leader&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;339&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3810.        &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;status&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;running&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]},&lt;/span&gt;
  3811. &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;signals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;links&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;342&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]},&lt;/span&gt;
  3812.           &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;monitors&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,[]},&lt;/span&gt;
  3813.           &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;monitored_by&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,[]},&lt;/span&gt;
  3814.           &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;trap_exit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]},&lt;/span&gt;
  3815. &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;location&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;initial_call&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;proc_lib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;init_p&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}},&lt;/span&gt;
  3816.            &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current_stacktrace&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bidder_banker&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;&apos;-handle_info/2-lc$^0/1-0-&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  3817.                                                       &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;/bidder/src/pacing/bidder_banker.erl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3818.                                                        &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;337&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]},&lt;/span&gt;
  3819.                                 &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bidder_banker&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;&apos;-handle_info/2-lc$^0/1-0-&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  3820.                                                       &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;/bidder/src/pacing/bidder_banker.erl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3821.                                                        &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;337&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]},&lt;/span&gt;
  3822.                                 &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bidder_banker&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;&apos;-handle_info/2-lc$^0/1-0-&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  3823.                                                       &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;/bidder/src/pacing/bidder_banker.erl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3824.                                                        &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;337&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]},&lt;/span&gt;
  3825.                                 &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bidder_banker&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;handle_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  3826.                                                       &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;/bidder/src/pacing/bidder_banker.erl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3827.                                                        &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;337&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]},&lt;/span&gt;
  3828.                                 &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;gen_server&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;try_dispatch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  3829.                                             &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;gen_server.erl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;637&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]},&lt;/span&gt;
  3830.                                 &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;gen_server&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;handle_msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  3831.                                             &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;gen_server.erl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;711&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]},&lt;/span&gt;
  3832.                                 &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;proc_lib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;init_p_do_apply&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  3833.                                           &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;proc_lib.erl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;249&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]}]}]},&lt;/span&gt;
  3834. &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;memory_used&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,[{&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;memory&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;916065740&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3835.               &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;message_queue_len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3836.               &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;heap_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;75113&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3837.               &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;total_heap_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;114508086&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3838.               &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;garbage_collection&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;max_heap_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;error_logger&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kill&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;size&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}},&lt;/span&gt;
  3839.                                    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;min_bin_vheap_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;46422&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3840.                                    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;min_heap_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;233&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3841.                                    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fullsweep_after&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;65535&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3842.                                    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;minor_gcs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]}]},&lt;/span&gt;
  3843. &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;work&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reductions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1961022577&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]}]&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  3844.  
  3845. &lt;p&gt;As suggested by the book, a good method is to run &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;recon:proc_window&lt;/code&gt; repeatedly and try to identify patterns, e.g. a process that frequently ranks among the top CPU consumers.
  3846. The process id can then be passed to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;recon:info&lt;/code&gt; to get useful information (such as the stack trace) in order to understand what the process is doing.
  3847. Using this method, we quickly found that a commonly accessed data structure contained ~100 KB of debug data that was being copied thousands of times per second.&lt;/p&gt;
  3848.  
  3849. &lt;p&gt;This approach, though, tends to highlight long-running busy processes over short-lived ones, which may be spawned very often and account for more overhead overall.
  3850. This can be partially overcome by running &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;proc_window&lt;/code&gt; repeatedly and aggregating the results by location rather than process id. However, there are better tools to look at aggregated process times.&lt;/p&gt;
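&lt;p&gt;The aggregation mentioned above can be sketched as follows. This is a minimal illustration, not the code we ran: the module name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pw_aggregate&lt;/code&gt; and its functions are hypothetical, and it assumes that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;recon:proc_window/3&lt;/code&gt; returns a list of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;{Pid, Value, Info}&lt;/code&gt; tuples where Info holds &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;current_function&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;initial_call&lt;/code&gt; entries. Grouping by the current function is one possible choice of location key.&lt;/p&gt;

```erlang
-module(pw_aggregate).
-export([sample/3, by_location/1]).

%% Hypothetical helper: collect Rounds consecutive recon:proc_window/3
%% samples of the top Num processes by reductions, each taken over a
%% window of Ms milliseconds. Requires recon to be available on the node.
sample(Rounds, Num, Ms) ->
    lists:map(fun(_) -> recon:proc_window(reductions, Num, Ms) end,
              lists:seq(1, Rounds)).

%% Sum the sampled values per location instead of per pid.
%% Samples is a list of proc_window results, i.e. a list of lists of
%% {Pid, Value, Info} tuples. Returns [{Location, Total}], busiest first.
by_location(Samples) ->
    Totals = lists:foldl(
               fun({_Pid, Value, Info}, Acc) ->
                       maps:update_with(location(Info),
                                        fun(Sum) -> Sum + Value end,
                                        Value, Acc)
               end, #{}, lists:append(Samples)),
    lists:reverse(lists:keysort(2, maps:to_list(Totals))).

%% Assumption: use current_function as the grouping key, falling back
%% to initial_call and then to the atom 'unknown'.
location(Info) ->
    case proplists:get_value(current_function, Info) of
        undefined -> proplists:get_value(initial_call, Info, unknown);
        MFA -> MFA
    end.
```

&lt;p&gt;From the shell, something like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pw_aggregate:by_location(pw_aggregate:sample(10, 20, 1000)).&lt;/code&gt; would then list the functions that most often show up in busy processes, regardless of which pid was running them.&lt;/p&gt;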
  3851.  
  3852. &lt;h2 id=&quot;redbug&quot;&gt;redbug&lt;/h2&gt;
  3853. &lt;p&gt;Strictly speaking, &lt;a href=&quot;https://github.com/massemanet/redbug&quot;&gt;redbug&lt;/a&gt; is not a profiling tool, but it’s so useful for debugging live systems that it deserves a mention in this article.
  3854. It allows you to safely trace functions from the shell in a very intuitive yet sophisticated way (as opposed to the rougher Erlang built-in tracing modules).
  3855. It can be very handy to get a quick notion of how functions are being called, how frequently, and what production data looks like:&lt;/p&gt;
  3856.  
  3857. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bidder_dev&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;@&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;redbug&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;jiffy:encode/1&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  3858. &lt;span class=&quot;c&quot;&gt;% 00:11:03 &amp;lt;0.18718.7151&amp;gt;({erlang,apply,2})
  3859. % jiffy:encode(1)
  3860. &lt;/span&gt;
  3861. &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bidder_dev&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;@&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;redbug&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;jiffy:encode-&amp;gt;return&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  3862. &lt;span class=&quot;c&quot;&gt;% 00:11:03 &amp;lt;0.18718.7151&amp;gt;({erlang,apply,2})
  3863. % jiffy:encode([&amp;lt;&amp;lt;&quot;USD&quot;&amp;gt;&amp;gt;])
  3864. % 00:11:03 &amp;lt;0.18718.7151&amp;gt;({erlang,apply,2})
  3865. % jiffy:encode/1 -&amp;gt; &amp;lt;&amp;lt;&quot;[\&quot;USD\&quot;]&quot;&amp;gt;&amp;gt;
  3866. &lt;/span&gt;
  3867. &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bidder_dev&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;@&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;redbug&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;jiffy:encode([N]) when is_integer(N)-&amp;gt;return&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  3868. &lt;span class=&quot;c&quot;&gt;% 00:12:20 &amp;lt;0.1535.7173&amp;gt;({erlang,apply,2})
  3869. % jiffy:encode([300])
  3870. % 00:12:20 &amp;lt;0.1535.7173&amp;gt;({erlang,apply,2})
  3871. % jiffy:encode/1 -&gt; &lt;&lt;&quot;[300]&quot;&gt;&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  3872.  
  3873. &lt;p&gt;Including recon and redbug in an Erlang application release has no cost and can be a real lifesaver
  3874. when diagnosing production issues. These tools promote a flow that’s more powerful
  3875. than adding prints to the code and feels more natural than step debuggers,
  3876. which wouldn’t be that useful in a highly concurrent world anyway.&lt;/p&gt;
  3877.  
  3878. &lt;h2 id=&quot;erlang-easy-profiling-eep&quot;&gt;Erlang Easy Profiling (eep)&lt;/h2&gt;
  3879.  
  3880. &lt;p&gt;&lt;a href=&quot;https://github.com/virtan/eep&quot;&gt;eep&lt;/a&gt; allows for a more “traditional” approach to profiling by using Erlang tracing to take a snapshot of the system operation with function call counts, execution times and inter-dependencies.&lt;/p&gt;
  3881.  
  3882. &lt;p&gt;It requires a bit more effort to use, and it is not as safe as the rest of the tools described in the article.
  3883. It will slow down the system, potentially even killing it if used carelessly. Its output file can eat up a lot of disk space (a 100 ms snapshot takes about 300 MB for our system).
  3884. Depending on the nature of the application, it may not make sense to run it directly in production.&lt;/p&gt;
  3885.  
  3886. &lt;p&gt;Here’s an example tracing session using eep:&lt;/p&gt;
  3887.  
  3888. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bidder_dev&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;@&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;eep&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;start_file_tracing&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;file_name&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  3889.    &lt;span class=&quot;nn&quot;&gt;timer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sleep&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  3890.    &lt;span class=&quot;nn&quot;&gt;eep&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;stop_tracing&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  3891.  
  3892. &lt;p&gt;Note that we start, sleep and stop tracing in a single shell command. Don’t rely on the shell being responsive during tracing! You could also send a message or call a function in between, to force a certain part of the code to be executed while taking the snapshot.
  3893. The instructions above will output a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;file_name.trace&lt;/code&gt; file in the release directory. The file then needs to be copied off the production server and processed in a local Erlang shell:&lt;/p&gt;
  3894.  
  3895. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;local&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;eep&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;convert_tracing&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;file_name&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  3896. &lt;span class=&quot;n&quot;&gt;working&lt;/span&gt;
  3897. &lt;span class=&quot;mi&quot;&gt;38436&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;msgs&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;38366&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;msgs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sec&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;106996&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;secs&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;9&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;slowdown&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  3898. &lt;span class=&quot;mi&quot;&gt;39367&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;msgs&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;37807&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;msgs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sec&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;106996&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;secs&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;slowdown&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  3899. &lt;span class=&quot;n&quot;&gt;done&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  3900.  
  3901. &lt;p&gt;This will, in turn, produce a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;callgrind.out.file_name&lt;/code&gt; file that can be opened in &lt;a href=&quot;http://kcachegrind.sourceforge.net/html/Home.html&quot;&gt;KCachegrind&lt;/a&gt; (QCachegrind on macOS). Note that by default the trace separates entries by process id, which leads to the same limitation we saw with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;recon:proc_window&lt;/code&gt;. A more interesting view merges the function calls of all processes, which can be accomplished by stripping the pid:&lt;/p&gt;
  3902.  
  3903. &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;$ grep -v &quot;^ob=&quot; callgrind.out.file_name &amp;gt; callgrind.out.merged_file_name
  3904. $ qcachegrind callgrind.out.merged_file_name
  3905. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  3906.  
  3907. &lt;p&gt;QCachegrind presents the snapshot in a very sophisticated UI that can be used to spot the most frequently called functions, where most time is spent, etc.&lt;/p&gt;
  3908.  
  3909. &lt;center&gt;
  3910.    &lt;img alt=&quot;eep output to QCachegrind&quot; src=&quot;/images/post_images/erlang_qcachegrind.png&quot; /&gt;&lt;br /&gt;
  3911.    &lt;i&gt;eep output to QCachegrind.&lt;/i&gt;
  3912.    &lt;p&gt;&lt;/p&gt;
  3913. &lt;/center&gt;
  3914.  
  3915. &lt;p&gt;Since eep is based on Erlang tracing, it adds overhead to all code and may comparatively
  3916. misrepresent the work done by built-in functions (BIFs) and native implemented functions (NIFs),
  3917. so the timings shown in the snapshot need to be taken with a grain of salt. Nevertheless, it is
  3918. still a great exploratory tool to understand how the different components of the system interact
  3919. and how dependencies are used, and to learn about obscure or suspicious areas that can be hard to spot by just looking at the code.&lt;/p&gt;
  3920.  
  3921. &lt;p&gt;Note that there are other Erlang profiling libraries (which we haven’t tried yet) that produce
  3922. callgrind output: &lt;a href=&quot;https://github.com/isacssouza/erlgrind&quot;&gt;erlgrind&lt;/a&gt; and
  3923. &lt;a href=&quot;https://github.com/rabbitmq/looking_glass&quot;&gt;looking_glass&lt;/a&gt;.&lt;/p&gt;
  3924.  
  3925. &lt;h2 id=&quot;erlangsystem_monitor&quot;&gt;erlang:system_monitor&lt;/h2&gt;
  3926.  
  3927. &lt;p&gt;Yet another way of looking at the application is the &lt;a href=&quot;http://erlang.org/doc/man/erlang.html#system_monitor-2&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;erlang:system_monitor/2&lt;/code&gt;&lt;/a&gt; BIF.
  3928. It allows you to set up a process to receive a message every time a certain condition is met. It was particularly helpful for us
  3929. when examining long garbage collections and schedules of long duration, the latter of which can surface issues with NIFs that
  3930. would go unnoticed with other methods.&lt;/p&gt;
  3931.  
  3932. &lt;p&gt;Here’s an example of its use in the shell (based on a snippet from &lt;em&gt;Erlang in Anger&lt;/em&gt;):&lt;/p&gt;
  3933.  
  3934. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bidder_dev&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;@&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Loop&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;F&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  3935.               &lt;span class=&quot;k&quot;&gt;receive&lt;/span&gt;
  3936.                   &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;monitor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Pid&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  3937.                       &lt;span class=&quot;nv&quot;&gt;ReconLocation&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;recon&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Pid&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;location&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  3938.                       &lt;span class=&quot;nn&quot;&gt;io&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;monitor=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~p&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; pid=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~p&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; info=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~p&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; recon=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~p~n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  3939.                       &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Pid&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ReconLocation&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
  3940.               &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  3941.               &lt;span class=&quot;nv&quot;&gt;F&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  3942.       &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;
  3943.  
  3944. &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bidder_dev&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;@&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;spawn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;fun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  3945.                  &lt;span class=&quot;nb&quot;&gt;register&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;temp_sys_monitor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()),&lt;/span&gt;
  3946.                  &lt;span class=&quot;nn&quot;&gt;erlang&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;system_monitor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;long_schedule&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;30&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;long_gc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;30&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]),&lt;/span&gt;
  3947.                  &lt;span class=&quot;nv&quot;&gt;Loop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  3948.      &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  3949.      &lt;span class=&quot;nn&quot;&gt;timer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sleep&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  3950.      &lt;span class=&quot;nb&quot;&gt;exit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;whereis&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;temp_sys_monitor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kill&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  3951.  
  3952. &lt;p&gt;This will monitor the system for 10 seconds and output process information to the shell every time a garbage collection or a schedule takes more than 30 ms:&lt;/p&gt;
  3953.  
  3954. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;nb&quot;&gt;monitor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;long_gc&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pid&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&amp;lt;&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;7011&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2176&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5384&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;
  3955. &lt;span class=&quot;n&quot;&gt;info&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;timeout&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;33&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3956.       &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;old_heap_block_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2984878&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3957.       &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;heap_block_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;514838&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3958.       &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mbuf_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3959.       &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stack_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;35&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3960.       &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;old_heap_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1535726&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  3961.       &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;heap_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;51190&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]&lt;/span&gt;
  3962. &lt;span class=&quot;n&quot;&gt;recon&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;location&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  3963.           &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;initial_call&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;proc_lib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;init_p&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}},&lt;/span&gt;
  3964.            &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current_stacktrace&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  3965.                &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bidder_stat&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;update&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  3966.                     &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;/bidder/src/bidder_stat.erl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;202&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]},&lt;/span&gt;
  3967.                 &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;views_bid&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bid&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  3968.                     &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;/bidder/src/views/views_bid.erl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;79&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]},&lt;/span&gt;
  3969.                 &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;views_bid&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bid_request&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  3970.                     &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;/bidder/src/views/views_bid.erl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;17&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]},&lt;/span&gt;
  3971.                 &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;erl_stat&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time_call&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  3972.                     &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;/bidder/src/bidder_stat.erl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;42&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]},&lt;/span&gt;
  3973.                 &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bidder_web_handler&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dispatch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  3974.                     &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;/bidder/src/bidder_web_handler.erl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;211&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]},&lt;/span&gt;
  3975.                 &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;proc_lib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;init_p_do_apply&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  3976.                     &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;proc_lib.erl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;249&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]}]}]}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  3977.  
  3978. &lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
  3979.  
  3980. &lt;p&gt;This article is by no means an exhaustive list of Erlang diagnostic tools; there’s &lt;a href=&quot;https://erlang.org/doc/man/observer.html&quot;&gt;the
  3981. observer&lt;/a&gt;, &lt;a href=&quot;https://erlang.org/doc/man/eprof.html&quot;&gt;eprof&lt;/a&gt;, &lt;a href=&quot;https://erlang.org/doc/man/fprof.html&quot;&gt;fprof&lt;/a&gt;, &lt;a href=&quot;https://github.com/proger/eflame&quot;&gt;eflame&lt;/a&gt;, &lt;a href=&quot;https://github.com/jlouis/eministat&quot;&gt;eministat&lt;/a&gt; and the list goes on. The Erlang documentation
  3982. itself has a nice &lt;a href=&quot;https://erlang.org/doc/efficiency_guide/users_guide.html&quot;&gt;efficiency guide&lt;/a&gt; with an overview of the &lt;a href=&quot;http://erlang.org/doc/efficiency_guide/profiling.html&quot;&gt;built-in profiling modules&lt;/a&gt;.&lt;/p&gt;
  3983.  
  3984. &lt;p&gt;Since we started this effort, we have consistently reduced request times (and operational costs) month over month.
  3985. To a large extent these gains came thanks to the advanced tools Erlang and its ecosystem have to offer.
  3986. What’s most interesting is that we achieved this by getting to know our systems better,
  3987. fixing bugs, and often removing rather than adding specialized code.&lt;/p&gt;
  3988.  
  3989. &lt;center&gt;
  3990.    &lt;img alt=&quot;Make it beautiful&quot; src=&quot;/images/post_images/erlang_joe.jpg&quot; /&gt;
  3991. &lt;/center&gt;
  3992. </description>
  3993.    </item>
  3994.    
  3995.    
  3996.    
  3997.    <item>
  3998.      <title>
  3999. Women in Tech Spotlight: Larissa Licha
  4000. </title>
  4001.      <link>https://tech.nextroll.com/blog/women_in_tech/2020/03/31/women-in-tech-spotlight-larissa.html</link>
  4002.      <pubDate>Tue, 31 Mar 2020 00:00:00 -0700</pubDate>
  4003.      <author></author>
  4004.      <guid isPermaLink="false">https://tech.nextroll.com/blog/women_in_tech/2020/03/31/women-in-tech-spotlight-larissa</guid>
  4005.      <description>&lt;p&gt;Women in Tech - Spotlight: Larissa Licha&lt;/p&gt;
  4006.  
  4007. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;6 minute read&lt;/code&gt;&lt;/p&gt;
  4008.  
  4009. &lt;hr /&gt;
  4010. &lt;p&gt;March is Women’s History Month and in honor of this occasion, NextRoll is launching a month-long Women in Tech Spotlight – where we highlight a few of the women here at NextRoll. This will be a limited three-part series of interviews, showcasing three of our extraordinary women in technology. This week, I had the pleasure of interviewing Larissa Licha!&lt;/p&gt;
  4011.  
  4012. &lt;h2 id=&quot;larissa-licha&quot;&gt;Larissa Licha&lt;/h2&gt;
  4013. &lt;h3 id=&quot;title-director-of-product-platform-services&quot;&gt;Title: Director of Product, Platform Services&lt;/h3&gt;
  4014.  
  4015. &lt;p&gt;&lt;img src=&quot;/images/post_images/larissa.jpg&quot; alt=&quot;open_sourced_projects&quot; /&gt;&lt;/p&gt;
  4016.  
  4017. &lt;h2 id=&quot;can-you-share-a-little-bit-about-what-it-is-that-you-do-and-what-a-typical-day-for-you-is-like&quot;&gt;Can you share a little bit about what it is that you do and what a typical day for you is like?&lt;/h2&gt;
  4018.  
  4019. &lt;p&gt;I am the Director of Product Management for Platform Services, our newest business unit. My focus is heavily tied to strategy and the overarching vision for the business unit and ensuring that product and development execute towards that vision. I also work closely with departments such as Business Development, Finance, and Legal, as well as with end customers. On a typical day, I have to context switch a lot: from working with engineers to make sure we get key tickets done while prioritizing the right tasks (our priorities change a lot), to delegating across cross-functional teams or impacted engineering teams, to talking to end customers to ensure we meet their needs and keep them in the loop on progress or key updates, to engaging with new prospects, to monitoring revenue numbers and helping with forecasting, to staying on top of what’s going on in the industry.&lt;/p&gt;
  4020.  
  4021. &lt;p&gt;As the Director of Product Management, I need to unblock people wherever I can to set others up for success while driving our revenue forward. At the same time, I try to stay sane while having 100 TODOs every day. Meetings are definitely a big portion of my day, but it is a very exciting role because it involves everything from greenfield technical architecture, pricing and packaging, and machine learning to general product management and more. Being exposed to so many different areas across the business makes this a very exciting role and business unit.&lt;/p&gt;
  4022.  
  4023. &lt;h2 id=&quot;what-influences-you-to-pursue-a-career-in-technology&quot;&gt;What influenced you to pursue a career in technology?&lt;/h2&gt;
  4024.  
  4025. &lt;p&gt;This is a really weird one on my end. I grew up in a village with 300 people in the middle of nowhere in Germany. Our paths in the village are very set on going to school, apprenticeships in banks or insurance companies, then you stay at the bank or insurance company, you get married, have kids and that’s it, you never really look to go outside of these set boundaries. Even going to university was never really something I was being made aware of. I never knew I could be a Data Scientist or an Engineer. I knew I didn’t want to follow the path I was told to follow, but I wasn’t aware of what other options I had.&lt;/p&gt;
  4026.  
  4027. &lt;p&gt;Being a teenager was an interesting time (especially for my mom) since I was trying to find my path which resulted in some ‘rebellious’ behavior (based on village standards). I ended up becoming a tattoo artist for a big chunk of time, traveling a lot all over the world, living in different countries and cities but at 19 I had a life event and was sick for a year, and I realized I didn’t want to be the ‘rebel’ that people expected me to be. I stopped following my ‘self-fulfilled prophecy’ after reading a book about a woman that got stuck in a rut and then ‘life visited her’, she ended up moving to Dublin, Ireland to start over. I thought ‘I can move to Dublin, Ireland’.  I started applying to random jobs and ended up getting a contracting job for a vendor that works for Google to do Customer Service for Google Ads. My first job in tech.&lt;/p&gt;
  4028.  
  4029. &lt;p&gt;When I moved to Ireland, I didn’t know anyone, lived in a hostel, but just having this job seemed enough for me to go on. Starting at the company in customer services, only 6 months in I became a team lead and moved into operations. A year later, I had the opportunity to join NextRoll and something told me that I had to take that leap. NextRoll has been a defining part of my career and sometimes I still wonder how I got here. I barely made it through math in school, and now I’m reading machine learning papers about causal inference which just shows how life is never a straight line or predictable. But I really found my place here – in this weird non-linear way.&lt;/p&gt;
  4030.  
  4031. &lt;h2 id=&quot;what-is-it-like-to-be-a-woman-working-in-tech-for-you&quot;&gt;What is it like to be a woman working in Tech for you?&lt;/h2&gt;
  4032.  
  4033. &lt;p&gt;Being a woman in tech at a time when there’s a focus on having more women in tech has been encouraging. Advocates and allies becoming the norm, technological changes in how we hire, and coding boot camps for women (and girls) all help other women go into technology, which is exciting. As a hiring manager, it’s my job to focus on growing a diverse team and to make sure that everyone on my team – man or woman – has an equal voice. We value everyone’s opinion and ensure everyone is heard, no matter their gender or how they identify.&lt;/p&gt;
  4034.  
  4035. &lt;p&gt;Besides hiring more diverse teams and ensuring belonging, it’s important to have someone you can look up to that you can see yourself in. I have always had people I looked up to, but oftentimes those people have been men. This shows that there’s still a lot of work to be done to grow the representation of women in tech and leadership.&lt;/p&gt;
  4036.  
  4037. &lt;p&gt;I am happy to see that in NextRoll Platform Services we have built a more diverse team. There’s still a lot of progress to be made, but in product management we’re solely women, and on the business development side our lead is a woman as well. Meanwhile, on the engineering side, we still have a ways to go.&lt;/p&gt;
  4038.  
  4039. &lt;p&gt;It’s been encouraging to be right at the point of change and to be able and empowered to influence that change so we can all pave a path for women in tech.&lt;/p&gt;
  4040.  
  4041. &lt;h2 id=&quot;is-there-one-piece-of-advice-you-wish-somebody-gave-you-at-the-beginning-of-your-career&quot;&gt;Is there one piece of advice you wish somebody gave you at the beginning of your career?&lt;/h2&gt;
  4042.  
  4043. &lt;p&gt;It’s very important to just go for it! This is my 7th job at the company in 5 ½ years. I started in sales development and ultimately ended up in product leadership, but I never proactively pursued any of the roles I’ve held. I was fortunate enough to have had people who told me to go for it and believed in me when I didn’t. I still get moments of imposter syndrome where I think ‘I shouldn’t be here’. Without the people who invested in me and trusted me to do bigger things than I trusted myself to do, I don’t think I’d be where I am today. So, don’t be like me: you gotta be braver and just grab opportunities when they arise, since the nudge you’re waiting for may never come.&lt;/p&gt;
  4044.  
  4045. &lt;h2 id=&quot;who-has-been-your-biggest-advocatementor-in-the-workplace-and-why&quot;&gt;Who has been your biggest advocate/mentor in the workplace and why?&lt;/h2&gt;
  4046.  
  4047. &lt;p&gt;I’ve had many but if I had to pick one that made the biggest impact on my career, it’s &lt;a href=&quot;https://www.linkedin.com/in/dialtone/&quot;&gt;Valentino&lt;/a&gt; (our CTO). I don’t know if Valentino actually knows this but back when I was an account manager one year into the company, he set up a meeting with me. I was terrified wondering why he’d want to talk to me of all people. During our meeting, he walked me through a dashboard with cool performance stats giving me insight into our underlying infrastructure. He helped me realize I really liked looking at data and understanding the underlying technical pieces. He really pushed me, maybe even unintentionally, to care about the technology and infrastructure and brought me closer to product and engineering. He’s always been someone I could go to and ask questions and then come back with a really deep understanding of the answer. He’s also been a sounding board and great mentor during challenging times, or times where I was unsure of myself or skills. Lastly, he’s been a huge advocate for me throughout my career believing I can do more and trusting me to take bigger leaps. He took down the barriers I’ve built allowing me to be where I am today and I couldn’t be more grateful.&lt;/p&gt;
  4048.  
  4049. &lt;h2 id=&quot;who-are-your-role-models-in-women-in-tech&quot;&gt;Who are your role models in women in Tech?&lt;/h2&gt;
  4050.  
  4051. &lt;p&gt;I nerd out about Susan Athey. She did a lot of interesting research in economics and especially causal inference. I have a strange obsession with causal inference so I can geek out about her content for hours. Once I reached out to her asking whether she’d be willing to hop on a call and when she said yes, I nearly lost it.&lt;/p&gt;
  4052.  
  4053. &lt;p&gt;I also look to people I work with or observe in my day to day, how they bring themselves to work, and the impact they’re having. I look to people like &lt;a href=&quot;https://www.linkedin.com/in/jgrist/&quot;&gt;Jessica Grist&lt;/a&gt; who is someone that paves the path for other women in tech every day and I admire her drive and passion, she’s also just so smart!  Another person is &lt;a href=&quot;https://www.linkedin.com/in/prathibha-deshikachar-7435852/&quot;&gt;Prathibha&lt;/a&gt;, who no longer works here but left her mark to this day. Prathibha is a total badass. She always fought for things to be better than they were and held people accountable. She never accepted mediocrity and pushed everyone to do their best work while taking a no-nonsense approach. There are a lot more people I could mention since my immediate environment influences who I am as a person, co-worker, peer and leader every day.&lt;/p&gt;
  4054.  
  4055. &lt;h2 id=&quot;any-advice-for-women-in-techallies-you-would-give&quot;&gt;Any advice for women in tech/allies you would give?&lt;/h2&gt;
  4056.  
  4057. &lt;p&gt;Now is the time when things are finally shifting. It’s time for everyone to do their part, whether it’s contributing time to organizations like ‘&lt;a href=&quot;https://girlswhocode.com/&quot;&gt;Girls who code&lt;/a&gt;’ or talking to kids about technology/computer science and showing them that this is something they can pursue. It’s time to make it normal for girls and women to know that they have a place here in technology.&lt;/p&gt;
  4058.  
  4059. &lt;p&gt;Among women, we are honestly the hardest on one another. We are raised to be competitive and often bitter about other women’s achievements rather than celebrating them. I ask other women to celebrate women who take on leadership roles, and to support and help other people grow where they can instead of getting into a competitive mindset. Maybe the competitiveness among women in tech is even higher since there are so few of us, but only by uplifting each other can we change that going forward.&lt;/p&gt;
  4060.  
  4061. &lt;p&gt;For allies, it’s the same thing – elevate people and enable them. If you need to kick someone in the butt for them to take a leap and you’re in a position to do so, do it. Don’t doubt yourself or think you’re not the right person to help them find their way, take the leap to lift someone up. Without the allies along my journey, I don’t know where I’d be today. I needed someone to push me out of my comfort zone, be that person to someone.&lt;/p&gt;
  4062. </description>
  4063.    </item>
  4064.    
  4065.    
  4066.    
  4067.    <item>
  4068.      <title>
  4069. Women in Tech Spotlight - Celebration of Women's History Month - Elena Rose
  4070. </title>
  4071.      <link>https://tech.nextroll.com/blog/women_in_tech/2020/03/24/women-in-tech-spotlight-elena.html</link>
  4072.      <pubDate>Tue, 24 Mar 2020 00:00:00 -0700</pubDate>
  4073.      <author></author>
  4074.      <guid isPermaLink="false">https://tech.nextroll.com/blog/women_in_tech/2020/03/24/women-in-tech-spotlight-elena</guid>
  4075.      <description>&lt;p&gt;Women in Tech - Spotlight: Elena Rose&lt;/p&gt;
  4076.  
  4077. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;6 minute read&lt;/code&gt;&lt;/p&gt;
  4078.  
  4079. &lt;hr /&gt;
  4080. &lt;p&gt;March is Women’s History Month and in honor of this occasion, NextRoll is launching a month-long Women in Tech Spotlight – where we highlight a few of the women here at NextRoll. This will be a limited three-part series of interviews, showcasing three of our extraordinary women in technology. This week, I had the pleasure of interviewing Elena Rose!&lt;/p&gt;
  4081.  
  4082. &lt;h2 id=&quot;elena-rose&quot;&gt;Elena Rose&lt;/h2&gt;
  4083. &lt;h3 id=&quot;title-manager-data-science-engineer&quot;&gt;Title: Manager, Data Science Engineer&lt;/h3&gt;
  4084.  
  4085. &lt;p&gt;&lt;img src=&quot;/images/post_images/elena.png&quot; alt=&quot;open_sourced_projects&quot; /&gt;&lt;/p&gt;
  4086.  
  4087. &lt;h2 id=&quot;can-you-share-a-little-about-what-you-do-at-nextroll-and-what-your-typical-day-is-like&quot;&gt;Can you share a little about what you do at NextRoll and what your typical day is like?&lt;/h2&gt;
  4088.  
  4089. &lt;p&gt;I migrated to a manager position a couple of years ago. My day changed quite a bit – before, it was mostly coding; now it is primarily connecting teams, understanding the strategic way to develop our product, identifying cross-team dependencies, and trying to compensate for them. This is where all my brainpower goes.&lt;/p&gt;
  4090.  
  4091. &lt;h2 id=&quot;what-do-you-like-most-about-your-tech-career&quot;&gt;What do you like most about your tech career?&lt;/h2&gt;
  4092.  
  4093. &lt;p&gt;I enjoy how tech people know what they want: they have the right expertise and a good definition of what they want to do. They have a clear path, are logical, and have structure. The types of people who pursue a tech career – engineers joining the engineering workforce – join for a reason. These are the people who like to poke around with different ideas. They are curious, and being around these people is very interesting.&lt;/p&gt;
  4094.  
  4095. &lt;h2 id=&quot;what-would-you-say-was-the-reason-for-you-to-pursue-a-career-in-sciencetechnology&quot;&gt;What would you say was the reason for you to pursue a career in science/technology?&lt;/h2&gt;
  4096.  
  4097. &lt;p&gt;My father. He is an engineer. He created 50 patents in night vision. Growing up, I wanted to be an archeologist. I was looking into history and participated in an archaeology session during the summer. Then my father told me that archeologists are people who find something and then spend time looking at that stuff, and that I could build them devices and make their workflow much more efficient. I just swallowed that idea and decided to go into physics instead.&lt;/p&gt;
  4098.  
  4099. &lt;h2 id=&quot;is-there-one-piece-of-advice-you-wish-somebody-gave-you-at-the-beginning-of-your-career&quot;&gt;Is there one piece of advice you wish somebody gave you at the beginning of your career?&lt;/h2&gt;
  4100.  
  4101. &lt;p&gt;Don’t skip programming classes. Focus on programming more because it will pay off. Start working hard as early as possible. Every minute you spend developing your brain will pay off in the future. That is definitely what I would have liked to hear.&lt;/p&gt;
  4102.  
  4103. &lt;h2 id=&quot;who-has-been-your-biggest-advocatementor-in-the-workplace-and-why&quot;&gt;Who has been your biggest advocate/mentor in the workplace, and why?&lt;/h2&gt;
  4104.  
  4105. &lt;p&gt;So I was not always in tech. I started as a physicist and migrated to finance. I grew emotionally and matured quite a bit during that period, and I had great mentors in the finance field. Working with the bankers was a really good time for me, and it’s paying off right now: I have developed the intuition to know how to tackle open-ended questions.&lt;/p&gt;
  4106.  
  4107. &lt;p&gt;Secondly, my husband is probably the biggest influence for me in tech. He started working at Google as one of the early engineers. Over the years, he has worked with very successful companies in their early stages. He developed a very deep understanding of approaches and best practices, and in my early days he pointed me to areas to grow in, which shaped the way that I focus. I benefited quite a bit from listening to him. When he starts talking, I grab a pen and paper and write it down, because it is something he should be teaching or writing training materials on.&lt;/p&gt;
  4108.  
  4109. &lt;h2 id=&quot;is-there-anything-that-nextroll-does-to-support-women-in-technology-that-other-companies-should-do-too&quot;&gt;Is there anything that NextRoll does to support women in technology that other companies should do too?&lt;/h2&gt;
  4110.  
  4111. &lt;p&gt;Historically, women have not frequently gone into engineering. Women think differently. Women pay attention to different things. What attracts women to a position is different, so positions need to be pitched differently to women and men. Women are very cautious about which job opportunities they pursue. If a woman does not fit a job description, she will not apply.&lt;/p&gt;
  4112.  
  4113. &lt;p&gt;Also, at the workplace, there should be some mindful places where women can be comfortable within a purely male culture. We need to develop certain islands where women can comfortably think and iterate. This is actually hard. I am a woman, and I see women around me. I am trying to understand how to formalize this. It’s not purely about diversity and inclusion. It is certain details that make a change.&lt;/p&gt;
  4114.  
  4115. &lt;p&gt;The Women in Tech meetings here at NextRoll are really good. They focus on formalizing what is good for women, how the group should act, and what a beneficial way is for women to express their ideas.&lt;/p&gt;
  4116.  
  4117. &lt;h2 id=&quot;who-are-your-role-models-in-women-in-tech&quot;&gt;Who are your role models in women in Tech?&lt;/h2&gt;
  4118.  
  4119. &lt;p&gt;People who are forward-thinking are the most interesting and influential to me. I don’t have anybody among women who is my role model, but I have quite a few among men. Right now, I am quite fascinated with Elon Musk. He has an amazing ability to work and focus, he thinks in extraordinary ways, and he is also capable of encapsulating himself in his current life. I just read a book about his life, and it was amazing. I think we will say in the future that we lived in the Elon Musk era.&lt;/p&gt;
  4120.  
  4121. &lt;h2 id=&quot;any-advice-for-women-in-techallies-you-would-give&quot;&gt;Any advice for Women in Tech/Allies you would give?&lt;/h2&gt;
  4122.  
  4123. &lt;p&gt;Throughout my tech career, I haven’t sensed any difference between being a man or a woman. My class at the university had 130 people and only 8 women. At that point, I would have preferred more women. But after that, I don’t think that I had any disadvantages during my life because I was a woman. I would say try to gather as many mentors as possible, no matter what their gender is, because among men you might also get some interesting ideas that could enhance your ability. Try to catch the attention of those who are most knowledgeable and have the most expertise in whatever field you want to pursue right now. Usually, those people don’t need you. You need them, so spend time making yourself interesting to them. It doesn’t matter who they are, but the right person at the right time can make a huge difference.&lt;/p&gt;
  4124. </description>
  4125.    </item>
  4126.    
  4127.    
  4128.    
  4129.    <item>
  4130.      <title>Reflections on Remote Work at NextRoll</title>
  4131.      <link>https://tech.nextroll.com/blog/culture/2020/03/17/remote-work.html</link>
  4132.      <pubDate>Tue, 17 Mar 2020 00:00:00 -0700</pubDate>
  4133.      <author></author>
  4134.      <guid isPermaLink="false">https://tech.nextroll.com/blog/culture/2020/03/17/remote-work</guid>
  4135.      <description>&lt;h1 id=&quot;introduction&quot;&gt;Introduction&lt;/h1&gt;
  4136.  
  4137. &lt;p&gt;As COVID-19 continues to spread and cause anxiety, many &lt;a href=&quot;https://www.usnews.com/news/national-news/articles/2020-03-06/apple-employees-join-twitter-facebook-amazon-in-remote-work-as-coronavirus-spreads&quot;&gt;tech companies have begun
  4138. encouraging or mandating that employees work from home&lt;/a&gt;.
  4139. For the time being, NextRoll is among these companies. Fortunately, NextRoll engineering
  4140. has long had a remote-friendly work culture; I myself have been working remotely
  4141. for the last three years after two years at our San Francisco office. In this post,
  4142. I’ll share my thoughts on the benefits of remote work, and some tips from NextRoll’s
  4143. remote employees intended for anyone who has recently started working remotely.&lt;/p&gt;
  4144.  
  4145. &lt;h1 id=&quot;benefits-of-remote-work&quot;&gt;Benefits of Remote Work&lt;/h1&gt;
  4146.  
  4147. &lt;p&gt;From a personal perspective, remote work has greatly increased both my quality of life
  4148. and my productivity. When I worked in San Francisco, I had almost an hour-long commute
  4149. to work each way. Now, I have an extra two hours of every weekday that I can spend working,
  4150. relaxing at home, or playing basketball. The extra time I am able to allocate to my personal
  4151. life allows me to be less stressed and better rested, which makes me more focused and
  4152. productive when it’s time to work. In other words, one of the major efficiencies of remote
  4153. work is that remote employees have more time for both their personal and professional lives.&lt;/p&gt;
  4154.  
  4155. &lt;p&gt;From a broader perspective, the option of remote work has helped enable NextRoll to hire
  4156. and retain a large number of very talented engineers. Some of our brightest, most impactful,
  4157. and most experienced engineers work remotely. In some of their cases, the option to work
  4158. remotely was the key differentiator between a NextRoll job offer, and offers from other
  4159. companies with more rigid work policies. In other cases, like my own, the option to work
  4160. remotely allowed NextRoll to retain engineers who would otherwise have had to leave after
  4161. deciding to move. In these ways, remote work can be a huge win for companies willing to
  4162. accommodate flexible work arrangements.&lt;/p&gt;
  4163.  
  4164. &lt;h1 id=&quot;tips-for-remote-work&quot;&gt;Tips for Remote Work&lt;/h1&gt;
  4165.  
  4166. &lt;p&gt;If you find yourself working remotely, and are looking for some tips on how to make it work,
  4167. this section is for you. These tips and suggestions come from several of my fellow remote
  4168. NextRollers.&lt;/p&gt;
  4169.  
  4170. &lt;h2 id=&quot;logistical-tips&quot;&gt;Logistical Tips&lt;/h2&gt;
  4171.  
  4172. &lt;p&gt;For your remote working arrangement to succeed, the first step is setting up your home
  4173. workstation. Of course, a reliable internet connection (with VPN) is a must. Beyond that,
  4174. messaging services like Slack and teleconferencing solutions like Google Meet help bridge
  4175. the physical gap between the office and your home office. Be careful, though: these tools
  4176. can quickly become distractions as Matt Burch, one of our Software Engineering Team Leads,
  4177. points out. He recommends disabling notifications at times to give yourself an opportunity
  4178. to focus on the task at hand.&lt;/p&gt;
  4179.  
  4180. &lt;p&gt;It is very important to give your workstation its own, dedicated space, away from wherever
  4181. you unwind. This helps keep your work from bleeding into your personal life, and vice versa.
  4182. It pays dividends to invest in your workstation, especially for long-term remote working
  4183. situations. I learned this the hard way in my first year working remotely after developing
  4184. pain in my elbow from bad ergonomics. For this reason, I recommend a dedicated desk with
  4185. an adjustable chair, and at least one external monitor. A wireless keyboard and mouse are also
  4186. well worth the investment. Several of my colleagues, including Senior Software Engineer Jose
  4187. Hernandez, suggest purchasing a nice pair of speakers as well; instrumental background music
  4188. seems to be popular among the remote folks at NextRoll.&lt;/p&gt;
  4189.  
  4190. &lt;p&gt;As far as office attire, there are differing schools of thought. For some people, like Senior
  4191. Software Engineer Danny Fowler and Senior Staff Engineer Brian Ecker, it’s best to keep a
  4192. consistent schedule as if you were at the office. In particular, this means getting dressed
  4193. as if you were going into the office. Others, such as Staff Engineer Tyler Brown, believe
  4194. in embracing the flexibility that remote work offers by working in sweatpants.&lt;/p&gt;
  4195.  
  4196. &lt;p&gt;Another logistical aspect unique to remote work is the need to over-communicate with teammates.
  4197. Effective communication takes a more conscious effort when you are remote. This communication
  4198. is particularly challenging when teammates are spread across the world, in multiple timezones.
  4199. In situations like this, Staff Engineer Brujo Benavides recommends carefully thinking through
  4200. questions you may have for people in other timezones, and being patient because their responses
  4201. may take a few hours. The good news is sometimes this extra thought can help you answer your
  4202. own question.&lt;/p&gt;
  4203.  
  4204. &lt;h2 id=&quot;tips-for-well-being&quot;&gt;Tips For Well-Being&lt;/h2&gt;
  4205.  
  4206. &lt;p&gt;As important as having the right work setup is having the right mentality and approach.
  4207. When working remotely, it is very easy to feel isolated or overworked. Technical Talent
  4208. Partner Haley Scruggs suggests scheduling meetings with teammates for the sole purpose
  4209. of catching up personally. Senior Software Engineer Corey Shott says taking walks and
  4210. breaks away from the computer is particularly important when working remotely. For example,
  4211. it’s best to eat meals away from the computer.&lt;/p&gt;
  4212.  
  4213. &lt;p&gt;Remote work provides a lot of flexibility in terms of when you work, which is both a perk
  4214. and a challenge. On one hand, it’s easy to start a load of laundry during the workday, and
  4215. on the other hand, it’s easy to send just one more Slack message after dinner. For this
  4216. reason, setting boundaries is particularly important when you work from home. Security
  4217. Engineer Nicolas Valcárcel suggests separating home and work entirely. To make this work
  4218. in practice, Senior Staff Engineer Mike Watters recommends developing routines that start
  4219. and end the day to enforce a mental separation between home and work.&lt;/p&gt;
  4220.  
  4221. &lt;p&gt;I’ve found a workout to end the day is particularly effective for me in this regard. On that
  4222. note, regular exercise is extremely important, as Director of Software Engineering Paul Huff
  4223. told me, because it’s so easy to become sedentary when working remotely. Not only does
  4224. exercise make you physically healthier, it will make you feel better mentally too.&lt;/p&gt;
  4225.  
  4226. &lt;h1 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h1&gt;
  4227.  
  4228. &lt;p&gt;Remote work creates unique challenges. If you are working remotely for the first time, it takes
  4229. some getting used to both in terms of logistics, and in terms of your approach to work. For
  4230. example, you may have to create a dedicated space for your workstation or find a way to set
  4231. boundaries between your home and work.&lt;/p&gt;
  4232.  
  4233. &lt;p&gt;Nonetheless, at NextRoll we’ve found that these challenges are outweighed by the significant
  4234. benefits of remote work. Flexible work arrangements give employees better quality of life
  4235. while often increasing, rather than sacrificing, productivity, and give employers an advantage
  4236. in recruiting and retaining talent. If working at a remote-friendly company sounds appealing
  4237. to you, please take a look at our &lt;a href=&quot;https://www.nextroll.com/careers&quot;&gt;careers page&lt;/a&gt;.&lt;/p&gt;
  4238. </description>
  4239.    </item>
  4240.    
  4241.    
  4242.    
  4243.    <item>
  4244.      <title>
  4245. Women in Tech Spotlight: Iva Ivanova
  4246. </title>
  4247.      <link>https://tech.nextroll.com/blog/tech,%20diversity,%20inclusion/2020/03/03/women-in-tech-spotlight.html</link>
  4248.      <pubDate>Tue, 03 Mar 2020 00:00:00 -0800</pubDate>
  4249.      <author></author>
  4250.      <guid isPermaLink="false">https://tech.nextroll.com/blog/tech,%20diversity,%20inclusion/2020/03/03/women-in-tech-spotlight</guid>
  4251.      <description>&lt;p&gt;Women in Tech - Spotlight: Iva Ivanova&lt;/p&gt;
  4252.  
  4253. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;5 minute read&lt;/code&gt;&lt;/p&gt;
  4254.  
  4255. &lt;hr /&gt;
  4256. &lt;p&gt;March is Women’s History Month and in honor of this occasion, NextRoll is launching a month-long Women in Tech Spotlight – where we highlight a few of the women here at NextRoll. This will be a limited three-part series of interviews, showcasing three of our extraordinary women in technology. This week, I had the pleasure of interviewing Iva Ivanova!&lt;/p&gt;
  4257.  
  4258. &lt;h2 id=&quot;iva-ivanova&quot;&gt;Iva Ivanova&lt;/h2&gt;
  4259. &lt;h3 id=&quot;title-data-scientist-ii&quot;&gt;Title: Data Scientist II&lt;/h3&gt;
  4260.  
  4261. &lt;p&gt;&lt;img src=&quot;/images/post_images/ivanova.jpg&quot; alt=&quot;open_sourced_projects&quot; /&gt;&lt;/p&gt;
  4262.  
  4263. &lt;h2 id=&quot;can-you-share-a-little-about-what-you-do-at-nextroll-and-what-your-typical-day-is-like&quot;&gt;Can you share a little about what you do at NextRoll and what your typical day is like?&lt;/h2&gt;
  4264.  
  4265. &lt;p&gt;I am a Data Scientist on the Analytics team.  My day starts with a strong cup of tea. I generally spend half my day meeting and connecting with product managers and engineers in person or over Slack where we brainstorm about experimentation design, feature launches, data infrastructure, reporting, forecasting and more. The other half of my day is usually spent executing on those ideas.&lt;/p&gt;
  4266.  
  4267. &lt;h2 id=&quot;what-do-you-like-most-about-your-tech-career&quot;&gt;What do you like most about your tech career?&lt;/h2&gt;
  4268.  
  4269. &lt;p&gt;I like problem solving with technology.  I also like the fact that I am always learning something new.  If you like troubleshooting, learning, and being challenged, this is the career for you.&lt;/p&gt;
  4270.  
  4271. &lt;h2 id=&quot;what-would-you-say-was-the-reason-for-you-to-pursue-a-career-in-technology&quot;&gt;What would you say was the reason for you to pursue a career in technology?&lt;/h2&gt;
  4272.  
  4273. &lt;p&gt;I was working as a Sales Finance Analyst when a friend floated the idea of doing a 3-month data science bootcamp; I signed up and was instantly hooked.  As an Analyst I was already pretty comfortable writing advanced SQL queries but had to learn Python in a matter of weeks as it was a prereq for the bootcamp. The bootcamp was my initial exposure to machine learning.
  4274. I was fascinated by the sheer power that Python and machine learning put at my fingertips.
  4275. Using Python, I was suddenly able to perform tasks that used to take me hours in Excel in a matter of seconds with one line of code.
  4276. About a year later, I applied and was accepted into a part-time M.S. in Business Analytics program while keeping my full-time role at NextRoll.&lt;/p&gt;
  4277.  
  4278. &lt;h2 id=&quot;what-is-it-like-to-be-a-woman-working-in-tech-for-you&quot;&gt;What is it like to be a woman working in tech for you?&lt;/h2&gt;
  4279.  
  4280. &lt;p&gt;While it may still feel like an uphill battle as women are still a minority in the field, there are thankfully folks that are working to make things better by mentoring and advocating for women in engineering. And those are the folks that inspire me and keep me going!&lt;/p&gt;
  4281.  
  4282. &lt;h2 id=&quot;is-there-one-piece-of-advice-you-wish-somebody-gave-you-at-the-beginning-of-your-career&quot;&gt;Is there one piece of advice you wish somebody gave you at the beginning of your career?&lt;/h2&gt;
  4283.  
  4284. &lt;p&gt;Buckle up, you are in for quite the ride! Haha. Just kidding, sort of. I would tell myself to be patient, enjoy the journey, and don’t stress over what others are doing. Some of us are meant to take a short way to fulfill our dream, others the long way and what matters is not how long it takes but that you are enjoying the process and find what you do to be fulfilling.&lt;/p&gt;
  4285.  
  4286. &lt;h2 id=&quot;who-has-been-your-biggest-advocatementor-in-the-workplace-and-why&quot;&gt;Who has been your biggest advocate/mentor in the workplace and why?&lt;/h2&gt;
  4287.  
  4288. &lt;p&gt;I have been fortunate to have several great mentors over the years that have helped me navigate the reality of being a woman in tech. I had one manager early on in my career that always steered me toward technical projects even though I was not in a technical role at the time because they recognized I was good at teaching myself technical skills.&lt;/p&gt;
  4289.  
  4290. &lt;p&gt;I think it is important to find good mentors early on in our careers because they are often able to see potential in us long before we are able to realize it and steer us in the right direction.&lt;/p&gt;
  4291.  
  4292. &lt;h2 id=&quot;is-there-anything-that-nextroll-does-to-support-women-in-technology-that-other-companies-should-do-too&quot;&gt;Is there anything that NextRoll does to support women in technology that other companies should do too?&lt;/h2&gt;
  4293.  
  4294. &lt;p&gt;I can’t speak much about other companies but NextRoll allowed me to transition from a career in Sales Finance to Analytics and supported my decision to complete a master’s degree while working full-time. Companies like NextRoll that truly strive to not just hire great people but help them grow are at an advantage when it comes to the talent war. That’s why the Owl is my favorite culture creature. That is the kind of support system I have found here.&lt;/p&gt;
  4295.  
  4296. &lt;h2 id=&quot;who-are-your-role-models-in-women-in-tech&quot;&gt;Who are your role models in women in Tech?&lt;/h2&gt;
  4297.  
  4298. &lt;p&gt;Marie Curie - she was the first woman to win a Nobel Prize, and she won it not once but twice, in two separate sciences - Physics and Chemistry. How awesome is that?&lt;/p&gt;
  4299.  
  4300. &lt;h2 id=&quot;any-advice-for-women-in-techallies-you-would-give&quot;&gt;Any advice for Women in Tech/Allies you would give?&lt;/h2&gt;
  4301.  
  4302. &lt;p&gt;Women in Tech:
  4303. For those early on in your career, have a growth mindset and keep your eyes on the prize. You are going to fail and it is ok, just don’t let that discourage you along the way.  But keep in mind that you have to put in the work as well.&lt;/p&gt;
  4304.  
  4305. &lt;p&gt;Allies:
  4306. For those further ahead looking to mentor and advocate for women in tech, start a conversation, welcome questions, listen, and offer advice when asked. A small effort on your part might bring a lot of value to someone’s life.&lt;/p&gt;
  4307. </description>
  4308.    </item>
  4309.    
  4310.    
  4311.    
  4312.    <item>
  4313.      <title>
  4314. The new code formatter for Erlang: rebar3 format
  4315. </title>
  4316.      <link>https://tech.nextroll.com/blog/dev/2020/02/25/erlang-rebar3-format.html</link>
  4317.      <pubDate>Tue, 25 Feb 2020 00:00:00 -0800</pubDate>
  4318.      <author></author>
  4319.      <guid isPermaLink="false">https://tech.nextroll.com/blog/dev/2020/02/25/erlang-rebar3-format</guid>
  4320.      <description>&lt;p&gt;In recent years, many language ecosystems have developed automatic code formatters to reduce the mental overhead of code readers and therefore to share code more easily. These tools work by ensuring that all code written in the same language looks the same. Some examples of these tools include &lt;a href=&quot;https://golang.org/cmd/gofmt/&quot;&gt;gofmt&lt;/a&gt; for Go or &lt;a href=&quot;https://hexdocs.pm/mix/master/Mix.Tasks.Format.html&quot;&gt;mix format&lt;/a&gt; for Elixir. The Erlang community was lacking a tool like this, so we created a rebar3 plugin just to automatically format code.&lt;/p&gt;
  4321.  
  4322. &lt;p&gt;In this article we’ll discuss the history of the Erlang parsing and formatting tools, the challenges of developing a formatter and the resulting tool that we created. Learn how you can use it and customize it to your needs.&lt;/p&gt;
  4323.  
  4324. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;10-15 minute read&lt;/code&gt;&lt;/p&gt;
  4325.  
  4326. &lt;hr /&gt;
  4327.  
  4328. &lt;center&gt;
  4329.    &lt;img alt=&quot;Bugs Bunny - The Rabbit of Seville.&quot; src=&quot;/images/post_images/rabbitofseville.jpg&quot; /&gt;&lt;br /&gt;
  4330.    &lt;i&gt;Bugs Bunny - The Rabbit of Seville.&lt;/i&gt;
  4331. &lt;/center&gt;
  4332.  
  4333. &lt;h1 id=&quot;introduction&quot;&gt;Introduction&lt;/h1&gt;
  4334. &lt;p&gt;NextRoll’s RTB Team devotes quite a bit of our efforts to make our codebase as mature and maintainable as we can.
  4335. For our Erlang code, we started working on this task years ago. We trim dead code using &lt;a href=&quot;/blog/dev/2018/10/09/remove-erlang-dead-code-xref.html&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;xref&lt;/code&gt;&lt;/a&gt;. We remove discrepancies with the help of &lt;a href=&quot;/blog/dev/2019/02/19/erlang-dialyzer.html&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dialyzer&lt;/code&gt;&lt;/a&gt;. We make sure our code is well behaved using &lt;a href=&quot;https://github.com/okeuday/pest&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;PEST&lt;/code&gt;&lt;/a&gt;. We let &lt;a href=&quot;https://hex.pm/packages/elvis&quot;&gt;Elvis&lt;/a&gt; find stylistic anomalies…&lt;/p&gt;
  4336.  
  4337. &lt;p&gt;But there was one tool that was missing: a code-formatter. We were using a code formatter for Go, Elixir, and Python. But there weren’t any (or barely any) for Erlang. So, we decided to use one of our &lt;a href=&quot;/blog/culture/2019/11/26/hackweek-at-nextroll.html&quot;&gt;HackWeeks&lt;/a&gt; to create one.&lt;/p&gt;
  4338.  
  4339. &lt;h2 id=&quot;a-bit-of-history&quot;&gt;A Bit of History&lt;/h2&gt;
  4340. &lt;h3 id=&quot;other-code-formatters&quot;&gt;Other Code Formatters&lt;/h3&gt;
  4341. &lt;p&gt;Code formatting is certainly nothing new; it’s been around for ages, with several very interesting papers written about it. But recently, there’s been a tendency in all &lt;em&gt;modern&lt;/em&gt; languages to include one (and &lt;strong&gt;only one&lt;/strong&gt;) formatter, like &lt;a href=&quot;https://golang.org/cmd/gofmt/&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gofmt&lt;/code&gt;&lt;/a&gt; and, of course, the one that influenced us the most: &lt;a href=&quot;https://hexdocs.pm/mix/master/Mix.Tasks.Format.html&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mix format&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
  4342.  
  4343. &lt;p&gt;We’re by no means experts in this area and therefore we wanted to rely on existing efforts as much as we could. We took inspiration from all of them, but we tried to use as many already written components as possible, and (as usual) there are a bunch of them already baked into OTP…&lt;/p&gt;
  4344.  
  4345. &lt;h3 id=&quot;parsing-and-formatting-erlang-code&quot;&gt;Parsing and Formatting Erlang Code&lt;/h3&gt;
  4346. &lt;p&gt;Our first initiative was to try to find existing tools to format Erlang code.&lt;/p&gt;
  4347.  
  4348. &lt;p&gt;We found a ready-to-use solution: &lt;a href=&quot;https://hex.pm/packages/rebar3_fmt&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3 fmt&lt;/code&gt;&lt;/a&gt;. The problem is that, as it clearly states in its description, it requires &lt;em&gt;emacs&lt;/em&gt;, which is something most of us don’t use. But it pointed us to what is generally recognized as the &lt;em&gt;de facto standard&lt;/em&gt; for Erlang formatters: &lt;a href=&quot;https://erlang.org/doc/apps/tools/erlang_mode_chapter.html&quot;&gt;erlang-mode for Emacs&lt;/a&gt;. That’s what the OTP Team considers &lt;em&gt;the standard way of formatting Erlang code&lt;/em&gt; and what the other tools included in OTP are loosely based on.&lt;/p&gt;
  4349.  
  4350. &lt;p&gt;What are these tools, you say? That was our follow-up question as well! And these are the ones we found:&lt;/p&gt;
  4351.  
  4352. &lt;ul&gt;
  4353.  &lt;li&gt;&lt;a href=&quot;http://erlang.org/doc/man/erl_tidy.html&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;erl_tidy&lt;/code&gt;&lt;/a&gt;: The closest thing to an automatic code formatter for Erlang. It uses many of the other modules in this list to parse and rewrite Erlang code. It’s a bit old and it has a bunch of well-known deficiencies, including but not limited to its lack of proper support for macros and comments in code.&lt;/li&gt;
  4354.  &lt;li&gt;&lt;a href=&quot;http://erlang.org/doc/man/erl_prettypr.html&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;erl_prettypr&lt;/code&gt;&lt;/a&gt;: This is a pretty printer – it takes an AST (Abstract Syntax Tree) as an input and it prints it out in a &lt;em&gt;pretty&lt;/em&gt; way. It’s the &lt;em&gt;standard&lt;/em&gt; Erlang pretty printer. Our original intention was to use it, but its extensibility support is &lt;em&gt;complex&lt;/em&gt; and &lt;em&gt;poorly documented&lt;/em&gt; at best. So, like many others before us (e.g. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;wrangler&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;erlang-ls&lt;/code&gt;, etc.), we just copied it into our project and started from there.&lt;/li&gt;
  4355.  &lt;li&gt;&lt;a href=&quot;http://erlang.org/doc/man/epp.html&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;epp&lt;/code&gt;&lt;/a&gt;: The other side of the story. This is the Erlang parser that comes with OTP. It also has some limitations, most importantly when it comes to macros and comments, since it’s intended to be used primarily by the compiler.&lt;/li&gt;
  4356.  &lt;li&gt;&lt;a href=&quot;http://erlang.org/doc/man/epp_dodger.html&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;epp_dodger&lt;/code&gt;&lt;/a&gt;: A module explicitly created to work like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;epp&lt;/code&gt; but bypassing macros and preprocessor directives. It’s also a bit buggy and it has limited support for extensibility. Luckily, &lt;a href=&quot;http://twitter.com/jfacorro&quot;&gt;Juan Facorro&lt;/a&gt; already copied it and improved it in &lt;a href=&quot;https://hex.pm/packages/katana_code&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;katana-code&lt;/code&gt;&lt;/a&gt;.&lt;/li&gt;
  4357. &lt;/ul&gt;
  4358.  
  4359. &lt;p&gt;So, what we ended up using is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ktn_dodger&lt;/code&gt; (from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;katana-code&lt;/code&gt;) to parse the code and turn it into an AST and then our version(s) of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;erl_prettypr&lt;/code&gt; (more on this below) to output the formatted code.&lt;/p&gt;
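
&lt;p&gt;A minimal sketch of that parse-then-print pipeline, assuming the stock OTP modules &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;epp_dodger&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;erl_prettypr&lt;/code&gt; stand in for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ktn_dodger&lt;/code&gt; and our modified pretty printer. The module name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;format_sketch&lt;/code&gt; and the function &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;format_file/2&lt;/code&gt; are made up for illustration and are not part of the plugin:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;%% Illustrative only: format_sketch and format_file/2 are made-up names.
-module(format_sketch).
-export([format_file/2]).

%% Parse File without expanding macros, then pretty-print the resulting AST.
%% PaperWidth is the preferred maximum line width of the output.
-spec format_file(file:filename(), pos_integer()) -&gt; {ok, string()} | {error, term()}.
format_file(File, PaperWidth) -&gt;
    case epp_dodger:parse_file(File) of
        {ok, Forms} -&gt;
            %% Wrap the list of forms in a single syntax tree node.
            Tree = erl_syntax:form_list(Forms),
            {ok, erl_prettypr:format(Tree, [{paper, PaperWidth}])};
        {error, Reason} -&gt;
            {error, Reason}
    end.&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;format_sketch:format_file(&quot;src/foo.erl&quot;, 100)&lt;/code&gt; from a shell should return the formatted source as a string. Source comments are most likely not carried through in this sketch; reattaching them takes extra work, for which OTP offers &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;erl_comment_scan&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;erl_recomment&lt;/code&gt;.&lt;/p&gt;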
  4360.  
  4361. &lt;h3 id=&quot;choosing-the-right-format&quot;&gt;Choosing the Right Format&lt;/h3&gt;
  4362. &lt;p&gt;Once we had the tools, we developed our very first version of the formatter (you can still find it on hex.pm as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0.0.1&lt;/code&gt;). That version simply passed the provided Erlang code through the tools and generated the code formatted with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;erl_prettypr&lt;/code&gt;.&lt;/p&gt;
  4363.  
  4364. &lt;p&gt;For a while, we considered that to be &lt;em&gt;the canonical formatting&lt;/em&gt; since, you know, it comes with OTP, right? But we (and many others like us) didn’t want our code formatted strictly as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;erl_prettypr&lt;/code&gt; outputs it.&lt;/p&gt;
  4365.  
  4366. &lt;p&gt;So, we started adding configuration options to be able to adjust the formatter to &lt;em&gt;our&lt;/em&gt; tastes. After a while, though, we realized that what we wanted was not an extension of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;erl_prettypr&lt;/code&gt;, it was &lt;em&gt;a different formatter&lt;/em&gt;. That’s when we moved from one formatter with lots of options, to a formatting &lt;strong&gt;behavior&lt;/strong&gt; with multiple implementations.&lt;/p&gt;
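
&lt;p&gt;To illustrate the idea, a formatting behavior can be as small as one callback plus a dispatcher. This is a hypothetical sketch: the module name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;formatter_sketch&lt;/code&gt; and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;format/2&lt;/code&gt; callback are assumptions made for this example, not the plugin’s real callback interface:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;%% Illustrative only: not the actual rebar3_format behavior or its callbacks.
-module(formatter_sketch).
-export([format/3]).

%% Every formatter implementation turns source text into formatted text.
-callback format(Source :: string(), Opts :: map()) -&gt; string().

%% Dispatch to whichever implementation module the configuration names.
-spec format(module(), string(), map()) -&gt; string().
format(Formatter, Source, Opts) -&gt;
    Formatter:format(Source, Opts).&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;With a shape like this, each formatting style lives in its own module, and choosing a style comes down to choosing which module name to pass in.&lt;/p&gt;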
  4367.  
  4368. &lt;p&gt;Our favorite way of formatting Erlang code is now encoded in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;default_formatter&lt;/code&gt;, but we kept the OTP-approved way alive as the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;otp_formatter&lt;/code&gt;. And now you can define your own, too.&lt;/p&gt;
  4369.  
  4370. &lt;h3 id=&quot;were-not-alone&quot;&gt;We’re not Alone&lt;/h3&gt;
  4371. &lt;p&gt;While doing our research, we found that we’re not the only ones who saw the need for an Erlang formatter. As a matter of fact, several people are working on different code formatting tools for Erlang these days:&lt;/p&gt;
  4372. &lt;ul&gt;
  4373.  &lt;li&gt;As we mentioned before, if you’re an emacs user you already have a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3&lt;/code&gt; plugin that you can use: &lt;a href=&quot;https://hex.pm/packages/rebar3_fmt&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3 fmt&lt;/code&gt;&lt;/a&gt;.&lt;/li&gt;
  4374.  &lt;li&gt;A while back, &lt;a href=&quot;http://zalastax.github.io/&quot;&gt;Pierre Krafft&lt;/a&gt; tried to just improve &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;erl_tidy&lt;/code&gt; and wrote &lt;a href=&quot;https://github.com/erlang/otp/pull/2451&quot;&gt;a PR&lt;/a&gt; for that.&lt;/li&gt;
  4375.  &lt;li&gt;In that thread, we learned that &lt;a href=&quot;https://michal.muskala.eu/&quot;&gt;Michał Muskała&lt;/a&gt; was also working on a formatter, probably called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;erlfmt&lt;/code&gt;.&lt;/li&gt;
  4376.  &lt;li&gt;Later on, we found &lt;a href=&quot;https://elixirforum.com/t/steamroller-an-opinionated-erlang-code-formatter/27532&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;steamroller&lt;/code&gt;&lt;/a&gt;, by &lt;a href=&quot;https://oldreliable.tech/&quot;&gt;Daniel Tipping&lt;/a&gt;.&lt;/li&gt;
  4377. &lt;/ul&gt;
  4378.  
  4379. &lt;p&gt;Both &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;steamroller&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;erlfmt&lt;/code&gt; are much more opinionated than our formatter, mostly because their authors aim at having consistent formatting across all Erlang codebases in the world, much like the goal of &lt;a href=&quot;https://hexdocs.pm/mix/master/Mix.Tasks.Format.html&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mix format&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
  4380.  
  4381. &lt;p&gt;We see that as a great goal, but we have a somewhat smaller one: What we want is consistent formatting &lt;strong&gt;within&lt;/strong&gt; all Erlang codebases in the world. In other words: We want all modules in &lt;em&gt;each&lt;/em&gt; project to be consistently formatted, even if they don’t share the same formatting rules with other projects.&lt;/p&gt;
  4382.  
  4383. &lt;p&gt;After all, at this point it’s hard to even define the &lt;em&gt;canonical formatting&lt;/em&gt; for all Erlang code in the wild. And if we achieve our goal and some developers format their code using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3 format&lt;/code&gt; and you (another developer) can’t read their code because it is using a different formatter than the one you’re used to (e.g. they use a comma-first ROK-style formatter), all you need to do is switch an option in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar.config&lt;/code&gt;, run &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3 format&lt;/code&gt; and magically… that ugly code now looks &lt;em&gt;your&lt;/em&gt; way.&lt;/p&gt;
  4384.  
  4385. &lt;p&gt;But enough with the history, let’s see what you can do with this tool.&lt;/p&gt;
  4386.  
  4387. &lt;hr /&gt;
  4388.  
  4389. &lt;h1 id=&quot;how-to-use-rebar3-format&quot;&gt;How to Use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3 format&lt;/code&gt;&lt;/h1&gt;
  4390. &lt;h2 id=&quot;quick-start&quot;&gt;Quick Start&lt;/h2&gt;
  4391. &lt;p&gt;Just add this to your &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar.config&lt;/code&gt; (either in your project or globally in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;~/.config/rebar3/rebar.config&lt;/code&gt;):&lt;/p&gt;
  4392.  
  4393. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;plugins&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rebar3_format&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]}.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  4394.  
  4395. &lt;p&gt;Then run&lt;/p&gt;
  4396.  
  4397. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;rebar3 format&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  4398.  
  4399. &lt;p&gt;and enjoy.&lt;/p&gt;
  4400.  
  4401. &lt;h2 id=&quot;configuration&quot;&gt;Configuration&lt;/h2&gt;
  4402. &lt;p&gt;If you don’t really like the default formatting as-is, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3 format&lt;/code&gt; can be configured using the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;format&lt;/code&gt; section of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar.config&lt;/code&gt;. There are three main options you can specify.&lt;/p&gt;
  4403.  
  4404. &lt;h3 id=&quot;what-to-format&quot;&gt;What to Format&lt;/h3&gt;
  4405. &lt;p&gt;To determine what files the formatter should format, you use the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;files&lt;/code&gt; parameter:&lt;/p&gt;
  4406.  
  4407. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;files&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;src/*.erl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;test/*.erl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]}]}.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  4408.  
  4409. &lt;h3 id=&quot;what-formatter-to-use&quot;&gt;What Formatter to Use&lt;/h3&gt;
  4410. &lt;p&gt;Unless you specify otherwise, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3 format&lt;/code&gt; will always use the &lt;em&gt;default formatter&lt;/em&gt; that’s baked into it. But if you want, you can use the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;otp_formatter&lt;/code&gt; or your own, like this:&lt;/p&gt;
  4411.  
  4412. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  4413.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;files&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;src/*.erl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;test/*.erl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]},&lt;/span&gt;
  4414.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;formatter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;otp_formatter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  4415. &lt;span class=&quot;p&quot;&gt;]}.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  4416.  
  4417. &lt;h3 id=&quot;how-to-configure-the-formatter&quot;&gt;How to Configure the Formatter&lt;/h3&gt;
  4418. &lt;p&gt;Finally, you can also set up individual options for the formatter you want to use. For instance, for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;otp_formatter&lt;/code&gt; you can change &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;paper&lt;/code&gt; (i.e. the expected max width of the formatted code):&lt;/p&gt;
  4419.  
  4420. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  4421.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;files&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;src/*.erl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;test/*.erl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]},&lt;/span&gt;
  4422.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;formatter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;otp_formatter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  4423.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;paper&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;150&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}}&lt;/span&gt;
  4424. &lt;span class=&quot;p&quot;&gt;]}.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  4425.  
  4426. &lt;p&gt;To find out the options for each provider, check out the docs that are available &lt;a href=&quot;https://github.com/AdRoll/rebar3_format/#configuration&quot;&gt;online&lt;/a&gt;.&lt;/p&gt;
  4427.  
  4428. &lt;h2 id=&quot;a-proposed-workflow&quot;&gt;A Proposed Workflow&lt;/h2&gt;
  4429. &lt;p&gt;Drawing from my experience with the Smalltalk formatter (where the code was formatted &lt;em&gt;only&lt;/em&gt; when presented to the developer, not when stored in the image itself), I want to propose a workflow for teams where each member has their own preferred code formatting style.
  4430. The idea is to take advantage of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3&lt;/code&gt; profiles and write the following in your &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar.config&lt;/code&gt; file:&lt;/p&gt;
  4431.  
  4432. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;c&quot;&gt;%% The canonical format used when pushing code to the central repository
  4433. &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  4434.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;files&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;src/*.erl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;include/*.hrl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;test/*.erl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]},&lt;/span&gt;
  4435.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;formatter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;default_formatter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  4436.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;paper&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}}&lt;/span&gt;
  4437. &lt;span class=&quot;p&quot;&gt;]}.&lt;/span&gt;
  4438. &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;profiles&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  4439.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;brujo&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  4440.        &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  4441.            &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;files&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;src/*.erl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;include/*.hrl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;test/*.erl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]},&lt;/span&gt;
  4442.            &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;formatter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rok_formatter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;% I prefer comma-first formatting
  4443. &lt;/span&gt;            &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;paper&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}}&lt;/span&gt;
  4444.        &lt;span class=&quot;p&quot;&gt;]}&lt;/span&gt;
  4445.    &lt;span class=&quot;p&quot;&gt;]},&lt;/span&gt;
  4446.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;miriam&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  4447.        &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  4448.            &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;files&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;src/*.erl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;include/*.hrl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;test/*.erl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]},&lt;/span&gt;
  4449.            &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;formatter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;default_formatter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  4450.            &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  4451.                &lt;span class=&quot;n&quot;&gt;inline_clause_bodies&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;% she doesn&apos;t like one-liners
  4452. &lt;/span&gt;                &lt;span class=&quot;n&quot;&gt;inline_items&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;all&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;% but she doesn&apos;t like long lists of items
  4453. &lt;/span&gt;            &lt;span class=&quot;p&quot;&gt;}}&lt;/span&gt;
  4454.        &lt;span class=&quot;p&quot;&gt;]}&lt;/span&gt;
  4455.    &lt;span class=&quot;p&quot;&gt;]}&lt;/span&gt;
  4456. &lt;span class=&quot;p&quot;&gt;]}.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  4457.  
  4458. &lt;p&gt;Then whenever you’re about to work on something, follow this ritual:&lt;/p&gt;
  4459.  
  4460. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;git checkout master
  4461. git checkout &lt;span class=&quot;nt&quot;&gt;-b&lt;/span&gt; my-branch
  4462. rebar3 as brujo format
  4463. &lt;span class=&quot;c&quot;&gt;# I work on my code normally&lt;/span&gt;
  4464. &lt;span class=&quot;c&quot;&gt;# Run tests and what-not&lt;/span&gt;
  4465. &lt;span class=&quot;c&quot;&gt;# Until I&apos;m ready to commit&lt;/span&gt;
  4466. rebar3 format
  4467. git commit &lt;span class=&quot;nt&quot;&gt;-am&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;Apply my changes&quot;&lt;/span&gt;
  4468. git push origin my-branch &lt;span class=&quot;nt&quot;&gt;--set-upstream&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  4469.  
  4470. &lt;p&gt;Miriam does the same but using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;as miriam&lt;/code&gt; instead of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;as brujo&lt;/code&gt;.&lt;/p&gt;
  4471.  
  4472. &lt;p&gt;That way, each of us can read the code in the style we understand best and write it exactly how we like to write it, then publish it in a consistent style that matches the rest of the project.&lt;/p&gt;
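
&lt;p&gt;Onboarding another teammate only takes one more entry in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;profiles&lt;/code&gt; list. As a minimal sketch, assuming a hypothetical teammate called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;alex&lt;/code&gt; (a made-up name, as is the width of 80) who prefers the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;otp_formatter&lt;/code&gt; mentioned earlier, the extra entry could look something like this:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;%% Hypothetical entry to add inside the existing profiles list
%% (alex and the paper value of 80 are illustrative assumptions)
{alex, [
    {format, [
        {files, [&quot;src/*.erl&quot;, &quot;include/*.hrl&quot;, &quot;test/*.erl&quot;]},
        {formatter, otp_formatter},
        {options, #{paper =&amp;gt; 80}}
    ]}
]}&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Alex would then follow the same ritual with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;as alex&lt;/code&gt;, and the canonical &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3 format&lt;/code&gt; step before committing stays unchanged.&lt;/p&gt;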
  4473.  
  4474. &lt;h1 id=&quot;examples&quot;&gt;Examples&lt;/h1&gt;
  4475. &lt;p&gt;If you want to see what the formatter can do to your code, the best place to go is the sample project on &lt;a href=&quot;https://github.com/AdRoll/rebar3_format/tree/master/test_app&quot;&gt;the repo itself&lt;/a&gt;. All sorts of examples are there with as many edge cases as we could create and find. If you know of others, &lt;strong&gt;please&lt;/strong&gt; contribute by adding them there or writing issues so we can add them.&lt;/p&gt;
  4476.  
  4477. &lt;p&gt;Even though we’re still in the process of testing and improving our tool, we have already started using it in several of our repositories. You can see the formatter in action in &lt;a href=&quot;https://github.com/AdRoll/spillway/pull/4&quot;&gt;spillway&lt;/a&gt; and &lt;a href=&quot;https://github.com/AdRoll/mero/pull/54&quot;&gt;mero&lt;/a&gt;.&lt;/p&gt;
  4478.  
  4479. &lt;p&gt;As the best code reviewers out there may notice, those PRs required some &lt;em&gt;manual adjustments&lt;/em&gt; after processing the code with the formatter. We consider that &lt;em&gt;a feature&lt;/em&gt;: That’s the formatter allowing us to spot very ugly pieces of code (e.g., too deeply nested structures) that we should refactor.&lt;/p&gt;
  4480.  
  4481. &lt;p&gt;To be clear: the formatter didn’t break our code (it has built-in verification for that). It just made it look &lt;strong&gt;extremely ugly&lt;/strong&gt;, which prompted us to beautify it.&lt;/p&gt;
  4482.  
  4483. &lt;center&gt;
  4484.    &lt;img title=&quot;Sorry, it&apos;s just too good!&quot; src=&quot;/images/post_images/rabbit-hair.gif&quot; /&gt;&lt;br /&gt;
  4485.    &lt;i&gt;Bugs Bunny - The Rabbit of Seville.&lt;/i&gt;
  4486. &lt;/center&gt;
  4487.  
  4488. &lt;hr /&gt;
  4489.  
  4490. &lt;h1 id=&quot;what-now&quot;&gt;What Now?&lt;/h1&gt;
  4491. &lt;p&gt;We’re releasing the formatter as early as we can to catch as many bugs and nuances as possible. Please try it in your code and report any bugs and new ideas you have &lt;a href=&quot;https://github.com/AdRoll/rebar3_format/issues&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
  4492.  
  4493. &lt;p&gt;We plan to keep using this ourselves, but if anybody feels like making this tool &lt;em&gt;official&lt;/em&gt; (if you’re a member of the OTP Team, this message is &lt;strong&gt;100%&lt;/strong&gt; for you 🙄), that would be amazing.&lt;/p&gt;
  4494.  
  4495. &lt;hr /&gt;
  4496.  
  4499. &lt;h1 id=&quot;appendix-a-beautiful-code&quot;&gt;Appendix A: Beautiful Code&lt;/h1&gt;
  4500. &lt;p&gt;As a &lt;em&gt;bonus track&lt;/em&gt;, I wanted to know what &lt;a href=&quot;https://medium.com/erlang-battleground/beautiful-code-254a5f8ef958?source=friends_link&amp;amp;sk=a7c610b2e13ed4aa72797832cd953ec6&quot;&gt;my favorite piece of code&lt;/a&gt; looks like when formatted by &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3 format&lt;/code&gt;. Let’s see…&lt;/p&gt;
  4501.  
  4502. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  4503.  
  4504. &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;author&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;john&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  4505. &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;author&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;paul&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  4506. &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;author&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;george&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  4507. &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;author&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ringo&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  4508.  
  4509. &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;export&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;my_life&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]).&lt;/span&gt;
  4510.  
  4511. &lt;span class=&quot;nf&quot;&gt;my_life&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;NewPlaces&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  4512.    &lt;span class=&quot;nv&quot;&gt;Places&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;db&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;get_all&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;places&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  4513.    &lt;span class=&quot;nv&quot;&gt;UpdatedPlaces&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;NewPlace&lt;/span&gt;
  4514.                     &lt;span class=&quot;p&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;NewPlace&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;NewPlaces&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  4515.                        &lt;span class=&quot;nn&quot;&gt;lists&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;member&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;NewPlace&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Places&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)],&lt;/span&gt;
  4516.    &lt;span class=&quot;nn&quot;&gt;lists&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;foreach&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Place&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  4517.                          &lt;span class=&quot;nn&quot;&gt;db&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;insert&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;places&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Place&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  4518.                  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  4519.                  &lt;span class=&quot;nv&quot;&gt;UpdatedPlaces&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  4520.    &lt;span class=&quot;nv&quot;&gt;DeletedPlaces&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Place&lt;/span&gt;
  4521.                     &lt;span class=&quot;p&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Place&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Places&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  4522.                        &lt;span class=&quot;ow&quot;&gt;not&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;lists&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;member&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Place&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;NewPlaces&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)],&lt;/span&gt;
  4523.    &lt;span class=&quot;nn&quot;&gt;db&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;delete&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;places&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;DeletedPlaces&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  4524.    &lt;span class=&quot;nv&quot;&gt;Moments&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Moment&lt;/span&gt;
  4525.               &lt;span class=&quot;p&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Place&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Places&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  4526.                  &lt;span class=&quot;nv&quot;&gt;Moment&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;places&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;moments&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Place&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)],&lt;/span&gt;
  4527.    &lt;span class=&quot;nv&quot;&gt;People&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Person&lt;/span&gt;
  4528.              &lt;span class=&quot;p&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Moment&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Moments&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  4529.                 &lt;span class=&quot;nv&quot;&gt;Person&lt;/span&gt;
  4530.                     &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;moments&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;lovers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Moment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;
  4531.                          &lt;span class=&quot;nn&quot;&gt;moments&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;friends&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Moment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)],&lt;/span&gt;
  4532.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Dead&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Living&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;lists&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;partition&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;person&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is_dead&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  4533.                                     &lt;span class=&quot;nv&quot;&gt;People&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  4534.    &lt;span class=&quot;nn&quot;&gt;lists&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;foreach&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;person&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;love&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Dead&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Living&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  4535.  
  4536.    &lt;span class=&quot;nv&quot;&gt;You&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;db&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;get_first&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;people&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  4537.    &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Person&lt;/span&gt;
  4538.          &lt;span class=&quot;p&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Person&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;People&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  4539.             &lt;span class=&quot;nn&quot;&gt;person&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;comparable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Person&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;You&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)],&lt;/span&gt;
  4540.    &lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;love&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;update&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt;
  4541.    &lt;span class=&quot;nv&quot;&gt;UpdatedMemories&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;moments&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;meaning&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Moment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;null&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  4542.                       &lt;span class=&quot;p&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Moment&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Moments&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;
  4543.    &lt;span class=&quot;nn&quot;&gt;db&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;update&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;moments&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;UpdatedMemories&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  4544.  
  4545.    &lt;span class=&quot;nf&quot;&gt;my_life&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;You&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;People&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;UpdatedMemories&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  4546.  
  4547. &lt;span class=&quot;nf&quot;&gt;my_life&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;You&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;People&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Things&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  4548.    &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;rand&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;uniform&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;of&lt;/span&gt;
  4549.      &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  4550.          &lt;span class=&quot;nn&quot;&gt;timer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sleep&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;rand&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;uniform&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  4551.          &lt;span class=&quot;nn&quot;&gt;person&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;think_about&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;People&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  4552.      &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  4553.          &lt;span class=&quot;nn&quot;&gt;timer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sleep&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;rand&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;uniform&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  4554.          &lt;span class=&quot;nn&quot;&gt;moments&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;think_about&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Things&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  4555.      &lt;span class=&quot;p&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  4556.          &lt;span class=&quot;n&quot;&gt;dont_stop_now&lt;/span&gt;
  4557.    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  4558.    &lt;span class=&quot;nn&quot;&gt;person&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;love&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;You&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  4559.    &lt;span class=&quot;nf&quot;&gt;my_life&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;You&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;People&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Things&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  4560.  
  4561. &lt;p&gt;Not bad, huh?&lt;/p&gt;
  4562.  
  4563. &lt;center&gt;
  4564.    &lt;img title=&quot;Last time, I promise&quot; src=&quot;/images/post_images/rabbit-eyes.gif&quot; /&gt;&lt;br /&gt;
  4565.    &lt;i&gt;Bugs Bunny - The Rabbit of Seville.&lt;/i&gt;
  4566. &lt;/center&gt;
  4567.  
  4568. </description>
  4569.    </item>
  4570.    
  4571.    
  4572.    
  4573.    <item>
  4574.      <title>
  4575. How NextRoll encourages Engineering and Friends to take a week off to build amazing stuff!
  4576. </title>
  4577.      <link>https://tech.nextroll.com/blog/culture/2019/11/26/hackweek-at-nextroll.html</link>
  4578.      <pubDate>Tue, 26 Nov 2019 00:00:00 -0800</pubDate>
  4579.      <author></author>
  4580.      <guid isPermaLink="false">https://tech.nextroll.com/blog/culture/2019/11/26/hackweek-at-nextroll</guid>
  4581.      <description>&lt;p&gt;NextRoll Hack Week&lt;/p&gt;
  4582.  
  4583. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;5 minute read&lt;/code&gt;&lt;/p&gt;
  4584.  
  4585. &lt;hr /&gt;
  4586. &lt;p&gt;Hello! Alex here from NextRoll!&lt;/p&gt;
  4587.  
  4588. &lt;p&gt;Don’t you ever wish there was a time reserved to build any idea you had in mind? Maybe it’s some technical debt that you’re hoping to clean up, some new language you’re hoping to play with, or just collaborating with other engineers in the organization to build something fun.
  4589. Well…NextRoll has the solution for that…and it is HACK WEEK! Yes, we get a full week of hacking! &lt;em&gt;Twice&lt;/em&gt; a year!!&lt;/p&gt;
  4590.  
  4591. &lt;h1 id=&quot;what-is-hack-week&quot;&gt;What is Hack Week?!&lt;/h1&gt;
  4592. &lt;p&gt;Hack Week is a twice-yearly engineering event where engineers and friends can work on a project of their own choice.
  4593. There are events (like a kickoff, a wrap-up with demos, and a fabulous dinner for the winners) and some rules, but outside of that, what you work on is up to you!&lt;/p&gt;
  4594.  
  4595. &lt;p&gt;We encourage anyone to join Hack Week! People from Marketing, Operations, Account Management, and more have all participated in the past. Projects have addressed pain points, technical debt house-cleaning, and MVPs of new ideas. In the past, we’ve seen folks make a video game emulator, an MVP for the next NextRoll product - or people just take the opportunity to play with new frameworks, languages, ML models, and more!&lt;/p&gt;
  4596.  
  4597. &lt;p&gt;The event is also quite a logistical challenge! Coordinating Hack Week between our local Engineering group in San Francisco, remote engineers, and the Product organization takes multiple coordinators. Fortunately, we have a team of Hack Week coordinators – with Jose Hernandez coordinating for Remote Engineering, Zachary Swetz coordinating from the product perspective, and our advisor Jessica Grist helping us ensure everything runs smoothly.&lt;/p&gt;
  4598.  
  4599. &lt;h1 id=&quot;what-do-i-get-from-participating-in-hack-week&quot;&gt;What do I get from participating in Hack Week?&lt;/h1&gt;
  4600. &lt;p&gt;When folks join Hack Week, they not only get the excitement of working with new people for a short amount of time and the ability to work on something they have wanted to work on, but they also get SWAG! Yes! We give out cool swag shirts like these in the summer:&lt;/p&gt;
  4601.  
  4602. &lt;p&gt;&lt;img src=&quot;/images/post_heroes/hackshirt.png&quot; alt=&quot;Hack shirt!&quot; /&gt;&lt;/p&gt;
  4603.  
  4604. &lt;p&gt;And for the winter we give out cool holiday swag like these:
  4605. &lt;img src=&quot;/images/post_images/hackweek_mugs.png&quot; alt=&quot;hackswag&quot; /&gt;&lt;/p&gt;
  4606.  
  4607. &lt;p&gt;Other swag we have given out in the past includes winter socks, fingerless gloves, and beer steins.&lt;/p&gt;
  4608.  
  4609. &lt;h1 id=&quot;what-if-i-dont-know-what-to-build&quot;&gt;What if I don’t know what to build?&lt;/h1&gt;
  4610. &lt;p&gt;Need ideas to figure out what to build? We have solutions for that as well! Introducing Bounty Awards and brainstorming sessions!&lt;/p&gt;
  4611.  
  4612. &lt;h2 id=&quot;bounty-awards&quot;&gt;Bounty Awards&lt;/h2&gt;
  4613. &lt;p&gt;Different organizations at NextRoll have different needs. They have their own pain points, and also their own ideas of what they would like to see built during Hack Week. To encourage engineers and friends to work on these projects, organizations can sponsor a bounty: a prize awarded to the project that best solves their primary pain point.&lt;/p&gt;
  4614.  
  4615. &lt;p&gt;For instance, Marketing, inspired by our &lt;a href=&quot;https://www.nextroll.com/careers/culture&quot;&gt;culture creature&lt;/a&gt;, sponsored the “Most Customer-Centric Award” to encourage projects to directly benefit our customers.
  4616. &lt;img src=&quot;/images/post_images/dog.png&quot; alt=&quot;dog&quot; /&gt;&lt;/p&gt;
  4617.  
  4618. &lt;p&gt;Account Management sponsored the “Do More With Less Award” – also inspired by our culture creature Beaver.
  4619. &lt;img src=&quot;/images/post_images/beaver.png&quot; alt=&quot;beaver&quot; /&gt;&lt;/p&gt;
  4620.  
  4621. &lt;p&gt;These awards help give engineers and friends ideas for what to build for Hack Week. Again, the choice of which project to build is really up to you!&lt;/p&gt;
  4622.  
  4623. &lt;h2 id=&quot;brainstorming-jam-sesh&quot;&gt;Brainstorming Jam Sesh!&lt;/h2&gt;
  4624.  
  4625. &lt;p&gt;A month before each event, we host two lunch meetings for Hack Week brainstorming – where people can come in and share ideas that they would like to build for Hack Week. The key here is to let the ideas flow freely, so people can hear different proposals and build on top of one another. To allow more opportunities for folks to speak their mind, we split the local and remote engineers into smaller groups for brainstorming.&lt;/p&gt;
  4626.  
  4627. &lt;p&gt;After the groups have spent 20 minutes in discussion, we regroup for the last 10 minutes to share as a whole. The goal is to hear as many ideas as we can.&lt;/p&gt;
  4628.  
  4629. &lt;p&gt;This is where crazy ideas get shared. For example:&lt;/p&gt;
  4630. &lt;ol&gt;
  4631.  &lt;li&gt;Food recommendations around the Mission District (where our office is).&lt;/li&gt;
  4632.  &lt;li&gt;Bathroom playlist recommendation.&lt;/li&gt;
  4633.  &lt;li&gt;Free up our meeting rooms with some Google Calendar/Slack integration.&lt;/li&gt;
  4634. &lt;/ol&gt;
  4635.  
  4636. &lt;p&gt;Of course, not all of these ideas get to be implemented, but it’s impressively cool to see how creative people can get when they’re given the freedom to think outside the box.&lt;/p&gt;
  4637.  
  4638. &lt;h2 id=&quot;hack-week-begins&quot;&gt;Hack Week Begins!&lt;/h2&gt;
  4639.  
  4640. &lt;h3 id=&quot;kick-start&quot;&gt;Kick Start&lt;/h3&gt;
  4641. &lt;p&gt;Kick start is where people share the idea they would like to work on. It usually is a 1-minute description of what the project is, why it’s interesting, and a call for members to join the team’s Slack channel to participate in hacking.&lt;/p&gt;
  4642.  
  4643. &lt;p&gt;People usually present a single slide similar to this:
  4644. &lt;img src=&quot;/images/post_images/batchie-stats.png&quot; alt=&quot;slide&quot; /&gt;&lt;/p&gt;
  4645.  
  4646. &lt;p&gt;Again, it’s mostly about generating interest so that more people participate in the project.&lt;/p&gt;
  4647.  
  4648. &lt;h3 id=&quot;readysethack&quot;&gt;Ready…set…HACK!&lt;/h3&gt;
  4649. &lt;p&gt;During hacking, we ask each team to create a Slack channel for their project, and we also ask hackers to hold a daily stand-up to stay accountable for the project they work on.&lt;/p&gt;
  4650.  
  4651. &lt;h3 id=&quot;hack-demos-and-party&quot;&gt;Hack Demos and Party!&lt;/h3&gt;
  4652. &lt;p&gt;&lt;img src=&quot;/images/post_images/hw_jump.gif&quot; alt=&quot;Alt Text&quot; /&gt;&lt;/p&gt;
  4653.  
  4654. &lt;p&gt;Demo is where people showcase their creations! This is a time where we celebrate all the projects that are built. Everyone gets to share the stuff they have been working on. Along with the presentation, we also have wine, beer, and finger food to celebrate! We also have judges representing Product, Engineering, Marketing, and Remote to evaluate the winners.&lt;/p&gt;
  4655.  
  4656. &lt;p&gt;Cool Projects that came out along the way that we can share (since they’re open sourced!)…&lt;/p&gt;
  4657.  
  4658. &lt;p&gt;&lt;img src=&quot;/images/post_images/small_collage_hw.png&quot; alt=&quot;open_sourced_projects&quot; /&gt;&lt;/p&gt;
  4659.  
  4660. &lt;h4 id=&quot;erlang-formatter&quot;&gt;Erlang Formatter&lt;/h4&gt;
  4661. &lt;blockquote&gt;
  4662.  &lt;p&gt;A Rebar3 plugin for code formatting! (&lt;a href=&quot;https://github.com/AdRoll/rebar3_format&quot;&gt;repo here&lt;/a&gt;)&lt;/p&gt;
  4663. &lt;/blockquote&gt;
  4664.  
  4665. &lt;h4 id=&quot;batchie-patchie&quot;&gt;Batchie Patchie&lt;/h4&gt;
  4666. &lt;blockquote&gt;
  4667.  &lt;p&gt;A service built on top of AWS Batch for job monitoring! (&lt;a href=&quot;https://github.com/AdRoll/batchiepatchie&quot;&gt;code here&lt;/a&gt; / previous blog post about Batchiepatchie &lt;a href=&quot;http://tech.nextroll.com/blog/data/2018/08/08/running-jobs-with-aws-batch.html&quot;&gt;here&lt;/a&gt;)&lt;/p&gt;
  4668. &lt;/blockquote&gt;
  4669.  
  4670. &lt;h4 id=&quot;traildb&quot;&gt;TrailDB&lt;/h4&gt;
  4671. &lt;blockquote&gt;
  4672.  &lt;p&gt;An efficient way to store event-based data (&lt;a href=&quot;https://github.com/traildb/traildb&quot;&gt;code here&lt;/a&gt;)&lt;/p&gt;
  4673. &lt;/blockquote&gt;
  4674.  
  4675. &lt;h4 id=&quot;python-hyperloglog&quot;&gt;Python HyperLogLog&lt;/h4&gt;
  4676. &lt;blockquote&gt;
  4677.  &lt;p&gt;Python implementation of HyperLogLog (&lt;a href=&quot;https://github.com/AdRoll/python-hll&quot;&gt;code here&lt;/a&gt;)&lt;/p&gt;
  4678. &lt;/blockquote&gt;
  4679.  
  4680. &lt;h3 id=&quot;winners&quot;&gt;Winners&lt;/h3&gt;
  4681. &lt;p&gt;In the end, everyone is a winner! Everyone had fun building something awesome, and we all had wine and cheese and swag to celebrate it. To put the icing on the cake, our Hack Week judges get together to decide the winners. Prizes are given to the most product-driven project, the most technically driven project, and the bounty winners!&lt;/p&gt;
  4682.  
  4683. &lt;h3 id=&quot;wrap-up&quot;&gt;Wrap up!&lt;/h3&gt;
  4684. &lt;p&gt;And that’s how we spend two weeks of every year! We hope it provides a glimpse of engineering life at NextRoll!&lt;/p&gt;
  4685. </description>
  4686.    </item>
  4687.    
  4688.    
  4689.    
  4690.    <item>
  4691.      <title>
  4692. How NextRoll leverages AWS Batch for daily business operations
  4693. </title>
  4694.      <link>https://tech.nextroll.com/blog/dev/2019/11/19/aws-batch-at-nextroll.html</link>
  4695.      <pubDate>Tue, 19 Nov 2019 00:00:00 -0800</pubDate>
  4696.      <author></author>
  4697.      <guid isPermaLink="false">https://tech.nextroll.com/blog/dev/2019/11/19/aws-batch-at-nextroll</guid>
  4698.      <description>&lt;p&gt;How NextRoll processes petabytes of data using AWS Batch.&lt;/p&gt;
  4699.  
  4700. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;15 minute read&lt;/code&gt;&lt;/p&gt;
  4701.  
  4702. &lt;hr /&gt;
  4703. &lt;p&gt;In this blog post, we review how NextRoll uses AWS Batch for processing data, and how we benefit from this platform. We will start with an overview of the company and explain our needs for data pipelines. Then we’ll talk about how we use Batch, reviewing the advantages and disadvantages of this technology. Finally, we will show you Batchiepatchie, the tool that we built to overcome the challenges presented by Batch.&lt;/p&gt;
  4704.  
  4705. &lt;h1 id=&quot;about-nextroll&quot;&gt;About NextRoll&lt;/h1&gt;
  4706. &lt;p&gt;NextRoll is a digital marketing and advertising company. Currently, we serve more than &lt;a href=&quot;https://www.adroll.com/why-adroll-digital-marketing-platform&quot;&gt;37,000 brands&lt;/a&gt;. These brands from all over the world have trusted us to run marketing campaigns for them.&lt;/p&gt;
  4707.  
  4708. &lt;p&gt;Our IntentMap features 1.2 billion shopper profiles and trillions of intent data points. Let me emphasize: It’s &lt;strong&gt;1.2 billion profiles&lt;/strong&gt; for different shoppers across the web. The number of profiles is growing every day.&lt;/p&gt;
  4709.  
  4710. &lt;p&gt;To keep up with the demand, our machine learning engine makes 80,000,000,000 predictions every single day. Most machine learning systems would not make billions of predictions in their lifetime, but our system needs to scale up quickly to keep up with demand, and the number above is for an ordinary day, not a busy one like Christmas or Thanksgiving.&lt;/p&gt;
  4711.  
  4712. &lt;p&gt;To serve our customers, we are connected to 500 different supply sources across the internet. The sources provide a range of ways to reach customers through banner ads, social media, email, and onsite personalization.&lt;/p&gt;
  4713.  
  4714. &lt;p&gt;We have three different business units. Each business unit targets a different type of customer and
  4715. fulfills different needs.&lt;/p&gt;
  4716.  
  4717. &lt;h1 id=&quot;why-batch&quot;&gt;Why Batch?&lt;/h1&gt;
  4718. &lt;p&gt;Given the numbers above, the data generated by digital advertising is far more than anyone can fit on a laptop, unless you have a massive laptop with terabytes of disk space. That is the first reason that pushed us to look for cloud solutions to orchestrate our tasks.&lt;/p&gt;
  4719.  
  4720. &lt;p&gt;In 2019 alone, Batch and Spot market usage saved us &lt;strong&gt;$4,000,000&lt;/strong&gt;. That figure comes from comparing our setup with running the same jobs on EC2 on-demand instances. The savings would be even bigger compared with the alternative of hosting servers on-premises. They would be smaller compared with EMR or other big data platforms that also use the Spot market, but Batch would still cost less.&lt;/p&gt;
  4721.  
  4722. &lt;p&gt;Besides cost savings, we have other reasons to use Batch. Freedom of stack is one of them. As long as you can package your requirements into a Dockerfile and build a Docker image from it, you can deploy it to Batch. This enables us to use a wide variety of technologies in our stack. We have jobs written in C/C++, Python, Rust, Go, Haskell, Java, and other programming languages.&lt;/p&gt;
  4723.  
  4724. &lt;p&gt;Freedom in the way that data is processed is another reason. To process this amount of data, we have
  4725. introduced some file formats, one of which is open source: &lt;a href=&quot;https://github.com/traildb/traildb&quot;&gt;TrailDB&lt;/a&gt;. These
  4726. file formats need special handling for creation and processing, which we can do on Batch. By comparison, Hadoop and Spark enforce certain types and inputs.&lt;/p&gt;
  4727.  
  4728. &lt;p&gt;The last item is deployment. The packaged code lives in a docker image, and deployment is as easy as pulling the image from Elastic Container Registry. Besides, since the image is being pulled from ECR to Batch nodes, the whole process ends up being pretty fast.&lt;/p&gt;
  4729.  
  4730. &lt;p&gt;In a nutshell, these are our reasons to use Batch:&lt;/p&gt;
  4731. &lt;ol&gt;
  4732.  &lt;li&gt;Cost Saving&lt;/li&gt;
  4733.  &lt;li&gt;Scale of data&lt;/li&gt;
  4734.  &lt;li&gt;Freedom of stack&lt;/li&gt;
  4735.  &lt;li&gt;Custom data processing formats&lt;/li&gt;
  4736.  &lt;li&gt;Ease of deployment&lt;/li&gt;
  4737. &lt;/ol&gt;
  4738.  
  4739. &lt;h2 id=&quot;growth-of-batch-usage&quot;&gt;Growth of Batch usage&lt;/h2&gt;
  4740. &lt;p&gt;Since Batch was introduced in 2017, we have been rapidly adopting it for more and more use cases.
  4741. In the table below, you can see a comparison of Batch usage between 2017 and 2019:&lt;/p&gt;
  4742.  
  4743. &lt;p&gt;&lt;img src=&quot;/images/post_images/batch_growth.png&quot; alt=&quot;Batch Growth&quot; /&gt;&lt;/p&gt;
  4744.  
  4745. &lt;p&gt;This shows that we grew about 10 times on each metric in only two years. How did this happen? First, we need to account for the growth of NextRoll during this time, but the stats also reveal a clear trend in the adoption of this technology across the company. Since teams are autonomous and free to use any tech they prefer to accomplish their goals, their choice of Batch shows how promising the results were. It also shows that Batch can scale to our needs when we need it to.&lt;/p&gt;
  4746.  
  4747. &lt;p&gt;In the chart below, you can see the number of jobs launched on different dates.
  4748. &lt;img src=&quot;/images/post_images/batch_date_count.png&quot; alt=&quot;Jobs Count Growth&quot; /&gt;&lt;/p&gt;
  4749.  
  4750. &lt;p&gt;We are constantly improving our pipelines, removing the jobs that are not needed with recent refactors or improving the job run times, and that is why you sometimes see a decrease in the number of jobs. But generally, the trend is upward.&lt;/p&gt;
  4751.  
  4752. &lt;h1 id=&quot;how-are-we-using-batch&quot;&gt;How are we using Batch?&lt;/h1&gt;
  4753. &lt;p&gt;We previously published a &lt;a href=&quot;http://tech.nextroll.com/blog/data/2018/08/08/running-jobs-with-aws-batch.html&quot;&gt;blog post&lt;/a&gt;
  4754. about how we submit jobs to the AWS Batch environment. In this section, however, we
  4755. take a more general view and also discuss the challenges of Batch.&lt;/p&gt;
  4756.  
  4757. &lt;h2 id=&quot;data-diagram&quot;&gt;Data Diagram&lt;/h2&gt;
  4758. &lt;p&gt;The data flow for Batch starts with Jenkins. After checking out the source code, Jenkins builds the Docker image from the Dockerfile and pushes it to ECR. ECR keeps all of the images that we need to run on Batch.&lt;/p&gt;
  4759.  
  4760. &lt;p&gt;&lt;img src=&quot;/images/post_images/batch_data_flow.png&quot; alt=&quot;Batch data flow&quot; /&gt;&lt;/p&gt;
  4761.  
  4762. &lt;p&gt;We have some containers running on ECS that control and organize our jobs. The first one is the scheduler, which kicks off the jobs based on the time of day. Luigi acts as a dependency checker and ensures the dependencies of each job are met. It also checks whether the job has already run. If it has not, and all the dependencies are in place, the job is submitted to Batch. We also run our own software to monitor the jobs on Batch, called &lt;a href=&quot;https://github.com/AdRoll/batchiepatchie&quot;&gt;BatchiePatchie&lt;/a&gt;. More about BatchiePatchie below.&lt;/p&gt;
  4763.  
  4764. &lt;p&gt;When a job kicks off on Batch, the Batch environment provisions an instance, pulls the Docker image from ECR, and runs the job’s command line on it.&lt;/p&gt;
  4765.  
  4766. &lt;h2 id=&quot;sample-job-on-batch&quot;&gt;Sample job on Batch&lt;/h2&gt;
  4767. &lt;p&gt;Now that we have seen how we submit jobs on Batch, let’s see how one of our sample jobs works on Batch. One of the use cases for us is to convert data from CSV format to &lt;a href=&quot;https://github.com/traildb/traildb&quot;&gt;TrailDB&lt;/a&gt;. The job that we use for this purpose is called Baker. Baker converts data from one form to another and applies some filtering logic to it. It’s like a list comprehension but on a large scale.&lt;/p&gt;
  4768.  
  4769. &lt;p&gt;&lt;img src=&quot;/images/post_images/batch_baker_job.png&quot; alt=&quot;Batch job - Baker&quot; /&gt;&lt;/p&gt;
  4770.  
  4771. &lt;p&gt;In this example, we have a bunch of CSV files that need to be converted. First, the planner job runs at a specific time of day. The planner reviews the CSV files and their contents. Based on the amount of data and the split size, it divides them into several splits. The number of splits varies across days and hours. For each split, a Baker job is launched on Batch. The Baker jobs perform their conversion and save their output to S3.&lt;/p&gt;
  4772.  
  4773. &lt;p&gt;Next, the merge jobs run. These act as reduce functions and reduce data to a specific number of shards that are processed later on.&lt;/p&gt;
  4774.  
  4775. &lt;p&gt;In this example, you see that the data determines the scale of processing. If, for instance, we have only one CSV, one baker and one merger are enough. However, given the amount of our data, we have hundreds of Bakers running every day, and this map-reduce job is running on AWS Batch reliably in our pipeline all the time.&lt;/p&gt;
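
&lt;p&gt;The planner, Baker, and merge stages described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Baker’s actual code: the function names, the offset-based splitting rule, and the round-robin sharding are all hypothetical, and in production each split would be a separate Batch job writing to S3 rather than an in-process call.&lt;/p&gt;

```python
def plan_splits(file_sizes, split_size):
    """Planner: group input files into splits of roughly split_size bytes.

    file_sizes maps a (hypothetical) CSV file name to its size in bytes.
    A file goes to the split whose byte window contains its start offset,
    so a split can run somewhat over split_size.
    """
    splits = {}
    offset = 0
    for name, size in sorted(file_sizes.items()):
        splits.setdefault(offset // split_size, []).append(name)
        offset += size
    return [splits[index] for index in sorted(splits)]


def bake(rows, keep, convert):
    """Baker: filter and convert one split, like a list comprehension."""
    return [convert(row) for row in rows if keep(row)]


def merge(baked_outputs, num_shards):
    """Merge: reduce all baked outputs into a fixed number of shards."""
    shards = [[] for _ in range(num_shards)]
    position = 0
    for output in baked_outputs:
        for record in output:
            # Round-robin placement; the real sharding rule is not given.
            shards[position % num_shards].append(record)
            position += 1
    return shards


if __name__ == '__main__':
    print(plan_splits({'a.csv': 60, 'b.csv': 60, 'c.csv': 30}, 100))
    print(bake([1, 2, 3, 4], keep=lambda row: row % 2 == 0, convert=str))
    print(merge([['a', 'b'], ['c']], 2))
```

&lt;p&gt;With a single small CSV file, the planner returns one split, so one Baker and one merge are enough. More data simply produces more splits, which is how the data determines the scale of processing.&lt;/p&gt;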
  4776.  
  4777. &lt;h2 id=&quot;purpose-of-the-jobs&quot;&gt;Purpose of the jobs&lt;/h2&gt;
  4778. &lt;p&gt;Going back to what was mentioned before, we make at least 80B predictions every day, which generate 80B events.
  4779. Each event is about 1KB, so the prediction events alone amount to about 80TB of information per day. Jobs that process this much
  4780. data are good candidates to run on Batch.&lt;/p&gt;
  4781.  
  4782. &lt;h3 id=&quot;attribution&quot;&gt;Attribution&lt;/h3&gt;
  4783. &lt;p&gt;Every day, we register about 500,000 conversions. In marketing, a conversion is the event that the customer cares about the most, which is usually a sale. So you can read it as 500,000 sales. For each conversion event, we need to look back 30 days to find the event that caused that conversion. That requires us to process 30 days of 80B events each, which adds up to petabytes of information. This is critical for providing reports to our customers.&lt;/p&gt;
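
&lt;p&gt;As a back-of-the-envelope check of the figures quoted above, the arithmetic can be written out as follows. The only assumptions are that “about 1KB” is taken as exactly 1,000 bytes and that decimal units are used (1 TB = 10^12 bytes, 1 PB = 10^15 bytes).&lt;/p&gt;

```python
EVENTS_PER_DAY = 80 * 10**9   # 80B prediction events per day (from the text)
BYTES_PER_EVENT = 1000        # "about 1KB" per event, assumed to be 1,000 bytes
LOOKBACK_DAYS = 30            # attribution look-back window (from the text)

daily_bytes = EVENTS_PER_DAY * BYTES_PER_EVENT
window_bytes = daily_bytes * LOOKBACK_DAYS

daily_tb = daily_bytes / 10**12    # decimal terabytes
window_pb = window_bytes / 10**15  # decimal petabytes

print(f'{daily_tb:.0f} TB per day, {window_pb:.1f} PB per 30-day window')
```

&lt;p&gt;Under these assumptions, one day of prediction events is 80 TB, and a 30-day look-back covers roughly 2.4 PB.&lt;/p&gt;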
  4784.  
  4785. &lt;h3 id=&quot;machine-learning&quot;&gt;Machine Learning&lt;/h3&gt;
  4786. &lt;p&gt;We mentioned that our AI system needs to make tens of billions of predictions every day. The model that powers the
  4787. predictions needs to be refreshed with fresh data periodically. The training for that model also happens on Batch. Batch, in my opinion,
  4788. is a great tool for training machine learning models. In our case, training is a single batch process that runs once, and its output is then
  4789. used many times. So models can be trained on Batch and served from a tiny web server.&lt;/p&gt;
  4790.  
  4791. &lt;p&gt;Generally speaking, Batch is a good fit for:&lt;/p&gt;
  4792. &lt;ul&gt;
  4793.  &lt;li&gt;Periodic or nightly pipelines&lt;/li&gt;
  4794.  &lt;li&gt;Automatic scaling processes&lt;/li&gt;
  4795.  &lt;li&gt;Tasks that are flexible in the stack&lt;/li&gt;
  4796.  &lt;li&gt;Tasks that can benefit from the use of different instance types&lt;/li&gt;
  4797.  &lt;li&gt;Orchestrating instances to run tasks&lt;/li&gt;
  4798. &lt;/ul&gt;
  4799.  
  4800. &lt;p&gt;On the topic of instance types, it should be noted that some of our
  4801. queues are configured to use “optimal” instance types, which means that they are provisioned with the tiniest instances possible. On the other hand, we have queues
  4802. that are using instances as big as x1.16xlarge to run heavy jobs. The freedom of choosing the right instance type for a job is
  4803. a great advantage for us.&lt;/p&gt;
  4804.  
  4805. &lt;p&gt;In a few words, we can say Batch is great for orchestration. The platform enables you to provision your tasks the way you
  4806. like on the platform you like with a very short delay.&lt;/p&gt;
  4807.  
  4808. &lt;h2 id=&quot;challenges&quot;&gt;Challenges&lt;/h2&gt;
  4809. &lt;p&gt;Using Batch was not free of troubles for us. Just like any other technology, we had some challenges that we needed to
  4810. overcome. We feel it is best to keep the conversation honest and mention the drawbacks of this technology as well.&lt;/p&gt;
  4811.  
  4812. &lt;h3 id=&quot;monitoring&quot;&gt;Monitoring&lt;/h3&gt;
  4813. &lt;p&gt;We submit millions of jobs, and keeping track of them is a big challenge for us. Finding the job that caused trouble,
  4814. or searching the jobs to understand the situation when things are not going as planned, is not possible using
  4815. the AWS Batch console. That led us to develop our own tooling around it.&lt;/p&gt;
  4816.  
  4817. &lt;h3 id=&quot;out-of-instance-on-the-spot-market&quot;&gt;Out of instance on the spot market&lt;/h3&gt;
  4818. &lt;p&gt;Some of our queues use specific instance types. On occasion, one of those instance types has suddenly become unavailable. It is possible that other AWS customers are provisioning those instance types or that certain issues are happening across the data centers. The reason is usually not transparent to us. Whatever the cause, we then need to find other available instance types. That hunt is not easy, and we have only been able to solve it through trial and error.&lt;/p&gt;
  4819.  
  4820. &lt;p&gt;The instance-type shortage does not happen often. Normally, we see a case like that once every three months or so. We
  4821. have raised the problem with the Spot team, and they said they are working on a product to resolve it.&lt;/p&gt;
  4822.  
  4823. &lt;h3 id=&quot;disk-cleanup&quot;&gt;Disk cleanup&lt;/h3&gt;
  4824. &lt;p&gt;Normally the life span of instances on Batch is very limited: the instance is provisioned, a Docker
  4825. image is deployed, a job is launched, and the instance is terminated right after. However, since we
  4826. send so many jobs to the queues, we have cases where instances stay alive for a long time, serving tens of different
  4827. kinds of jobs. If a job does not clean up its temporary files properly, the next job may fail because of a disk shortage.&lt;/p&gt;
  4828.  
  4829. &lt;p&gt;We worked around this problem by baking our own AMIs with logic to aggressively clean up
  4830. dangling Docker images and unused volumes. The cleanup makes sure the current job has enough disk space to run.
  4831. This simple change has resolved the problem for us.&lt;/p&gt;
  4832.  
  4833. &lt;h3 id=&quot;manual-work-for-mapreduce-jobs&quot;&gt;Manual work for Map/Reduce jobs&lt;/h3&gt;
  4834. &lt;p&gt;There are some frameworks for big data processing such as Spark and Hadoop. For machine learning, there are
  4835. frameworks for training like SageMaker. When using those tools, you only need to write the map/reduce logic and the
  4836. framework handles the boilerplate code to scale as needed. On Batch, however, the framework is only handling
  4837. the orchestration for you. That can be an advantage or a disadvantage. In some applications, that can be considered an
  4838. advantage, since you can be in control of every detail. But, sometimes having to focus so much on the details can be counterproductive.&lt;/p&gt;
  4839.  
  4840. &lt;p&gt;Some frameworks like Hadoop provide a file system (i.e. HDFS) for processing big data. When using Batch, you
  4841. can use S3 or Lustre to handle the shared storage.&lt;/p&gt;
  4842.  
  4843. &lt;h3 id=&quot;managed-queues&quot;&gt;Managed queues&lt;/h3&gt;
  4844. &lt;p&gt;Each queue on Batch can be Managed or Unmanaged. For unmanaged queues, you would need to handle the scaling yourself, which is very tricky. We do use Managed queues. For them, AWS handles the scaling of the underlying ECS cluster transparently. There is an issue, though. Batch sets the desired vCPUs for each queue, and in our experience, it takes a while for the queue to catch up with the provided load.&lt;/p&gt;
  4845.  
  4846. &lt;p&gt;To solve this problem, we have added some logic to BatchiePatchie to set the minimum vCPUs required to run all active jobs. This way the queue handles the scaling much faster.&lt;/p&gt;
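
&lt;p&gt;The idea of sizing the minimum vCPUs from the active jobs can be sketched as below. This is not Batchiepatchie’s actual code: the job record layout, the function name, and the choice of which job states count as active are assumptions made for illustration, and the resulting number would still have to be applied to the compute environment’s minimum vCPU setting.&lt;/p&gt;

```python
# AWS Batch job states treated as "active" in this sketch (an assumption).
ACTIVE_STATUSES = {'SUBMITTED', 'PENDING', 'RUNNABLE', 'STARTING', 'RUNNING'}


def required_min_vcpus(jobs, max_vcpus):
    """Return the vCPUs needed to run every active job at once.

    jobs is a list of dicts with hypothetical 'status' and 'vcpus' keys.
    The result is capped at max_vcpus so it never exceeds the limit
    configured for the compute environment.
    """
    needed = sum(job['vcpus'] for job in jobs if job['status'] in ACTIVE_STATUSES)
    return min(needed, max_vcpus)


if __name__ == '__main__':
    jobs = [
        {'status': 'RUNNING', 'vcpus': 4},
        {'status': 'RUNNABLE', 'vcpus': 8},
        {'status': 'SUCCEEDED', 'vcpus': 16},  # finished, so not counted
    ]
    print(required_min_vcpus(jobs, max_vcpus=256))
```

&lt;p&gt;Raising the minimum this way asks for capacity up front instead of waiting for the desired vCPUs to catch up with the load. The minimum also has to be lowered again once the jobs finish, otherwise idle instances keep running.&lt;/p&gt;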
  4847.  
  4848. &lt;h1 id=&quot;batchiepatchie&quot;&gt;Batchiepatchie&lt;/h1&gt;
  4849. &lt;p&gt;Ever since we started using Batch in 2017, we have had some issues with it. During HackWeek, our engineers decided to take a shot at making life easier for themselves. They started working on a project to report on jobs in a way that makes more sense for us. Lorenzo Hernandez came up with the idea, and Mikko Juola suggested the name Batchiepatchie. It is a funny name, but it provides a serious service.&lt;/p&gt;
  4850.  
  4851. &lt;p&gt;Mikko spent another hack week taking Batchiepatchie to the next level and made it open source. It is now available on &lt;a href=&quot;https://github.com/AdRoll/batchiepatchie&quot;&gt;GitHub&lt;/a&gt;. In this section, we review the main functionality of this software.&lt;/p&gt;
  4852.  
  4853. &lt;h2 id=&quot;monitoring-1&quot;&gt;Monitoring&lt;/h2&gt;
  4854. &lt;p&gt;We already mentioned that one of our challenges with Batch is monitoring. Since our data pipeline consists of several jobs that modify and transform data in several ways, monitoring is a big challenge.&lt;/p&gt;
  4855.  
  4856. &lt;p&gt;The AWS Batch console provides some views. You can review the job queues and compute environments. Because of queue activity, it takes a while for the Batch dashboard to load, and when it loads, it can only show certain jobs, spread across different views. Batchiepatchie, however, shows the granular status of the jobs in a single view. For instance, if you have submitted 30 jobs, of which 20 succeeded, 5 are running, and 5 failed, you can see all of them in one consolidated place.&lt;/p&gt;
  4857.  
  4858. &lt;p&gt;In the screenshots below, you can see both the AWS Batch console and the Batchiepatchie view for the same job queue.&lt;/p&gt;
  4859.  
  4860. &lt;table&gt;
  4861.  &lt;tbody&gt;
  4862.    &lt;tr&gt;
  4863.      &lt;td&gt;&lt;img src=&quot;/images/post_images/aws_batch_1.png&quot; alt=&quot;Batch console 1&quot; /&gt;&lt;/td&gt;
  4864.      &lt;td&gt;&lt;img src=&quot;/images/post_images/bp_1.png&quot; alt=&quot;Batchiepatchie 1&quot; /&gt;&lt;/td&gt;
  4865.    &lt;/tr&gt;
  4866.  &lt;/tbody&gt;
  4867. &lt;/table&gt;
  4868.  
  4869. &lt;h2 id=&quot;logs&quot;&gt;Logs&lt;/h2&gt;
  4870. &lt;p&gt;Batch only recently (in 2019) added a feature that allows us to review the logs of jobs. Given a job-id, the Batch console provides a link to CloudWatch, where the logs are available. As useful as that feature is, it has several shortcomings:&lt;/p&gt;
  4871. &lt;ul&gt;
  4872.  &lt;li&gt;When scrolling up and down, you need to wait for logs to be loaded&lt;/li&gt;
  4873.  &lt;li&gt;In the log window, you do not have access to job details&lt;/li&gt;
  4874.  &lt;li&gt;Searching in the logs is not easy&lt;/li&gt;
  4875. &lt;/ul&gt;
  4876.  
  4877. &lt;p&gt;Batchiepatchie has been written by developers for developers. It provides all the logs generated by the process in one view which can be scrolled and searched easily. The exceptions and errors are also easily accessible.&lt;/p&gt;
  4878.  
  4879. &lt;p&gt;The screenshots below show the Batch console and Batchiepatchie log views for comparison:&lt;/p&gt;
  4880.  
  4881. &lt;table&gt;
  4882.  &lt;tbody&gt;
  4883.    &lt;tr&gt;
  4884.      &lt;td&gt;&lt;img src=&quot;/images/post_images/aws_batch_2.png&quot; alt=&quot;Batch console 2&quot; /&gt;&lt;/td&gt;
  4885.      &lt;td&gt;&lt;img src=&quot;/images/post_images/bp_2.png&quot; alt=&quot;Batchiepatchie 2&quot; /&gt;&lt;/td&gt;
  4886.    &lt;/tr&gt;
  4887.  &lt;/tbody&gt;
  4888. &lt;/table&gt;
  4889.  
  4890. &lt;h2 id=&quot;search&quot;&gt;Search&lt;/h2&gt;
  4891. &lt;p&gt;Jobs can be searched by job-id in the AWS Batch console. If you happen to have the job-id, you can access the job details and logs through the console. However, finding the job-id of a job that was submitted earlier takes some digging. Most of the time, we know the job name or the command line of the job that we want to check. That’s when Batchiepatchie’s search feature is most useful.&lt;/p&gt;
  4892.  
  4893. &lt;p&gt;When you type a keyword in the search bar, Batchiepatchie looks through all the fields, including the job name and command line, to list matching jobs that were submitted before or are running at the moment. Check the screenshots below for comparison.&lt;/p&gt;
  4894.  
  4895. &lt;table&gt;
  4896.  &lt;tbody&gt;
  4897.    &lt;tr&gt;
  4898.      &lt;td&gt;&lt;img src=&quot;/images/post_images/aws_batch_3.png&quot; alt=&quot;Batch console - 3&quot; /&gt;&lt;/td&gt;
  4899.      &lt;td&gt;&lt;img src=&quot;/images/post_images/bp_3.png&quot; alt=&quot;batchiepatchie - 3&quot; /&gt;&lt;/td&gt;
  4900.    &lt;/tr&gt;
  4901.  &lt;/tbody&gt;
  4902. &lt;/table&gt;
  4903.  
  4904. &lt;h2 id=&quot;holistic-view&quot;&gt;Holistic view&lt;/h2&gt;
  4905. &lt;p&gt;During the most recent HackWeek, in 2019, Joey added functionality to show the flow of jobs in queues. In the stats view, the number of jobs with certain properties is counted and plotted. This is very useful for getting a quick overview of the queue status and seeing what is happening overall.&lt;/p&gt;
  4906.  
  4907. &lt;p&gt;In the screenshot below, you can see the number of jobs that failed during the week. Just by looking at the chart, you can tell something was going on during October 16th. This could be helpful to catch silent failures or troubleshoot errors.&lt;/p&gt;
  4908.  
  4909. &lt;p&gt;&lt;img src=&quot;/images/post_images/bp_4.png&quot; alt=&quot;Batchiepatchie - 4&quot; /&gt;&lt;/p&gt;
  4910.  
  4911. &lt;h2 id=&quot;cost-analysis&quot;&gt;Cost Analysis&lt;/h2&gt;
  4912. &lt;p&gt;AWS Cost Explorer provides many breakdowns of AWS usage costs. By tagging different queues, you can query the cost for each queue separately. The cost of each job is not provided, however. To find out which jobs are the most costly in our day-to-day usage, we use Batchiepatchie’s database.&lt;/p&gt;
  4913.  
  4914. &lt;p&gt;Batchiepatchie stores all status data for jobs and instances in a PostgreSQL database. Data that is not available in the UI can be queried from the database directly. Using queries, we can estimate the dollar cost of each job.&lt;/p&gt;
  4915.  
  4916. &lt;p&gt;The tables in that database are listed below.&lt;/p&gt;
  4917.  
  4918. &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;batchiepatchie=&amp;gt; \d+
  4919.                            List of relations
  4920.            Name              |   Type   |     Owner      |    Size  
  4921. ------------------------------+----------+----------------+-----------
  4922. activated_job_queues          | table    | batchiepatchie | 16 kB      
  4923. compute_environment_event_log | table    | batchiepatchie | 381 MB    
  4924. goose_db_version              | table    | batchiepatchie | 40 kB      
  4925. goose_db_version_id_seq       | sequence | batchiepatchie | 8192 bytes
  4926. instance_event_log            | table    | batchiepatchie | 28 GB      
  4927. instances                     | table    | batchiepatchie | 1537 MB    
  4928. job_status_events             | table    | batchiepatchie | 40 kB      
  4929. job_summary_event_log         | table    | batchiepatchie | 296 MB    
  4930. jobs                          | table    | batchiepatchie | 25 GB      
  4931. task_arns_to_instance_info    | table    | batchiepatchie | 2452 MB    
  4932. (10 rows)
  4933. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
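&lt;p&gt;As a rough illustration of what a per-job estimate could look like, here is a minimal Python sketch. Everything in it is an assumption made for the example: the JobRun fields, the hourly price, and the idea of charging a job for its share of the instance’s vCPUs are hypothetical and do not reflect Batchiepatchie’s actual schema or our real queries.&lt;/p&gt;

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class JobRun:
    """Hypothetical row joining a job with the instance it ran on."""
    job_name: str
    started_at: datetime
    stopped_at: datetime
    job_vcpus: int              # vCPUs requested by the job (assumed field)
    instance_vcpus: int         # vCPUs available on the instance (assumed field)
    instance_hourly_usd: float  # hourly instance price (assumed, looked up elsewhere)


def estimate_job_cost(run):
    """Estimate cost as runtime x hourly price x the job's share of the instance."""
    hours = (run.stopped_at - run.started_at).total_seconds() / 3600
    share = run.job_vcpus / run.instance_vcpus
    return hours * run.instance_hourly_usd * share


# Made-up example: a 2-hour job using 4 of 16 vCPUs on a $0.80/hour instance.
example = JobRun(
    job_name="nightly-report",
    started_at=datetime(2019, 10, 16, 1, 0),
    stopped_at=datetime(2019, 10, 16, 3, 0),
    job_vcpus=4,
    instance_vcpus=16,
    instance_hourly_usd=0.80,
)
print(round(estimate_job_cost(example), 2))  # 0.4
```

&lt;p&gt;Splitting by vCPU share is only one possible allocation rule; splitting by memory, or charging idle instance time back to the queue, would give different numbers.&lt;/p&gt;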
  4934.  
  4935. &lt;p&gt;If you use AWS Batch, you should check out Batchiepatchie! And if you use Batchiepatchie, please share your experience with us. We would appreciate any contribution to this repo and would love to hear from you.&lt;/p&gt;
  4936. </description>
  4937.    </item>
  4938.    
  4939.    
  4940.    
  4941.    <item>
  4942.      <title>
  4943. Interning at NextRoll
  4944. </title>
  4945.      <link>https://tech.nextroll.com/blog/culture/2019/10/15/interning-at-adroll.html</link>
  4946.      <pubDate>Tue, 15 Oct 2019 00:00:00 -0700</pubDate>
  4947.      <author></author>
  4948.      <guid isPermaLink="false">https://tech.nextroll.com/blog/culture/2019/10/15/interning-at-adroll</guid>
  4949.      <description>&lt;p&gt;Being a part of the team at NextRoll detailed through the eyes of an intern on the analytics team.&lt;/p&gt;
  4950.  
  4951. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;7 minute read&lt;/code&gt;&lt;/p&gt;
  4952.  
  4953. &lt;hr /&gt;
  4954.  
  4955. &lt;p&gt;‘Cali or Bust’. Anyone in the math faculty’s co-op program at the University of Waterloo can tell you what that means – wanting to land a sweet internship in California or nothing at all. I’m currently in my 4th co-op (out of 6), and since my first application period, I had become accustomed to the norm of dreaming about a term in California. Everyone else’s aspirations had somehow shaped my own and I started aiming towards achieving a job in the tech hub of the world.&lt;/p&gt;
  4956.  
  4957. &lt;p&gt;One of the things that’s great about the program I’m in is the frequent switch between school terms and work terms. When handing in countless assignments becomes your life’s purpose and biggest immediate stressor, looking forward to 4 or 8 months of working and a lack of anxiety-inducing exams helps students like myself pull through the difficulties of academic life. Similarly, when the work routine and waking up every day to commute to a 9 to 5 job start to feel mundane, the independence of a self-regulated school schedule starts to feel appealing again.&lt;/p&gt;
  4958.  
  4959. &lt;p&gt;In May of this year, applications for the Fall co-op term began. I let myself be pickier than usual this term because I knew I didn’t want to apply to jobs based solely on their location, but also wanted to find job descriptions that resonated with my goals. Scattered throughout thousands of postings were a few data-driven roles that I was interested in. The posting for an analytics internship at a company called AdRoll (now NextRoll) felt like a potentially good match. RTB? ‘What’s that?’ I wondered, looking it up to discover the interesting field of auction tech and real-time bidding algorithms. ‘What’s a golden bagel?’ ‘Look at these cute culture creatures on their website.’ In particular, I liked the owl, which represented hiring great people and helping each other grow. Growth, on both individual and company levels, is hugely valued at NextRoll. As a student trying to jump into the workforce, encouragement and resources to expand my skillset are the most valuable things I can be provided with.&lt;/p&gt;
  4960.  
  4961. &lt;p&gt;&lt;img src=&quot;/images/post_images/owl.jpg&quot; alt=&quot;Culture owl&quot; /&gt;&lt;/p&gt;
  4962.  
  4963. &lt;p&gt;So, yeah, NextRoll seemed like a company with interesting work, a diverse and engaging culture, and also happened to be in San Francisco. Application submitted.&lt;/p&gt;
  4964.  
  4965. &lt;p&gt;Over the next few weeks of May into June, I juggled lectures and assignments and phone calls and technical challenges and interviews and info sessions. The more companies I interacted with, the more I felt I would fit in best with NextRoll. The recruiters were all friendly and really cared about matching the right interns to the right positions. They held a great social in Waterloo where I got to interact with employees from the team I had applied to as well as other teams. I stopped caring about ‘Cali or Bust’ and started caring about doing good data science work and meeting inspiring people at this company.&lt;/p&gt;
  4966.  
  4967. &lt;p&gt;The interviewing period came to an end in June and I ended up receiving an offer, made an excited phone call to my dad, and started planning the co-op term with NextRoll.&lt;/p&gt;
  4968.  
  4969. &lt;p&gt;The first few weeks of the internship were full of training sessions for the NewRollers that joined NextRoll in September. The trainings introduced the new recruits to every team at the company, not just the one we would be working on, and these were great overviews to show us how the company functioned. We even got to have a discussion with the CEO, Toby Gabriner, which showed me that every employee at every level is seen as important. The diversity of the group I was in trainings with was obvious to me, with faces of different ages and countries and experiences sitting next to me. Everyone was outgoing and intelligent and ready to contribute their talents towards growing this company. There were 4 other UW co-ops aside from me, which was cool because I hadn’t worked closely with other UW students at my past 3 co-op jobs. These co-ops turned into some really great friends.&lt;/p&gt;
  4970.  
  4971. &lt;p&gt;&lt;img src=&quot;/images/post_images/interns-at-fair.JPG&quot; alt=&quot;The interns&quot; /&gt;&lt;/p&gt;
  4972.  
  4973. &lt;p&gt;NextRoll has a great office in the Mission District. It’s a sunny, open-concept space with lots of good snacks, company dinners and – OMG – a bunch of cute dogs. The hours are flexible; as long as you get your work done, you are entitled to a fairly independent schedule.&lt;/p&gt;
  4974.  
  4975. &lt;p&gt;&lt;img src=&quot;/images/post_images/sf-office.jpg&quot; alt=&quot;The office&quot; /&gt;&lt;/p&gt;
  4976.  
  4977. &lt;p&gt;My work involves answering questions or conducting analyses for the product or engineering teams. There is a lot of data to work with and a variety of tools at your disposal to get what you need to do done. Transitioning into new companies so often becomes a ritual, but setup and acclimating to different norms remains difficult in the beginning. Everyone on the analytics team I am on was kind enough to have a 1 on 1 chat with me to introduce themselves and the team from their perspective. Being a part of this collection of smart, accomplished, and yet down to earth people is inspiring. Every day, I have many questions, and rather than feel self-conscious about how much I feel I need to learn, I find myself being provided with encouragement and everlasting guidance. Experiencing this level of support motivates me to one day become a helpful and knowledgeable full-timer.&lt;/p&gt;
  4978.  
  4979. &lt;p&gt;I value being at NextRoll because of not just the good work, but the engaging company events, the community service program, inclusive groups like Women in Tech, and of course the super dog-friendly office. In my few months of being here I’ve been lucky enough to visit a good range of other companies headquartered in the area, and I can say that not every company has all of these things that I now look for.&lt;/p&gt;
  4980.  
  4981. &lt;p&gt;&lt;img src=&quot;/images/post_images/nr-gives-back.jpg&quot; alt=&quot;PHC&quot; /&gt;&lt;/p&gt;
  4982.  
  4983. &lt;p&gt;Working at a company in California shouldn’t be the goal.
  4984. Working at a company with values that align with yours should be the goal. Working at a company with a culture and job responsibilities that make you excited to go to work should be the goal. A company that emphasizes diversity, has resource groups, and appreciates work-life balance should be the goal.&lt;/p&gt;
  4985.  
  4986. &lt;p&gt;To anyone in the search for a job, be open minded in finding a match that makes you feel fulfilled regardless of location. Sure, there may be a higher density of such companies in the state of California, but I think they exist everywhere. All my internships have been in different cities, each one its own amazing opportunity and learning experience. Don’t just ‘Cali or Bust’. Find your NextRoll.&lt;/p&gt;
  4987. </description>
  4988.    </item>
  4989.    
  4990.    
  4991.    
  4992.    <item>
  4993.      <title>
  4994. HyperLogLog in Python
  4995. </title>
  4996.      <link>https://tech.nextroll.com/blog/dev/2019/10/01/hll-in-python.html</link>
  4997.      <pubDate>Tue, 01 Oct 2019 00:00:00 -0700</pubDate>
  4998.      <author></author>
  4999.      <guid isPermaLink="false">https://tech.nextroll.com/blog/dev/2019/10/01/hll-in-python</guid>
  5000.      <description>&lt;p&gt;We open-sourced a Python library for HLLs compatible with postgresql-hll.&lt;/p&gt;
  5001.  
  5002. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;5 minute read&lt;/code&gt;&lt;/p&gt;
  5003.  
  5004. &lt;hr /&gt;
  5005. &lt;h2 id=&quot;what-is-python-hll&quot;&gt;What is python-hll?&lt;/h2&gt;
  5006.  
  5007. &lt;p&gt;We recently open-sourced &lt;a href=&quot;https://github.com/AdRoll/python-hll&quot;&gt;python-hll&lt;/a&gt;, which is an implementation of
  5008. &lt;a href=&quot;http://algo.inria.fr/flajolet/Publications/FlFuGaMe07.pdf&quot;&gt;HyperLogLog&lt;/a&gt; whose goal is to be
  5009. &lt;a href=&quot;https://github.com/aggregateknowledge/hll-storage-spec&quot;&gt;storage compatible&lt;/a&gt;
  5010. with &lt;a href=&quot;https://github.com/aggregateknowledge/java-hll&quot;&gt;java-hll&lt;/a&gt;, &lt;a href=&quot;https://github.com/aggregateknowledge/js-hll&quot;&gt;js-hll&lt;/a&gt;,
  5011. and &lt;a href=&quot;https://github.com/citusdata/postgresql-hll&quot;&gt;postgresql-hll&lt;/a&gt;. At NextRoll, we use these libraries to do fast
  5012. counting of unique values in PostgreSQL and Java, but we were dismayed that there was no Python library for it.
  5013. So we decided to make one.&lt;/p&gt;
  5014.  
  5015. &lt;h2 id=&quot;what-are-hlls&quot;&gt;What are HLLs?&lt;/h2&gt;
  5016.  
  5017. &lt;p&gt;In a &lt;a href=&quot;http://tech.nextroll.com/blog/dev/2019/08/06/hll-in-postgresql.html&quot;&gt;previous blog post&lt;/a&gt;, we talked about
  5018. how we use HLLs in PostgreSQL at NextRoll. As a quick recap, HLLs let you quickly count unique values. You
  5019. basically throw a bunch of values (duplicates included) into a blob called an HLL. Then you can ask the HLL approximately how many unique
  5020. values it has:&lt;/p&gt;
  5021.  
  5022. &lt;center&gt;
  5023. &lt;img alt=&quot;Adding values to an HLL&quot; src=&quot;/images/post_images/one-hll.png&quot; width=&quot;500&quot; /&gt;
  5024. &lt;/center&gt;
  5025.  
  5026. &lt;p&gt;Not only that, but you can also union HLL blobs together, then ask the result how many unique values it has:&lt;/p&gt;
  5027.  
  5028. &lt;center&gt;
  5029. &lt;img alt=&quot;Unioning two HLLs together&quot; src=&quot;/images/post_images/two-hlls.png&quot; width=&quot;500&quot; /&gt;
  5030. &lt;/center&gt;
  5031.  
  5032. &lt;p&gt;Here’s how it works using our Python library:&lt;/p&gt;
  5033.  
  5034. &lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;python_hll.hll&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HLL&lt;/span&gt;
  5035. &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;mmh3&lt;/span&gt;
  5036.  
  5037. &lt;span class=&quot;n&quot;&gt;hll1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HLL&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;13&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  5038. &lt;span class=&quot;n&quot;&gt;hll2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HLL&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;13&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  5039.  
  5040. &lt;span class=&quot;n&quot;&gt;hll1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_raw&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mmh3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;hash&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;17&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  5041. &lt;span class=&quot;n&quot;&gt;hll1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_raw&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mmh3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;hash&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;17&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  5042. &lt;span class=&quot;n&quot;&gt;hll1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_raw&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mmh3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;hash&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;31&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  5043. &lt;span class=&quot;n&quot;&gt;hll1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_raw&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mmh3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;hash&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;5&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  5044.  
  5045. &lt;span class=&quot;n&quot;&gt;cardinality&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;hll1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cardinality&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  5046.  
  5047. &lt;span class=&quot;c1&quot;&gt;# Output: 3
  5048. &lt;/span&gt;
  5049. &lt;span class=&quot;n&quot;&gt;hll2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_raw&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mmh3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;hash&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;6&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  5050. &lt;span class=&quot;n&quot;&gt;hll2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_raw&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mmh3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;hash&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;6&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  5051. &lt;span class=&quot;n&quot;&gt;hll2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_raw&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mmh3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;hash&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;31&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  5052. &lt;span class=&quot;n&quot;&gt;hll2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_raw&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mmh3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;hash&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;5&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  5053.  
  5054. &lt;span class=&quot;n&quot;&gt;cardinality&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;hll2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cardinality&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  5055.  
  5056. &lt;span class=&quot;c1&quot;&gt;# Output: 3
  5057. &lt;/span&gt;
  5058. &lt;span class=&quot;n&quot;&gt;hll1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;union&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;hll2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  5059.  
  5060. &lt;span class=&quot;n&quot;&gt;cardinality&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;hll1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cardinality&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  5061.  
  5062. &lt;span class=&quot;c1&quot;&gt;# Output: 4
  5063. &lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
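&lt;p&gt;To build some intuition for why duplicates are ignored and why a union works, here is a toy HyperLogLog in pure Python. It is only a sketch under stated assumptions: the ToyHLL class, its SHA-1 hashing, and its register layout are made up for illustration. It is not how python-hll is implemented, and it is not storage compatible with it.&lt;/p&gt;

```python
import hashlib
import math


class ToyHLL:
    """Toy HyperLogLog for illustration only (hypothetical, not python-hll)."""

    def __init__(self, log2m=13):
        self.log2m = log2m
        self.m = 1 << log2m                 # number of registers
        self.registers = [0] * self.m

    def add(self, value):
        # Hash to 64 bits; SHA-1 is an arbitrary choice for this sketch.
        digest = hashlib.sha1(value.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        index = h & (self.m - 1)            # low bits choose a register
        rest = h >> self.log2m              # remaining bits determine the rank
        if rest:
            rank = (rest & -rest).bit_length()  # 1-based position of lowest set bit
        else:
            rank = 64 - self.log2m + 1
        # Each register keeps the largest rank seen, so duplicates change nothing.
        self.registers[index] = max(self.registers[index], rank)

    def union(self, other):
        # Register-wise max: the same state as one HLL that saw every value.
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]

    def cardinality(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)   # bias constant, valid for m >= 128
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return self.m * math.log(self.m / zeros)
        return raw


hll1, hll2 = ToyHLL(), ToyHLL()
for value in ["17", "17", "31", "5"]:
    hll1.add(value)
for value in ["6", "6", "31", "5"]:
    hll2.add(value)
hll1.union(hll2)
print(hll1.cardinality())  # an estimate of the number of distinct values
```

&lt;p&gt;Because a union is just a register-wise max, merging two HLLs never needs the original values, which is what makes pre-aggregated HLL blobs cheap to combine.&lt;/p&gt;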
  5064.  
  5065. &lt;p&gt;And then you can export the HLL to a string that works with postgresql-hll:&lt;/p&gt;
  5066.  
  5067. &lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;python_hll.util&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NumberUtil&lt;/span&gt;
  5068. &lt;span class=&quot;n&quot;&gt;hll_bytes&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;hll1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;to_bytes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  5069. &lt;span class=&quot;n&quot;&gt;output&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\\&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;x&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NumberUtil&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;to_hex&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;hll_bytes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;hll_bytes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  5070.  
  5071. &lt;span class=&quot;c1&quot;&gt;# \x128D7F00000000201E4B390000000027FA7CC000000000531A35E4000000006E6F0B04
  5072. &lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  5073.  
  5074. &lt;h2 id=&quot;how-we-built-and-tested-it&quot;&gt;How We Built and Tested It&lt;/h2&gt;
  5075.  
  5076. &lt;p&gt;We built python-hll as part of NextRoll’s June 2019 Hack Week, a time when engineers take some time
  5077. to build things that interest them personally. As we only had a week, we thought it would be
  5078. expedient to do a fairly literal translation/port of &lt;a href=&quot;https://github.com/aggregateknowledge/java-hll&quot;&gt;java-hll&lt;/a&gt;
  5079. to Python.&lt;/p&gt;
  5080.  
  5081. &lt;p&gt;To minimize changes, we decided to &lt;a href=&quot;https://github.com/AdRoll/python-hll/blob/44349efa8ca35f3ce6a33260f5f270b0a655f921/python_hll/util.py#L107&quot;&gt;internally represent&lt;/a&gt;
  5082. bytes as Java-style bytes (-128 to 127) rather than true Python bytes (0 to 255). Issues we ran into
  5083. included: &lt;a href=&quot;https://github.com/AdRoll/python-hll/blob/44349efa8ca35f3ce6a33260f5f270b0a655f921/python_hll/util.py#L86&quot;&gt;how to translate Java’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/code&gt;&lt;/a&gt;
  5084. into Python, &lt;a href=&quot;https://github.com/AdRoll/python-hll/blob/44349efa8ca35f3ce6a33260f5f270b0a655f921/python_hll/util.py#L130&quot;&gt;how to handle integer overflows from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;lt;&amp;lt;&lt;/code&gt;&lt;/a&gt;
  5085. identically to Java so the tests pass (actually we were quite surprised that so many integer overflows were happening in
  5086. the Java unit tests), and how to make sure the tests pass in both Python 2.7 and Python 3.&lt;/p&gt;
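&lt;p&gt;For readers who have not hit these porting issues before, the sketch below shows one way to mimic Java’s signed bytes, its unsigned right shift, and its overflowing left shift in Python. The helper names are hypothetical and the code assumes 64-bit Java longs; it illustrates the general technique and is not a copy of the linked python-hll functions.&lt;/p&gt;

```python
MASK_64 = (1 << 64) - 1


def to_java_byte(value):
    """Map a Python byte (0..255) to Java's signed byte range (-128..127)."""
    return value - 256 if value > 127 else value


def to_python_byte(value):
    """Map a Java-style signed byte (-128..127) back to 0..255."""
    return value & 0xFF


def to_signed_64(value):
    """Wrap an arbitrary Python int the way a Java long wraps on overflow."""
    value &= MASK_64
    return value - (1 << 64) if value >= (1 << 63) else value


def unsigned_right_shift(value, shift):
    """Emulate Java's unsigned right shift on a long (zero-fills from the left)."""
    # Java only uses the low 6 bits of the shift distance for longs.
    return to_signed_64((value & MASK_64) >> (shift & 63))


def left_shift(value, shift):
    """Emulate Java's left shift on a long, including overflow wraparound."""
    return to_signed_64(value << (shift & 63))


print(to_java_byte(200))               # -56
print(to_python_byte(-1))              # 255
print(unsigned_right_shift(-1, 60))    # 15
print(left_shift(1, 63))               # -9223372036854775808
```

&lt;p&gt;Python integers never overflow, so the masking step is what reproduces the wraparound that the Java tests rely on.&lt;/p&gt;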
  5087.  
  5088. &lt;p&gt;Fortunately, postgresql-hll has an extensive set of &lt;a href=&quot;https://github.com/citusdata/postgresql-hll/tree/master/sql/data&quot;&gt;test data&lt;/a&gt;
  5089. with 22,614 data points that we were able to use to &lt;a href=&quot;https://github.com/AdRoll/python-hll/tree/master/tests/data&quot;&gt;test our library&lt;/a&gt;.
  5090. In addition, we also ported the
  5091. &lt;a href=&quot;https://github.com/aggregateknowledge/java-hll/tree/master/src/test/java/net/agkn/hll&quot;&gt;unit tests&lt;/a&gt;
  5092. from java-hll &lt;a href=&quot;https://github.com/AdRoll/python-hll/tree/master/tests&quot;&gt;to Python&lt;/a&gt;.&lt;/p&gt;
  5093.  
  5094. &lt;p&gt;Unfortunately, one drawback to our python-hll library is that it is quite slow compared to java-hll.
  5095. For example, in Java, &lt;a href=&quot;https://github.com/aggregateknowledge/java-hll/blob/master/src/test/java/net/agkn/hll/serialization/HLLSerializationTest.java&quot;&gt;HLLSerializationTest&lt;/a&gt;
  5096. takes 12 seconds to run, while in Python, the equivalent &lt;a href=&quot;https://github.com/AdRoll/python-hll/blob/master/tests/test_hll_serialization.py&quot;&gt;test_hll_serialization&lt;/a&gt;
  5097. takes 1.5 hours to run, which is about 450x slower. It should be okay if only a few HLL operations are needed,
  5098. but for many HLL operations you may find that it could slow your program down. Test it and see.&lt;/p&gt;
  5099.  
  5100. &lt;h2 id=&quot;acknowledgements&quot;&gt;Acknowledgements&lt;/h2&gt;
  5101.  
  5102. &lt;p&gt;Thanks to the 7 engineers who translated code and tests from Java to Python for this project at NextRoll’s June 2019 Hack Week:&lt;/p&gt;
  5103.  
  5104. &lt;ul&gt;
  5105.  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/jonathanaquino&quot;&gt;Jon Aquino&lt;/a&gt;&lt;/li&gt;
  5106.  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/kushagra391/&quot;&gt;Kushagra Verma&lt;/a&gt;&lt;/li&gt;
  5107.  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/alexcleu/&quot;&gt;Alex Leu&lt;/a&gt;&lt;/li&gt;
  5108.  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/michaeltran10/&quot;&gt;Michael Tran&lt;/a&gt;&lt;/li&gt;
  5109.  &lt;li&gt;&lt;a href=&quot;https://github.com/rodrigoadroll&quot;&gt;Rodrigo Westrupp&lt;/a&gt;&lt;/li&gt;
  5110.  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/subramaniansridharan/&quot;&gt;Sridharan Subramanian&lt;/a&gt;&lt;/li&gt;
  5111.  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/piyush-srivastava-5343a63/&quot;&gt;Piyush Srivastava&lt;/a&gt;&lt;/li&gt;
  5112. &lt;/ul&gt;
  5113. </description>
  5114.    </item>
  5115.    
  5116.    
  5117.    
  5118.    <item>
  5119.      <title>
  5120. Can Engineering principles solve sales problems?
  5121. </title>
  5122.      <link>https://tech.nextroll.com/blog/culture/2019/09/17/engineering-principles.html</link>
  5123.      <pubDate>Tue, 17 Sep 2019 00:00:00 -0700</pubDate>
  5124.      <author></author>
  5125.      <guid isPermaLink="false">https://tech.nextroll.com/blog/culture/2019/09/17/engineering-principles</guid>
  5126.      <description>&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;5-7 minute read&lt;/code&gt;&lt;/p&gt;
  5127.  
  5128. &lt;hr /&gt;
  5129.  
  5130. &lt;blockquote&gt;
  5131.  &lt;p&gt;“Onboarding customers is taking too long!”&lt;/p&gt;
  5132. &lt;/blockquote&gt;
  5133.  
  5134. &lt;blockquote&gt;
  5135.  &lt;p&gt;“Our customers are feeling frustrated!”&lt;/p&gt;
  5136. &lt;/blockquote&gt;
  5137.  
  5138. &lt;blockquote&gt;
  5139.  &lt;p&gt;“Our sellers are missing their quotas!”&lt;/p&gt;
  5140. &lt;/blockquote&gt;
  5141.  
  5142. &lt;blockquote&gt;
  5143.  &lt;p&gt;“Q4, our busiest season of the year, is right around the corner, and we need a solution ASAP!”&lt;/p&gt;
  5144. &lt;/blockquote&gt;
  5145.  
  5146. &lt;p&gt;All of the above complaints were funneled into the Solutions Engineering team at NextRoll. We were not providing support that met the standards expected from our team. In turn, our sales teams and our customers were suffering. If we did not solve some of these challenges quickly, we could’ve missed out on substantial revenue for Q4.&lt;/p&gt;
  5147.  
  5148. &lt;p&gt;The Solutions Engineering (SE) team at NextRoll is responsible for the technical aspects of onboarding clients, debugging post-launch platform issues, and building tools to automate manual tasks. At our core, we sit at the junction between sales and engineering, communicating problems bi-directionally. We represent the voice of the client within engineering teams, but we also provide engineering expertise for sales teams. When a bug impacting clients is discovered, the SE team conveys the impact and priority to the engineering team. When an SE is onboarding a client and is being pressured to rush, it is imperative that the SE advocates for what is best for the client, technically.&lt;/p&gt;
  5149.  
  5150. &lt;p&gt;When we first started learning about the problems our stakeholders – the sales teams – were experiencing with using our services, my response was, &lt;em&gt;“Oh! Of course the end-user doesn’t understand how to use our system!”&lt;/em&gt;. Patrick Mee, our EVP of Engineering, patiently waded through this initial defensiveness I felt about our team. He scheduled meetings with our stakeholders and continuously displayed real empathy for their concerns. Through these meetings, I was able to remove myself from the SE team and instead place myself in the end user’s perspective.&lt;/p&gt;
  5151.  
  5152. &lt;p&gt;After accepting that our end users had concerns, I could actually begin to listen to their complaints. Even though the heart of these issues was people and processes, the solution naturally came to me from thinking it through with an engineering lens. My background is in systems engineering, and the first step in building any system is to write down the use cases. How would an end user use our services as a system?&lt;/p&gt;
  5153.  
  5154. &lt;p&gt;The previous user story flowed like this:&lt;/p&gt;
  5155.  
  5156. &lt;ol&gt;
  5157.  &lt;li&gt;Customer has a question&lt;/li&gt;
  5158.  &lt;li&gt;Customer reaches out to their account manager/self service chat support representative (Tier 1)&lt;/li&gt;
  5159.  &lt;li&gt;Account manager/chat rep reaches out to Support Engineers (Tier 2)&lt;/li&gt;
  5160.  &lt;li&gt;Support Engineer escalates to Solutions Engineer (Tier 3) where needed&lt;/li&gt;
  5161.  &lt;li&gt;Solutions engineer contacts engineering if bug is found&lt;/li&gt;
  5162.  &lt;li&gt;Engineering provides resolution - bug fixed/won’t fix, potential workarounds, and any associated timelines&lt;/li&gt;
  5163.  &lt;li&gt;Solutions Engineer provides this feedback back to customers via account managers&lt;/li&gt;
  5164. &lt;/ol&gt;
  5165.  
  5166. &lt;h2 id=&quot;the-interfaces-are-the-weaknesses&quot;&gt;The interfaces are the weaknesses&lt;/h2&gt;
  5167. &lt;p&gt;The above user story shows a minimum of seven handoffs between teams to solve one issue. If you’ve ever played telephone, then you know information is lost as it transfers between parties. That’s because in any system, the interfaces are the weaknesses. The fewer handoffs there are between systems, the less information is lost between components. Here we had both component and interface issues. We started thinking about which handoffs to eliminate to reduce information transfers. We thought, “let’s reduce turnaround times for customers, making our sellers and customers happy at the same time”. Win-win, right?&lt;/p&gt;
  5168.  
  5169. &lt;p&gt;We first removed the requirement that support engineers go through solutions engineers before reaching out to engineering teams. Due to some legacy baggage around requests not reaching the right channels, support engineers had been restricted from creating bug tickets for engineering and from reaching out to those engineering teams in Slack channels.&lt;/p&gt;
  5170.  
  5171. &lt;p&gt;Now, the user story flows like this:&lt;/p&gt;
  5172.  
  5173. &lt;ol&gt;
  5174.  &lt;li&gt;Customer has a question&lt;/li&gt;
  5175.  &lt;li&gt;Customer reaches out to their account manager/self service chat support representative&lt;/li&gt;
  5176.  &lt;li&gt;Account manager/chat rep reaches out to Support Engineers&lt;/li&gt;
  5177.  &lt;li&gt;Support engineer contacts engineering if bug is found&lt;/li&gt;
  5178.  &lt;li&gt;Engineering provides resolution - bug fixed/won’t fix, potential workarounds, and any associated timelines&lt;/li&gt;
  5179.  &lt;li&gt;Support Engineer provides this feedback back to customers via account managers&lt;/li&gt;
  5180. &lt;/ol&gt;
  5181.  
  5182. &lt;h2 id=&quot;perform-qa-on-individual-components-prior-to-integrating-components&quot;&gt;Perform QA on individual components prior to integrating components&lt;/h2&gt;
  5183.  
  5184. &lt;p&gt;Another bit of feedback we received was that Support Engineers were occasionally failing to provide sufficient information to the teams they were working with to onboard clients. This led to multiple rounds of back-and-forth between SEs, Account Managers, and customers, introducing potential churn risks at a critical phase in the client relationship. We needed to ensure that communications from our team were thorough, accurate, and identified any needed data immediately. To ensure every onboarding started on the right path, we needed to improve the quality of written information provided on the initial response in the ticket.&lt;/p&gt;
  5185.  
  5186. &lt;p&gt;We introduced code reviews. Code reviews, in engineering, are meant to provide a systematic way to improve code. In Slack, we created peer review channels, pairing a support engineer with a solutions engineer. Solutions Engineers reviewed the onboarding instructions written by support engineers to ensure quality before they were sent to new clients. We noticed that the overall time to resolution was reduced significantly. By approaching our service as a system and performing Quality Assurance on the individual components prior to integration, we were able to have a smoother deploy to production.&lt;/p&gt;
  5187.  
  5188. &lt;h2 id=&quot;focus-on-the-right-metrics&quot;&gt;Focus on the right metrics&lt;/h2&gt;
  5189.  
  5190. &lt;p&gt;The support engineers were being held accountable to a 24-hour SLA (service level agreement). The support engineer needed to respond within 24 hours to a ticket, regardless of the quality of responses within the ticket. Unfortunately, time to response as an SLA was emphasized over other metrics, leading to a lack of concern for other metrics such as the number of iterations per ticket.&lt;/p&gt;
  5191.  
  5192. &lt;p&gt;Frankly, our end users were not concerned about receiving an incomplete response back within 24 hours. We shifted the focus from “time to response” to reducing the number of iterations between the requester and the SE on the ticket. We found that the higher the number of iterations on a ticket, the more likely end users were frustrated by customer service.&lt;/p&gt;
  5193.  
  5194. &lt;h2 id=&quot;gathering-feedback-post-changes&quot;&gt;Gathering Feedback Post Changes&lt;/h2&gt;
  5195.  
  5196. &lt;p&gt;The feedback we received from the sales teams after we implemented the changes truly honored the amazing work our Solutions Engineering teams do. The Support Engineering and Solutions Engineering teams at AdRoll work incredibly hard to improve the quality of service we provide to our customers. Hearing positive feedback from the same teams we provide service to makes us want to celebrate a great release, just like the engineers.&lt;/p&gt;
  5197.  
  5198. &lt;hr /&gt;
  5199. </description>
  5200.    </item>
  5201.    
  5202.    
  5203.    
  5204.    <item>
  5205.      <title>
  5206. HyperLogLog in PostgreSQL Amazon Aurora RDS
  5207. </title>
  5208.      <link>https://tech.nextroll.com/blog/dev/2019/08/06/hll-in-postgresql.html</link>
  5209.      <pubDate>Tue, 06 Aug 2019 00:00:00 -0700</pubDate>
  5210.      <author></author>
  5211.      <guid isPermaLink="false">https://tech.nextroll.com/blog/dev/2019/08/06/hll-in-postgresql</guid>
  5212.      <description>&lt;p&gt;We moved from HLLs stored in HBase to HLLs stored in Postgres with great results.&lt;/p&gt;
  5213.  
  5214. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;5-10 minute read&lt;/code&gt;&lt;/p&gt;
  5215.  
  5216. &lt;hr /&gt;
  5217. &lt;h2 id=&quot;what-is-hyperloglog-hll&quot;&gt;What is HyperLogLog (HLL)?&lt;/h2&gt;
  5218. &lt;p&gt;HyperLogLog (HLL) is a useful and interesting probabilistic data structure used to count unique values in a given data set with good accuracy and speed. Normally, counting unique values exactly requires memory proportional to the number of unique values. This becomes a problem when working with large data sets. HyperLogLog solves this problem by allowing us to trade memory consumption for (tunable) precision, making it possible to estimate cardinalities larger than 1 billion with a standard error of 2% using only 1.5 kilobytes of memory. For more on how this works, you can refer to our past blog post &lt;a href=&quot;http://tech.adroll.com/blog/data/2013/07/10/hll-minhash.html&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
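&lt;p&gt;To make the memory/precision trade-off concrete, here is a minimal Python sketch. It assumes the commonly cited HyperLogLog error estimate of roughly 1.04 divided by the square root of the number of registers, and it assumes 6-bit registers. The function names and the choice of 2048 registers are ours for illustration and are not taken from any particular implementation.&lt;/p&gt;

```python
import math


def hll_standard_error(num_registers):
    """Approximate relative standard error of a HyperLogLog sketch."""
    return 1.04 / math.sqrt(num_registers)


def hll_memory_bytes(num_registers, bits_per_register=6):
    """Memory for the registers alone; 6 bits per register is an assumption."""
    return num_registers * bits_per_register // 8


registers = 2 ** 11  # 2048 registers, an illustrative choice
error = hll_standard_error(registers)
memory = hll_memory_bytes(registers)
print(f"{registers} registers: ~{error:.1%} error, {memory} bytes")
```

&lt;p&gt;Under these assumptions, 2048 registers give about a 2.3% standard error in 1536 bytes, which is in line with the figures quoted above. Doubling the register count doubles the memory but only shrinks the error by a factor of about 1.4.&lt;/p&gt;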
  5219.  
  5220. &lt;p&gt;At AdRoll Group, we use this data structure in our reports to measure metrics like unique visitors, engaged visitors, etc. We had implemented HLL in-house and stored it in HBase. But HBase was causing us some headaches.&lt;/p&gt;
  5221.  
  5222. &lt;h2 id=&quot;problems-with-hbase&quot;&gt;Problems with HBase&lt;/h2&gt;
  5223. &lt;p&gt;While working with HBase we encountered two major challenges:&lt;/p&gt;
  5224. &lt;ol&gt;
  5225.  &lt;li&gt;Defining the key in a way that supports a variety of queries&lt;/li&gt;
  5226.  &lt;li&gt;Having to build aggregation operations in a Thrift layer&lt;/li&gt;
  5227. &lt;/ol&gt;
  5228.  
  5229. &lt;h3 id=&quot;key-structure&quot;&gt;Key structure&lt;/h3&gt;
  5230. &lt;p&gt;As a key-value data store, HBase requires a key that uniquely identifies each record. This key is used to extract metrics and aggregations in reporting. It was always a challenge to define the key in a way that keeps record lookups efficient for different query parameters, especially compared to a relational database.&lt;/p&gt;
  5231.  
  5232. &lt;p&gt;As an example, let’s say we are storing unique visitors for a customer with details about visitors every day.&lt;/p&gt;
  5233.  
  5234. &lt;p&gt;For that we can create a table called &lt;strong&gt;Daily Unique Visitors&lt;/strong&gt; with the following columns:&lt;/p&gt;
  5235.  
  5236. &lt;ul&gt;
  5237.  &lt;li&gt;Date&lt;/li&gt;
  5238.  &lt;li&gt;Customer ID&lt;/li&gt;
  5239.  &lt;li&gt;Browser Type&lt;/li&gt;
  5240.  &lt;li&gt;Location (Country)&lt;/li&gt;
  5241.  &lt;li&gt;Unique Visitors (HLL)&lt;/li&gt;
  5242. &lt;/ul&gt;
  5243.  
  5244. &lt;p&gt;Since the &lt;em&gt;value&lt;/em&gt; we care about is only the last column and the others are its key, in an HBase Table we’ll need to define a key-structure using the first 4 columns concatenated by underscores. Something like: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Date_Customer-id_browser-type_location&lt;/code&gt;&lt;/p&gt;
  5245.  
  5246. &lt;p&gt;A key will then look like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;2019-07-01_A_Chrome_US&lt;/code&gt;, i.e. 1st July, for customer A, Chrome browser, US location (Key) =&amp;gt; Unique visitors (Value)&lt;/p&gt;
  5247.  
  5248. &lt;p&gt;This works well when we need to scan the keys in the same order as the fields in the key.
  5249. For example, it works well for getting the number of unique visitors in the month of July 2019, for a given customer, for the Chrome browser, for all locations.&lt;/p&gt;
  5250.  
  5251. &lt;p&gt;But as soon as we have different search criteria - like getting the number of unique visitors in July, for Customer A, for all browsers, in the US - the key structure is no longer effective and requires more processing to discard irrelevant keys from the scan results.&lt;/p&gt;
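&lt;p&gt;The difference between the two kinds of query can be sketched in a few lines of Python. This is only an in-memory emulation with made-up data: a real HBase scan reads a range of sorted row keys rather than filtering a list, and the helper name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;prefix_scan&lt;/code&gt; is hypothetical. To keep it small, the example looks at a single day instead of a whole month.&lt;/p&gt;

```python
from itertools import product

dates = ["2019-07-01", "2019-07-02"]
customers = ["A", "B"]
browsers = ["Chrome", "Firefox"]
locations = ["US", "DE"]

# Hypothetical row keys in the Date_Customer_Browser_Location layout.
keys = sorted("_".join(parts) for parts in product(dates, customers, browsers, locations))


def prefix_scan(sorted_keys, prefix):
    """Emulate a row-key prefix scan: keep only keys that start with the prefix."""
    return [key for key in sorted_keys if key.startswith(prefix)]


# Query 1: one day, customer A, Chrome, all locations.
# The fixed fields form a key prefix, so every scanned key is relevant.
chrome_all_locations = prefix_scan(keys, "2019-07-01_A_Chrome_")

# Query 2: one day, customer A, all browsers, US only.
# The browser sits in the middle of the key, so we scan a wider prefix
# and then discard keys for other locations.
scanned = prefix_scan(keys, "2019-07-01_A_")
all_browsers_us = [key for key in scanned if key.endswith("_US")]

print(len(chrome_all_locations), len(scanned), len(all_browsers_us))
```

&lt;p&gt;In the second query, half of the scanned keys are thrown away. With real data volumes and longer date ranges, that wasted work grows quickly.&lt;/p&gt;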
  5252.  
  5253. &lt;h3 id=&quot;thrift-layer&quot;&gt;Thrift layer&lt;/h3&gt;
  5254. &lt;p&gt;For interaction with HBase from a non-Java platform, people usually use &lt;a href=&quot;https://blog.cloudera.com/blog/2013/09/how-to-use-the-hbase-thrift-interface-part-1/&quot;&gt;Thrift&lt;/a&gt; since it provides a language-independent interface. To support our reporting needs we had to implement operations like aggregation, filtering, and ordering of HLL records efficiently in our Thrift layer. This added to development, maintenance, and resource costs. We also encountered stability issues with Thrift which required frequent restarts.&lt;/p&gt;
  5255.  
  5256. &lt;p&gt;In the earlier HBase table example, scanning, filtering, ordering, etc. were all done in the Thrift layer.
  5257. Sometimes the date ranges were so big that Thrift spawned many parallel HBase scan threads, causing resource crunches and frequent timeouts.&lt;/p&gt;
  5258.  
  5259. &lt;h2 id=&quot;hll-in-postgresql-amazon-aurora-rds&quot;&gt;HLL in PostgreSQL Amazon Aurora RDS&lt;/h2&gt;
  5260. &lt;p&gt;We started looking for alternatives and fortunately, Amazon Aurora RDS started supporting the &lt;a href=&quot;https://github.com/citusdata/postgresql-hll&quot;&gt;postgresql-hll&lt;/a&gt; extension to provide a new HLL data type. Switching to a relational database system solves a lot of our problems. If we use RDS as a data store, then we can define our schema efficiently using indexing, etc., to fulfill all our reporting needs (no more single keys). Also, we don’t have to maintain anything for our aggregation, filtering and ordering needs as RDS has that built in (no more Thrift).&lt;/p&gt;
  5261.  
  5262. &lt;p&gt;Going back to our daily unique visitors table example:
  5263. We can get results easily and efficiently using SQL:&lt;/p&gt;
  5264.  
  5265. &lt;div class=&quot;language-sql highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;hll_cardinality&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;hll_union_agg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unique_visitors&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  5266. &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;daily_unique_visitors&lt;/span&gt;
  5267. &lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;ts&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;2019-07-01&apos;&lt;/span&gt;
  5268.       &lt;span class=&quot;k&quot;&gt;AND&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ts&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;2019-08-01&apos;&lt;/span&gt;
  5269.       &lt;span class=&quot;k&quot;&gt;AND&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;customer_id&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;A&apos;&lt;/span&gt;
  5270.       &lt;span class=&quot;k&quot;&gt;AND&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;location&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;US&apos;&lt;/span&gt;
  5271. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  5272.  
  5273. &lt;p&gt;Using various relational database optimization techniques, like indexing and table partitioning, you can get results fast.&lt;/p&gt;
  5274.  
  5275. &lt;h3 id=&quot;proof-of-concept&quot;&gt;Proof-of-concept&lt;/h3&gt;
  5276. &lt;p&gt;To measure the improvements we did a proof-of-concept. We replicated 3 months of data in RDS and compared the HBase and RDS results.&lt;/p&gt;
  5277.  
  5278. &lt;h4 id=&quot;query-performance-improvements&quot;&gt;Query performance improvements&lt;/h4&gt;
  5279. &lt;p&gt;Long date range queries spanning 3 months used to take more than 60 seconds in HBase, but gave results in RDS in less than 10 seconds.&lt;/p&gt;
  5280.  
  5281. &lt;h4 id=&quot;less-management-effort&quot;&gt;Less management effort&lt;/h4&gt;
  5282. &lt;p&gt;HBase and the Thrift layer require maintenance such as installation, monitoring, troubleshooting, and development, most of which is already taken care of with Amazon Aurora since it’s a managed service.&lt;/p&gt;
  5283.  
  5284. &lt;p&gt;After evaluation, we decided to move to RDS, but it had its own challenges and limitations as explained below.&lt;/p&gt;
  5285.  
  5286. &lt;h2 id=&quot;challenges-in-migration&quot;&gt;Challenges in migration&lt;/h2&gt;
  5287. &lt;ol&gt;
  5288.  &lt;li&gt;The biggest challenge we encountered was our old data. Our HBase HLLs were not compatible with the new RDS HLL format due to the different hashing and conversion mechanisms for storage between our HBase implementation and the RDS extension. Where possible we rebuilt the data using RDS HLLs from raw source data. Where we were unable to do that, we generated HBase HLLs simultaneously with RDS HLLs on a go-forward basis. The day when RDS data was available was called the “cutoff date”. Whenever we had to query a time period requiring old data, before the cutoff date, we used the HBase HLLs. Data for date ranges after the cutoff date used RDS. After some time, we were able to retire our HBase queries completely.&lt;/li&gt;
  5289.  &lt;li&gt;Limited library support in the programming languages we use. The current PostgreSQL extension has library support in &lt;a href=&quot;https://github.com/aggregateknowledge/java-hll&quot;&gt;Java&lt;/a&gt; and &lt;a href=&quot;https://github.com/aggregateknowledge/js-hll&quot;&gt;JavaScript&lt;/a&gt; only. This works fine if you are using one of these languages in your technology stack, but otherwise, it will be a limitation. We use Python a lot. Hence, we had to put some effort into porting these libraries to Python. We tackled that project in our most recent HackWeek.&lt;/li&gt;
  5290. &lt;/ol&gt;
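&lt;p&gt;The cutoff-date rule from the first item can be sketched as a tiny routing function. This is an illustration rather than our production code: the cutoff value, the store labels and the function name are all hypothetical, and it assumes the cutoff day itself is already served from RDS.&lt;/p&gt;

```python
from datetime import date

# Hypothetical cutoff: the first day for which RDS HLLs exist.
CUTOFF_DATE = date(2019, 6, 1)


def pick_store(range_start, cutoff=CUTOFF_DATE):
    """Return which store should serve a report whose date range starts on range_start."""
    # A range that reaches back before the cutoff needs old data, so it stays
    # on HBase; the two HLL formats are incompatible and cannot be merged.
    return "rds" if range_start >= cutoff else "hbase"


print(pick_store(date(2019, 5, 15)))  # range needs pre-cutoff data
print(pick_store(date(2019, 7, 1)))   # range lies entirely after the cutoff
```

&lt;p&gt;Routing on the start of the range, instead of splitting a range across both stores, avoids ever having to combine HLLs built with different hashing schemes.&lt;/p&gt;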
  5291.  
  5292. &lt;h2 id=&quot;python-hll-postgresql-hll-extension-python-library&quot;&gt;Python-hll: PostgreSQL-hll extension Python library&lt;/h2&gt;
  5293. &lt;p&gt;We created a Python library to read, write, count and do operations like the union of PostgreSQL compatible HLLs. This is similar to the &lt;a href=&quot;https://github.com/aggregateknowledge/java-hll&quot;&gt;Java library&lt;/a&gt;. We plan to open-source it soon – stay tuned.&lt;/p&gt;
  5294.  
  5295. &lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
  5296. &lt;p&gt;Migrating from HBase to PostgreSQL HLL on Amazon Aurora was a big win for us. We found the PostgreSQL HLL extension much better suited to storing our HLL reporting metrics. Also, the benefits that stem from using a relational database with support for HLL, like Amazon Aurora, let us retrieve and present this information efficiently for our reporting needs. We thoroughly recommend that you check it out.&lt;/p&gt;
  5297.  
  5298. &lt;hr /&gt;
  5299.  
  5300. </description>
  5301.    </item>
  5302.    
  5303.    
  5304.    
  5305.    <item>
  5306.      <title>
  5307. AdRoll Group and The Erlang Ecosystem Foundation
  5308. </title>
  5309.      <link>https://tech.nextroll.com/blog/culture/2019/07/31/erlang-ecosystem-foundation.html</link>
  5310.      <pubDate>Wed, 31 Jul 2019 00:00:00 -0700</pubDate>
  5311.      <author></author>
  5312.      <guid isPermaLink="false">https://tech.nextroll.com/blog/culture/2019/07/31/erlang-ecosystem-foundation</guid>
  5313.      <description>&lt;p&gt;We recently decided to sponsor the Erlang Ecosystem Foundation. We believe this is a very important initiative and we’re trying to contribute to it as much as we can. Learn more about it, the reasons behind our sponsorship, and the other ways in which we’re contributing in this article.&lt;/p&gt;
  5314.  
  5315. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;5-7 minute read&lt;/code&gt;&lt;/p&gt;
  5316.  
  5317. &lt;hr /&gt;
  5318.  
  5319. &lt;h1 id=&quot;whats-the-eef&quot;&gt;What’s The EEF?&lt;/h1&gt;
  5320. &lt;p&gt;Paraphrasing from &lt;a href=&quot;http://erlef.org/&quot;&gt;their own website&lt;/a&gt;…&lt;/p&gt;
  5321. &lt;blockquote&gt;
  5322.  &lt;p&gt;&lt;strong&gt;The Erlang Ecosystem Foundation&lt;/strong&gt; is a new non-profit organization dedicated to furthering the state of the art for &lt;a href=&quot;https://erlang.org/&quot;&gt;Erlang&lt;/a&gt;, &lt;a href=&quot;https://elixir-lang.org/&quot;&gt;Elixir&lt;/a&gt;, &lt;a href=&quot;http://lfe.io/&quot;&gt;LFE&lt;/a&gt;, and other technologies based on the BEAM. Their goal is to increase the adoption of this sophisticated platform among forward-thinking organizations. With member-supported Working Groups actively contributing to libraries, tools, and documentation used regularly by individuals and companies relying on the stability and versatility of the ecosystem, they actively invest in critical pieces of technical infrastructure to support their users in their efforts to build the next generation of advanced, reliable, realtime applications.&lt;/p&gt;
  5323. &lt;/blockquote&gt;
  5324.  
  5325. &lt;p&gt;The community behind Erlang, Elixir, etc. is not new. Over its many years it has gone through multiple diverse phases and has grown with help from many independent contributors. One of the goals behind the EEF is to unite these independent contributions, give them the support that they deserve and help everyone involved to get the best from them.&lt;/p&gt;
  5326.  
  5327. &lt;p&gt;These are the very first days of the Foundation, but they already have many working groups covering a variety of very interesting ideas:&lt;/p&gt;
  5328.  
  5329. &lt;ul&gt;
  5330.  &lt;li&gt;&lt;strong&gt;Building and Packaging:&lt;/strong&gt; Their mission is to evolve the tools in the ecosystem related to building and deploying code, with a strong focus on interoperability between BEAM languages.&lt;/li&gt;
  5331.  &lt;li&gt;&lt;strong&gt;Observability:&lt;/strong&gt; Their goal is to evolve the tools in the ecosystem related to observability, such as metrics, distributed tracing and logging, with a strong focus on interoperability between BEAM languages.&lt;/li&gt;
  5332.  &lt;li&gt;&lt;strong&gt;Security:&lt;/strong&gt; They will work on identifying security issues, and providing solutions, developing guidance, standards, technical mechanisms and documentation.&lt;/li&gt;
  5333.  &lt;li&gt;&lt;strong&gt;Education, Training &amp;amp; Adoption:&lt;/strong&gt; The objective of this working group is to facilitate and evolve education and training, and to consolidate educational materials for all BEAM languages and the BEAM itself.&lt;/li&gt;
  5334.  &lt;li&gt;&lt;strong&gt;Documentation:&lt;/strong&gt; They will work to improve the accessibility, interoperability and quality of the documentation across projects and languages in the Erlang Ecosystem.&lt;/li&gt;
  5335.  &lt;li&gt;&lt;strong&gt;Fellowship:&lt;/strong&gt; This is a group created to formally nominate community members for a fellowship role according to the Erlang Ecosystem Foundation bylaws.&lt;/li&gt;
  5336.  &lt;li&gt;&lt;strong&gt;Sponsorship:&lt;/strong&gt; This group’s function is to approve sponsors’ candidacies.&lt;/li&gt;
  5337.  &lt;li&gt;&lt;strong&gt;Marketing:&lt;/strong&gt; To expand awareness of Erlang Ecosystem and participation in its community. To promote the Erlang Ecosystem Foundation and its activities, and to increase engagement in the foundation.&lt;/li&gt;
  5338. &lt;/ul&gt;
  5339.  
  5340. &lt;h1 id=&quot;why-do-we-care&quot;&gt;Why do we care?&lt;/h1&gt;
  5341. &lt;p&gt;As you might have seen in some previous entries in this blog, some of our core systems (like our &lt;a href=&quot;/blog/web/2014/04/29/valentino-presents-adrolls-rtb-infrastructure.html&quot;&gt;real-time bidding platform&lt;/a&gt;) are fully powered by Erlang. And while we have a team of amazing developers, we don’t build everything from scratch. We invest deeply in and depend heavily on the open-source libraries and tools that power our infrastructure. Maintaining and improving those projects is a big challenge for their contributors, many of whom are not backed by any company, big or small. We believe that the EEF can help a lot in this area.&lt;/p&gt;
  5342.  
  5343. &lt;p&gt;On a related subject, we know that these languages, while not esoteric, are not as popular as many others. Keeping a reasonable supply of developers who are capable of and willing to satisfy the increasing demand is not easy. We believe that the foundation will have a profound impact on this area.
  5344. We are also always looking for our devs to grow professionally. With the help of the different working groups within the EEF, we believe that this will be much easier. The Foundation will allow our devs to learn new skills, use better tools, standardize their best practices, contribute to the community and more.&lt;/p&gt;
  5345.  
  5346. &lt;p&gt;In general, we believe that the EEF will bring a very positive influx of energy to an already energetic and growing ecosystem of which we’re a part.&lt;/p&gt;
  5347.  
  5348. &lt;h1 id=&quot;contributing-with-the-eef&quot;&gt;Contributing with the EEF&lt;/h1&gt;
  5349. &lt;p&gt;AdRoll is a founding member of the EEF. We have supported it since its inception through the Erlang Industrial User Group.&lt;/p&gt;
  5350.  
  5351. &lt;p&gt;&lt;a href=&quot;https://twitter.com/miriampena&quot;&gt;Miriam Pena&lt;/a&gt;, staff engineer at AdRoll, helped define the bylaws and working groups, worked on the paperwork and marketing, presented the foundation at various venues, and much more. Miriam is currently a board member of the EEF and she also participates in the sponsorship, marketing, fellowship and education working groups.&lt;/p&gt;
  5352.  
  5353. &lt;p&gt;Besides her, some of our most talented engineers are also members of various working groups. For instance, &lt;a href=&quot;https://about.me/elbrujohalcon&quot;&gt;Brujo Benavides&lt;/a&gt; is a member of the &lt;a href=&quot;https://erlef.org/education-training-adoption/&quot;&gt;Education, Training &amp;amp; Adoption&lt;/a&gt; group.&lt;/p&gt;
  5354.  
  5355. &lt;p&gt;The EEF has a fairly simple membership program that includes 3 options:&lt;/p&gt;
  5356.  
  5357. &lt;ul&gt;
  5358.  &lt;li&gt;&lt;strong&gt;Basic Membership:&lt;/strong&gt; You’re entitled to attend meetings but not to vote on any resolutions.&lt;/li&gt;
  5359.  &lt;li&gt;&lt;strong&gt;Annual Supporting Membership:&lt;/strong&gt; Supporting Members may attend meetings and cast votes on resolutions. Your commitment also supports Foundation activities, such as Working Groups and community events.&lt;/li&gt;
  5360.  &lt;li&gt;&lt;strong&gt;Lifetime Supporting Membership:&lt;/strong&gt; All the benefits of an Annual Supporting Membership at a reduced rate!  This level demonstrates your alignment with our vision of the Erlang Ecosystem as the primary technology of the future.&lt;/li&gt;
  5361. &lt;/ul&gt;
  5362.  
  5363. &lt;p&gt;You can become a member by signing up at &lt;a href=&quot;https://members.erlef.org/join-us&quot;&gt;their website&lt;/a&gt;.&lt;/p&gt;
  5364.  
  5365. &lt;p&gt;This initiative can be an amazing thing for all the people that love these languages and this wonderful community. But that will only happen if we all take part in it.&lt;/p&gt;
  5366.  
  5367. &lt;p&gt;So, what are you waiting for? &lt;a href=&quot;https://members.erlef.org/join-us&quot;&gt;&lt;strong&gt;Become a member today!&lt;/strong&gt;&lt;/a&gt;&lt;/p&gt;
  5368.  
  5369. &lt;hr /&gt;
  5370.  
  5371. </description>
  5372.    </item>
  5373.    
  5374.    
  5375.    
  5376.    <item>
  5377.      <title>Series: Tech Women of AdRoll Group</title>
  5378.      <link>https://tech.nextroll.com/blog/culture/2019/03/08/tech-women-of-adroll-group-part-4.html</link>
  5379.      <pubDate>Fri, 08 Mar 2019 00:00:00 -0800</pubDate>
  5380.      <author></author>
  5381.      <guid isPermaLink="false">https://tech.nextroll.com/blog/culture/2019/03/08/tech-women-of-adroll-group-part-4</guid>
  5382.      <description>&lt;p&gt;In celebration of International Women’s Day 2019 and National Women’s History Month, we’re continuing our tradition of highlighting Tech Women at AdRoll Group.&lt;/p&gt;
  5383.  
  5384. &lt;p&gt;I love reading biographies, but not for the reason you might be thinking. You see, I love origin stories. No two people take the same path through life and it’s those different experiences that shape how we approach our work. That’s why I jumped at the chance to interview several members of Tech Women at AdRoll Group. I had so much fun learning more about my coworkers and am happy to share what I learned with you.&lt;/p&gt;
  5385.  
  5386. &lt;h1 id=&quot;monica-lloyd&quot;&gt;Monica Lloyd&lt;/h1&gt;
  5387. &lt;p&gt;&lt;strong&gt;Business Intelligence Engineer&lt;/strong&gt;&lt;/p&gt;
  5388.  
  5389. &lt;p&gt;&lt;img src=&quot;/images/post_images/tech-women-monica-lloyd.jpg&quot; alt=&quot;Monica Lloyd&quot; /&gt;&lt;/p&gt;
  5390.  
  5391. &lt;p&gt;From a young age, Monica wanted to be a journalist. She was inspired by her father, who had an impressive career as a combat cameraman in Vietnam and later worked for CBS News and 60 Minutes. After graduating from UC Berkeley with a degree in political science, she joined her father’s documentary film production company. She and her father made several films and participated in film festivals.&lt;/p&gt;
  5392.  
  5393. &lt;p&gt;As her father was preparing to retire, Monica looked for new opportunities and challenges. By chance, she was connected with Aaron Bell, founder of AdRoll Group, who was then just building out the company. She was hired to help manage the expansion of the AdRoll Group office. In true startup fashion, Monica also worked the front desk in addition to her other responsibilities. As AdRoll grew, so did Monica’s role, evolving into larger project management duties for the sales team. Through her work with the sales team, she saw a need for and moved her career in the direction of analytical support. Today, Monica works on the Analytics team, providing insights that drive decision making throughout AdRoll Group.&lt;/p&gt;
  5394.  
  5395. &lt;p&gt;&lt;strong&gt;Advice from Monica:&lt;/strong&gt;&lt;/p&gt;
  5396.  
  5397. &lt;p&gt;Monica encourages anyone interested in STEM careers to reach out and find a support system. She notes that she was able to find good mentors that went above and beyond their day jobs to provide guidance and support. Prior to that, Monica wasn’t aware of the variety of STEM careers available. Today, Monica participates in Tech Women at AdRoll Group to help others learn and grow in their tech careers.&lt;/p&gt;
  5398.  
  5399. &lt;h1 id=&quot;farah-khan&quot;&gt;Farah Khan&lt;/h1&gt;
  5400. &lt;p&gt;&lt;strong&gt;Senior Solutions Engineer&lt;/strong&gt;&lt;/p&gt;
  5401.  
  5402. &lt;p&gt;&lt;img src=&quot;/images/post_images/tech-women-farah-khan.jpg&quot; alt=&quot;Farah Khan&quot; /&gt;&lt;/p&gt;
  5403.  
  5404. &lt;p&gt;When she was a kid, Farah wanted to be President of the United States. She would often ask her friends if they’d vote for her (they said yes!). As she grew up, her interests turned toward a career at the FBI. At the University of Maryland, she earned a degree in criminology, with a minor in Arabic. But it wasn’t until she took an Oracle DBA class her senior year that she considered a STEM career.&lt;/p&gt;
  5405.  
  5406. &lt;p&gt;That class on databases ended up being a key moment in her career. After writing her first SQL statement, she was hooked. After graduating from UMD, she worked as a QA engineer and loved the challenge of breaking software. Farah went on to earn a master’s degree in systems engineering. While working on her master’s degree, she also started working as a systems engineer at Dell EMC, leveraging her interpersonal skills with a systems thinking approach to onboard clients and explain the value of internal tech.&lt;/p&gt;
  5407.  
  5408. &lt;p&gt;After Dell EMC, Farah started her own clothing line out of a tiny apartment in NYC, and knew she needed to reach more customers to help them find modest fashion wear. Not knowing anything about marketing, she did some googling and found AdRoll. She wanted to combine her SE background with newfound knowledge about small business marketing challenges, helping other clients on their journeys.&lt;/p&gt;
  5409.  
  5410. &lt;p&gt;Farah now works as a senior solutions engineer at AdRoll Group. At AdRoll, she appreciates not only having mentors but also sponsors, those in power who have advocated for her and given her the chance to try new things. All ideas are entertained, and the resources to execute those ideas are made available.&lt;/p&gt;
  5411.  
  5412. &lt;p&gt;Farah says that solving challenging problems is the most interesting part of her STEM career. “These tough problems keep engineers humble,” she says, “and these problems can range from building scalable processes to serve customers more efficiently to making internal tools more accessible to end users.”&lt;/p&gt;
  5413.  
  5414. &lt;p&gt;&lt;strong&gt;Advice from Farah:&lt;/strong&gt;&lt;/p&gt;
  5415.  
  5416. &lt;p&gt;To those interested in STEM careers as engineers, she recommends embracing the imposter syndrome: it means you’re trying new things and growing. A few suggestions from Farah: avoid “analysis paralysis,” go out and do things, and don’t be afraid to get started even if it’s small.&lt;/p&gt;
  5417.  
  5418. &lt;h1 id=&quot;sadie-wilhelm&quot;&gt;Sadie Wilhelm&lt;/h1&gt;
  5419. &lt;p&gt;&lt;strong&gt;Senior Software Engineer&lt;/strong&gt;&lt;/p&gt;
  5420.  
  5421. &lt;p&gt;&lt;img src=&quot;/images/post_images/tech-women-sadie-wilhelm.jpg&quot; alt=&quot;Sadie Wilhelm&quot; /&gt;&lt;/p&gt;
  5422.  
  5423. &lt;p&gt;When she was growing up, Sadie always wanted to be a math teacher. She loved math and was encouraged by her mother, a CPA, to pursue a STEM career. Along the way, she had fantastic teachers and not so fantastic teachers. She saw the difference a great math teacher could make to young minds. She went on to become a math teacher, then ran a tutoring company for several years.&lt;/p&gt;
  5424.  
  5425. &lt;p&gt;Sadie loved working with her students but yearned for larger challenges. Her then boyfriend, now husband, encouraged her to try some online coding courses. It was an instant hit: Sadie knew she would figure out how to be an engineer. Sadie began by learning SQL and was eventually hired as part of the Business Intelligence team at AdRoll Group. Once on the team, she began to automate parts of her job with Python. Wanting to go further, Sadie took a three-month hiatus from work to attend the Hackbright Academy coding boot camp. After Hackbright, Sadie returned to AdRoll Group as an engineer for several years before leaving to try out another company. Nine months later, Sadie came back to work as an engineer on other systems.&lt;/p&gt;
  5426.  
  5427. &lt;p&gt;As a working mother, Sadie has found her experience at AdRoll Group to be better than those of her peers at other companies. After returning from maternity leave, Sadie took advantage of flexible hours and was encouraged to leave early for daycare pickup. The message was, “We trust you,” Sadie says. She continues, “AdRoll Group trusted me to get my work done, and in turn, I trusted AdRoll Group to support me while I took care of my family and did my job well.”&lt;/p&gt;
  5428.  
  5429. &lt;p&gt;&lt;strong&gt;Advice from Sadie:&lt;/strong&gt;&lt;/p&gt;
  5430.  
  5431. &lt;p&gt;Sadie’s favorite part of her engineering career is that she’s never bored. Working on a variety of projects using a plethora of new technologies keeps her engaged. For those interested in STEM careers, she suggests embracing failure instead of fearing it. Her mother encouraged her to study engineering in college, but the fear of failure kept her from pursuing it. Sadie faces failure every single day and says that it’s from failure that she learns, improves, and grows.&lt;/p&gt;
  5432.  
  5433. &lt;p&gt;Want more profiles of Tech Women? Read our &lt;a href=&quot;http://tech.adroll.com/blog/culture/2018/03/22/tech-women-of-adroll-group-part-3.html&quot;&gt;previous entry&lt;/a&gt;.&lt;/p&gt;
  5434.  
  5435. &lt;p&gt;&lt;strong&gt;Interested in working with these and other talented women? &lt;a href=&quot;https://www.adroll.com/about/careers/&quot;&gt;Roll with Us&lt;/a&gt;!&lt;/strong&gt;&lt;/p&gt;
  5436. </description>
  5437.    </item>
  5438.    
  5439.    
  5440.    
  5441.    <item>
  5442.      <title>
  5443. Spot The Discrepancies with Dialyzer for Erlang
  5444. </title>
  5445.      <link>https://tech.nextroll.com/blog/dev/2019/02/19/erlang-dialyzer.html</link>
  5446.      <pubDate>Tue, 19 Feb 2019 00:00:00 -0800</pubDate>
  5447.      <author></author>
  5448.      <guid isPermaLink="false">https://tech.nextroll.com/blog/dev/2019/02/19/erlang-dialyzer</guid>
  5449.      <description>&lt;p&gt;Dialyzer is a great tool to validate Erlang code, but it might slow down your development process if devs are constantly applying it to huge codebases, particularly if that code was never analyzed with it.&lt;/p&gt;
  5450.  
  5451. &lt;p&gt;This article is our answer to the &lt;em&gt;big question&lt;/em&gt;: &lt;strong&gt;How do you start using Dialyzer in a huge project where it was never applied before?&lt;/strong&gt;&lt;/p&gt;
  5452.  
  5453. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;10-15 minute read&lt;/code&gt;&lt;/p&gt;
  5454.  
  5455. &lt;hr /&gt;
  5456.  
  5457. &lt;p&gt;Continuing with our series of articles about the usage of &lt;a href=&quot;http://www.erlang.org/&quot;&gt;Erlang/OTP&lt;/a&gt; to build our &lt;a href=&quot;/blog/web/2014/04/29/valentino-presents-adrolls-rtb-infrastructure.html&quot;&gt;real-time bidding platform&lt;/a&gt;, we would like to show you now how we added &lt;a href=&quot;http://erlang.org/doc/man/dialyzer.html&quot;&gt;Dialyzer&lt;/a&gt; to our CI pipelines.&lt;/p&gt;
  5458.  
  5459. &lt;p&gt;The main Erlang application for our Real-Time Bidding servers was created &lt;em&gt;way before&lt;/em&gt; &lt;a href=&quot;https://rebar3.org&quot;&gt;rebar3&lt;/a&gt; existed. Performing the task equivalent to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3 dialyzer&lt;/code&gt; was not easy and it was also very time consuming.&lt;/p&gt;
  5460.  
  5461. &lt;p&gt;We recently started using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3&lt;/code&gt; to manage our projects and, as part of that process, we decided to finally deal with that bit of technical debt.&lt;/p&gt;
  5462.  
  5463. &lt;p&gt;Of course, it was not as easy as &lt;em&gt;just run dialyzer on the code and remove all the warnings&lt;/em&gt;. When we started, we had approximately &lt;strong&gt;1800&lt;/strong&gt; warnings to deal with, between our main repo and its dependencies. So, we tackled them &lt;em&gt;incrementally&lt;/em&gt;. Let me walk you through our process…&lt;/p&gt;
  5464.  
  5465. &lt;h1 id=&quot;introduction&quot;&gt;Introduction&lt;/h1&gt;
  5466. &lt;p&gt;Before we begin, let’s talk a little bit more about Dialyzer and why it’s so important to use it.&lt;/p&gt;
  5467.  
  5468. &lt;h3 id=&quot;what-is-dialyzer&quot;&gt;What is Dialyzer?&lt;/h3&gt;
  5469. &lt;p&gt;Dialyzer is a &lt;em&gt;discrepancy analyzer&lt;/em&gt; for Erlang/Elixir code. It checks your applications to find &lt;em&gt;discrepancies&lt;/em&gt; such as definite type errors, code that has become dead or unreachable because of a programming error, and unnecessary tests, among other things.&lt;/p&gt;
  5470.  
  5471. &lt;h3 id=&quot;how-to-run-dialyzer&quot;&gt;How to run Dialyzer?&lt;/h3&gt;
  5472. &lt;p&gt;Dialyzer can be run from the command line (it’s one of the several tools that are shipped with Erlang/OTP) but these days it’s far more common to run it with rebar3, i.e. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3 dialyzer&lt;/code&gt;.&lt;/p&gt;
  5473.  
  5474. &lt;h3 id=&quot;why-you-should-use-dialyzer&quot;&gt;Why should you use Dialyzer?&lt;/h3&gt;
  5475. &lt;p&gt;Dialyzer will not warn you about all your errors, but if dialyzer emits a warning about something in your code, you can be sure that there is a bug there. You’ll see more on that bug finding stuff in the paragraphs below.&lt;/p&gt;
  5476.  
  5477. &lt;p&gt;And now… let’s go back to our story!&lt;/p&gt;
  5478.  
  5479. &lt;h1 id=&quot;our-goal&quot;&gt;Our Goal&lt;/h1&gt;
  5480. &lt;blockquote&gt;
  5481.  &lt;p&gt;To be able to consistently reduce the number of discrepancies over time without altering our development speed.&lt;/p&gt;
  5482. &lt;/blockquote&gt;
  5483.  
  5484. &lt;h1 id=&quot;metrics&quot;&gt;Metrics&lt;/h1&gt;
  5485. &lt;p&gt;With that goal in mind, we established a plan of attack and, since you can’t improve what you can’t measure, our first step was to &lt;em&gt;instrument the number of dialyzer warnings&lt;/em&gt; so we could keep an eye on it and, hopefully, watch it go down to zero eventually.&lt;/p&gt;
  5486.  
  5487. &lt;p&gt;We use &lt;a href=&quot;https://www.datadoghq.com/&quot;&gt;Datadog&lt;/a&gt; for our real-time metrics. In what may be considered a severe misuse of this tool, we decided to write a simple bash script using the datadog agent to report the number of warnings found by dialyzer. The idea was to provide a nice way to visualize our progress relative to our goal stated above. But an important question popped up: how do we find the number of warnings?&lt;/p&gt;
  5488.  
  5489. &lt;h4 id=&quot;the-warnings-file&quot;&gt;The Warnings File&lt;/h4&gt;
  5490. &lt;p&gt;Luckily for us &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3 dialyzer&lt;/code&gt; generates a file with the list of warnings, called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;_build/${REBAR_PROFILE}/${OTP-VERSION}.dialyzer_warnings&lt;/code&gt; that looks like this:&lt;/p&gt;
  5491.  
  5492. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;your_module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;erl&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;371&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;The&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;call&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;your_other_module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;your_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
  5493.  
  5494. &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;your_module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;erl&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;470&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;The&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;call&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;another_one&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;thing&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::[&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;any&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),...&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  5495.  
  5496. &lt;p&gt;So, this is what our instrumentation script looks like:&lt;/p&gt;
  5497.  
  5498. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;c&quot;&gt;#!/bin/bash&lt;/span&gt;
  5499. &lt;span class=&quot;nv&quot;&gt;WARNINGS&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;code_quality.dialyzer.discrepancies:&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;sed&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;/^$/d&apos;&lt;/span&gt; _build/default/&lt;span class=&quot;k&quot;&gt;*&lt;/span&gt;.dialyzer_warnings | &lt;span class=&quot;nb&quot;&gt;wc&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-l&lt;/span&gt; | &lt;span class=&quot;nb&quot;&gt;tr&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-d&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;[:space:]&apos;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;|g|#shell&quot;&lt;/span&gt;
  5500. &lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;WARNINGS&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt; | nc &lt;span class=&quot;nt&quot;&gt;-4u&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-w0&lt;/span&gt; localhost 8125&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  5501.  
  5502. &lt;p&gt;Let’s read that backwards: the script reports the number of dialyzer discrepancies by echoing the contents of the shell variable &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;WARNINGS&lt;/code&gt; to the datadog agent listening on UDP port 8125 on our machine. The datadog agent in turn ships it over to datadog servers. The contents of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;WARNINGS&lt;/code&gt; variable are simply the name of the metric (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;code_quality.dialyzer.discrepancies&lt;/code&gt;), followed by the count, the metric type (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;g&lt;/code&gt; for gauge) and a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;shell&lt;/code&gt; tag.
  5503. The count is determined by reading the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.dialyzer_warnings&lt;/code&gt; file, using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sed&lt;/code&gt; to remove empty lines, piping that to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;wc&lt;/code&gt; to count lines and stripping the whitespace from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;wc&lt;/code&gt;’s output with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tr&lt;/code&gt;.&lt;/p&gt;
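
&lt;p&gt;For readers who prefer to see the same steps outside of a shell pipeline, here is a minimal Python sketch. It is only an illustration, not the script we run: the function names are hypothetical, and only the metric name, the gauge type, the tag and the port come from the script above.&lt;/p&gt;

```python
import socket

# Sketch only: mirrors the sed | wc | tr pipeline and the payload sent with nc.
# Function names are hypothetical; metric name, type, tag and port come from
# the bash script in the post.


def count_warnings(text):
    """Count non-empty lines, like sed '/^$/d' piped into wc -l.

    wc -l counts newline characters, so both agree when the file ends
    with a newline.
    """
    return sum(1 for line in text.splitlines() if line != "")


def gauge_datagram(count):
    """Format the count as a gauge datagram tagged with 'shell'."""
    return "code_quality.dialyzer.discrepancies:{}|g|#shell".format(count)


def send_gauge(payload, host="localhost", port=8125):
    """Send the datagram over UDP/IPv4, as nc -4u does in the script."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload.encode("ascii"), (host, port))


sample = "a.erl:371: The call ...\n\nb.erl:470: The call ...\n"
print(gauge_datagram(count_warnings(sample)))
```

&lt;p&gt;The sketch only prints the datagram. Calling the hypothetical &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;send_gauge&lt;/code&gt; with that string would ship it to a local agent, assuming one is listening.&lt;/p&gt;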
  5504.  
  5505. &lt;h4 id=&quot;the-number-of-warnings&quot;&gt;The Number of Warnings&lt;/h4&gt;
  5506. &lt;p&gt;We added that to our Makefile target for dialyzer, as follows:&lt;/p&gt;
  5507.  
  5508. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-makefile&quot; data-lang=&quot;makefile&quot;&gt;&lt;span class=&quot;nl&quot;&gt;dialyzer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;##&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; Check the project with dialyzer&lt;/span&gt;
  5509.    &lt;span class=&quot;err&quot;&gt;@$(REBAR3)&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;dialyzer&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;./scripts/instrument_dialyzer.sh&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  5510.  
  5511. &lt;p&gt;But then, when we started watching that metric, we noticed something odd…&lt;/p&gt;
  5512.  
  5513. &lt;p&gt;&lt;img src=&quot;/images/post_images/dialyzer/datadog_spikes.png&quot; alt=&quot;Datadog Spikes&quot; /&gt;&lt;/p&gt;
  5514.  
  5515. &lt;p&gt;The numbers seemed right in general, but what about those odd spikes every now and then? Turns out, they were generated when running dialyzer for the first time (i.e. on a clean clone of the project). That’s because, when &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3 dialyzer&lt;/code&gt; runs for the first time for a project, it generates the &lt;a href=&quot;http://erlang.org/doc/apps/dialyzer/dialyzer_chapter.html#the-persistent-lookup-table&quot;&gt;persistent lookup table (PLT)&lt;/a&gt; including the project dependencies and those dependencies generate some dialyzer warnings of their own. Those warnings are not generated again once you already have a PLT, so to get the actual number of discrepancies we want for our main project, we need to run &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3 dialyzer&lt;/code&gt; once the PLT is already generated. That’s why our Makefile target actually looks like the following.&lt;/p&gt;
  5516.  
  5517. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-makefile&quot; data-lang=&quot;makefile&quot;&gt;&lt;span class=&quot;nl&quot;&gt;dialyzer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;##&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; Check the project with dialyzer&lt;/span&gt;
  5518.    &lt;span class=&quot;err&quot;&gt;@$(REBAR3)&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;dialyzer&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;$(REBAR3)&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;dialyzer&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;./scripts/instrument_dialyzer.sh&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  5519.  
  5520. &lt;h1 id=&quot;organizing-the-work&quot;&gt;Organizing the Work&lt;/h1&gt;
  5521. &lt;p&gt;With that metric in place, we wanted to see it go down. To achieve that, our first idea was to write tickets to fix the warnings. But removing all 1800 warnings together was, of course, impossible. Can you imagine reviewing such a massive pull request?
  5522. That’s why, using the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dialyzer_warnings&lt;/code&gt; file again, we decided to turn it into tickets. We use &lt;a href=&quot;https://www.atlassian.com/software/jira&quot;&gt;JIRA&lt;/a&gt; to organize our work, and it comes with a handy &lt;a href=&quot;https://marketplace.atlassian.com/apps/6398/jira-command-line-interface-cli&quot;&gt;CLI&lt;/a&gt; that we used to write an escript that groups warnings and creates a ticket for each module. I won’t paste the whole script here, but the &lt;em&gt;meaty&lt;/em&gt; part is this:&lt;/p&gt;
  5523.  
  5524. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;nf&quot;&gt;build_command&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Mod&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  5525.    &lt;span class=&quot;nb&quot;&gt;binary_to_list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;iolist_to_binary&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;
  5526.        &lt;span class=&quot;s&quot;&gt;&quot;jira create &quot;&lt;/span&gt;
  5527.        &lt;span class=&quot;s&quot;&gt;&quot;--noedit &quot;&lt;/span&gt;
  5528.        &lt;span class=&quot;s&quot;&gt;&quot;--override=&apos;components:boodah&apos; &quot;&lt;/span&gt;
  5529.        &lt;span class=&quot;s&quot;&gt;&quot;--override=&apos;summary: There are &quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  5530.        &lt;span class=&quot;nb&quot;&gt;integer_to_list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)),&lt;/span&gt;
  5531.        &lt;span class=&quot;s&quot;&gt;&quot; dialyzer warnings on &quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Mod&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;&apos; &quot;&lt;/span&gt;
  5532.        &lt;span class=&quot;s&quot;&gt;&quot;--override=&apos;description: The warnings reported&quot;&lt;/span&gt;
  5533.        &lt;span class=&quot;s&quot;&gt;&quot; at the time of this writing are:&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;
  5534.        &lt;span class=&quot;s&quot;&gt;&quot;  {code:Erlang} &lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;  {code}&apos;&quot;&lt;/span&gt;
  5535.    &lt;span class=&quot;p&quot;&gt;])).&lt;/span&gt;
  5536.  
  5537. &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([])&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  5538.    &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
  5539.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Output&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;read_file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;WarningFile&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  5540.  
  5541.    &lt;span class=&quot;nv&quot;&gt;Warnings&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
  5542.        &lt;span class=&quot;nf&quot;&gt;group_by_mod&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  5543.            &lt;span class=&quot;nn&quot;&gt;lists&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;filtermap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  5544.                &lt;span class=&quot;k&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;parse_line&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  5545.                    &lt;span class=&quot;nn&quot;&gt;binary&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;split&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Output&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;global&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;trim&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  5546.                &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  5547.            &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  5548.        &lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  5549.  
  5550.    &lt;span class=&quot;nn&quot;&gt;lists&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;foreach&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  5551.        &lt;span class=&quot;k&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;run_command&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  5552.        &lt;span class=&quot;nn&quot;&gt;lists&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;build_command&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Warnings&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)),&lt;/span&gt;
  5553.  
  5554.    &lt;span class=&quot;nn&quot;&gt;io&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;DONE!&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
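
&lt;p&gt;The escript excerpt above leaves out &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;parse_line/1&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;group_by_mod/1&lt;/code&gt;. As a rough idea of what such helpers could do, here is a Python sketch. It is an assumption, not a translation of our script: it presumes every warning sits on one line shaped like the sample shown earlier (path, line number, message), and it names the module after the file.&lt;/p&gt;

```python
import os
from collections import defaultdict

# Hypothetical stand-ins for the helpers elided from the escript excerpt.
# Assumes one warning per line, formatted as 'path/to/mod.erl:LINE: message'.


def parse_line(line):
    """Return (module, message) for a warning line, or None otherwise."""
    path, sep, rest = line.partition(":")
    if sep == "" or not path.endswith(".erl"):
        return None
    module = os.path.basename(path)[: -len(".erl")]
    return module, rest


def group_by_mod(lines):
    """Map each module name to the list of its warning messages."""
    grouped = defaultdict(list)
    for parsed in filter(None, map(parse_line, lines)):
        module, message = parsed
        grouped[module].append(message)
    return dict(grouped)


sample = [
    "/path/to/your_module.erl:371: The call a:b() ...",
    "",
    "/path/to/your_module.erl:470: The call c:d() ...",
    "/path/to/other.erl:12: Function e/0 has no local return",
]
for module, messages in sorted(group_by_mod(sample).items()):
    print("There are {} dialyzer warnings on {}".format(len(messages), module))
```

&lt;p&gt;Each entry of the resulting mapping corresponds to one ticket: the module goes into the summary and its messages into the description.&lt;/p&gt;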
  5555.  
  5556. &lt;h1 id=&quot;keeping-discrepancies-at-bay&quot;&gt;Keeping Discrepancies at Bay&lt;/h1&gt;
  5557.  
  5558. &lt;p&gt;Of course, we didn’t stop the development process to put everyone to work on the tickets generated with that script.&lt;/p&gt;
  5559.  
  5560. &lt;p&gt;Well, actually…&lt;/p&gt;
  5561.  
  5562. &lt;h4 id=&quot;hackweek&quot;&gt;HackWeek!&lt;/h4&gt;
  5563.  
  5564. &lt;p&gt;One of the things that all Rollers enjoy about working here is unquestionably the &lt;a href=&quot;https://blog.adroll.com/news/adroll-named-sf-business-times-best-places-to-work-for-the-fourth-time&quot;&gt;HackWeeks&lt;/a&gt;. And during the last one of 2018, a team of 4 developers decided to &lt;em&gt;remove as many dialyzer warnings as possible&lt;/em&gt; from our code.&lt;/p&gt;
  5565.  
  5566. &lt;p&gt;And removing warnings… we did! From the original 1800 we went down to a whopping &lt;strong&gt;300&lt;/strong&gt; without affecting performance or functionality in the slightest!!&lt;/p&gt;
  5567.  
  5568. &lt;p&gt;Oh, and in that process, we eliminated tons of dead code and fixed &lt;strong&gt;11&lt;/strong&gt; bugs that were still undetected by our tests!!&lt;/p&gt;
  5569.  
  5570. &lt;p&gt;But after that week, we wanted to ensure that we didn’t start adding new discrepancies as we moved on.&lt;/p&gt;
  5571.  
  5572. &lt;h4 id=&quot;ci-additions&quot;&gt;CI Additions&lt;/h4&gt;
  5573.  
  5574. &lt;p&gt;We didn’t want to require our developers to run dialyzer each time (although we strongly recommended it) since our original goal explicitly included &lt;em&gt;not altering our development speed&lt;/em&gt;. That’s why we added dialyzer to our CI instead!&lt;/p&gt;
  5575.  
  5576. &lt;p&gt;But, of course, we still had 300 warnings. We couldn’t &lt;em&gt;require&lt;/em&gt; a clean run of dialyzer for each pull request. What we decided instead was to reject PRs if they included &lt;strong&gt;new dialyzer warnings&lt;/strong&gt;.&lt;/p&gt;
  5577.  
  5578. &lt;p&gt;We use &lt;a href=&quot;https://buildkite.com&quot;&gt;Buildkite&lt;/a&gt; for CI, so (using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;*.dialyzer_warnings&lt;/code&gt; once more) we added a new pipeline that generates a &lt;em&gt;normalized&lt;/em&gt; warnings list each time anything is pushed to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;master&lt;/code&gt;:&lt;/p&gt;
  5579.  
  5580. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-yaml&quot; data-lang=&quot;yaml&quot;&gt;&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;label&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;Run&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;dialyzer&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;and&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;upload&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;the&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;resulting&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;of&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;warnings&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s3&apos;&lt;/span&gt;
  5581.  &lt;span class=&quot;na&quot;&gt;command&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  5582.    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;make harsh_clean&lt;/span&gt;
  5583.    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;make dialyzer&lt;/span&gt;
  5584.    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;normalize_warnings.sh _build/default/*.dialyzer_warnings master.dialyzer_warnings&lt;/span&gt;
  5585.    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;aws s3 cp master.dialyzer_warnings s3://...&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  5586.  
  5587. &lt;p&gt;And that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;normalize_warnings.sh&lt;/code&gt; script? It looks like this:&lt;/p&gt;
  5588.  
  5589. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;c&quot;&gt;#!/bin/bash&lt;/span&gt;
  5590. &lt;span class=&quot;nb&quot;&gt;set&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-e&lt;/span&gt;
  5591.  
  5592. &lt;span class=&quot;nb&quot;&gt;cp&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$1&lt;/span&gt; pre.dw
  5593. &lt;span class=&quot;nb&quot;&gt;sed&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-e&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;s|&lt;/span&gt;&lt;span class=&quot;sb&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;pwd&lt;/span&gt;&lt;span class=&quot;sb&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;||g&quot;&lt;/span&gt; pre.dw | &lt;span class=&quot;nb&quot;&gt;sed&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-e&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;/^$/d&apos;&lt;/span&gt; | &lt;span class=&quot;nb&quot;&gt;sort&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$2&lt;/span&gt;
  5594. &lt;span class=&quot;nb&quot;&gt;rm &lt;/span&gt;pre.dw&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  5595.  
  5596. &lt;p&gt;Basically, three steps:&lt;/p&gt;
  5597. &lt;ol&gt;
  5598.  &lt;li&gt;Remove the current folder from the paths in the file&lt;/li&gt;
  5599.  &lt;li&gt;Remove empty lines&lt;/li&gt;
  5600.  &lt;li&gt;Sort them&lt;/li&gt;
  5601. &lt;/ol&gt;
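
&lt;p&gt;The three steps above can be sketched in Python as follows. This is an illustration under assumptions, not a replacement for the script: the function name and the sample paths are hypothetical, and Python sorts by code point while &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sort&lt;/code&gt; follows the current locale, so the order may differ for unusual characters.&lt;/p&gt;

```python
# Sketch of normalize_warnings.sh; the name and sample data are hypothetical.


def normalize(text, root):
    """Strip the checkout root, drop empty lines, and sort what is left."""
    lines = [line.replace(root, "") for line in text.splitlines()]
    return sorted(line for line in lines if line != "")


raw = "/ci/build/src/b.erl:2: x\n\n/ci/build/src/a.erl:9: y\n"
for line in normalize(raw, "/ci/build"):
    print(line)
```

&lt;p&gt;Removing the checkout folder is what makes lists produced on different CI agents comparable, and sorting is what lets a line-by-line comparison work.&lt;/p&gt;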
  5602.  
  5603. &lt;p&gt;Once we had that in place, we only needed to update our existing PR verification pipeline to generate, normalize and compare its list of warnings with the one from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;master&lt;/code&gt;. We did that by extending our already existing &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;make dialyzer&lt;/code&gt; as follows and verifying that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;new.dialyzer_warnings&lt;/code&gt; is empty on Buildkite.&lt;/p&gt;
  5604.  
  5605. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-makefile&quot; data-lang=&quot;makefile&quot;&gt;&lt;span class=&quot;nl&quot;&gt;dialyzerfast&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;##&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; Check the project with dialyzer&lt;/span&gt;
  5606.    &lt;span class=&quot;err&quot;&gt;@$(REBAR3)&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;dialyzer&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;$(REBAR3)&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;dialyzer&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;./scripts/instrument_dialyzer.sh&lt;/span&gt;
  5607.  
  5608. &lt;span class=&quot;nl&quot;&gt;new_dialyzer_warnings&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;##&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; Verifies if there are new dialyzer warnings&lt;/span&gt;
  5609.    &lt;span class=&quot;err&quot;&gt;normalize_warnings.sh&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;_build/default/*.dialyzer_warnings&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;branch.dialyzer_warnings&lt;/span&gt;
  5610.    &lt;span class=&quot;nl&quot;&gt;@aws s3 cp s3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;//.../master.dialyzer_warnings .&lt;/span&gt;
  5611.    &lt;span class=&quot;err&quot;&gt;comm&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;-2&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;-3&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;branch.dialyzer_warnings&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;master.dialyzer_warnings&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;new.dialyzer_warnings&lt;/span&gt;
  5612.  
  5613. &lt;span class=&quot;nl&quot;&gt;dialyzer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;##&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; Check the project with dialyzer and report new warnings&lt;/span&gt;
  5614. &lt;span class=&quot;nl&quot;&gt;dialyzer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;dialyzerfast new_dialyzer_warnings&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  5615.  
  5616. &lt;h4 id=&quot;caveat&quot;&gt;Caveat&lt;/h4&gt;
  5617.  
  5618. &lt;p&gt;Keen-eyed folks may notice that since we’re comparing files with line-numbers in them, we’re not exactly detecting &lt;em&gt;just&lt;/em&gt; new warnings. If your changes alter the line numbers of lines with &lt;em&gt;existing&lt;/em&gt; warnings, they will be reported as new ones.
  5619. When we noticed that, we decided that it was fair. The rule was: &lt;em&gt;If you modify a module with dialyzer warnings, you should fix them as part of your PR, too.&lt;/em&gt;&lt;/p&gt;
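
&lt;p&gt;To make both the comparison and this caveat concrete, here is a small Python sketch of what &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;comm -2 -3&lt;/code&gt; keeps: the lines that appear only in the first (branch) list. The function name and the sample warnings are hypothetical.&lt;/p&gt;

```python
from collections import Counter

# Sketch of comm -2 -3 on two normalized warning lists (hypothetical data).


def new_warnings(branch, master):
    """Lines present in branch but not in master, honoring duplicates."""
    remaining = Counter(master)
    fresh = []
    for line in branch:
        if remaining[line] == 0:
            fresh.append(line)
        else:
            remaining[line] -= 1
    return fresh


master = ["/src/a.erl:9: y"]

# A branch that adds a warning: only the added line is reported.
added = new_warnings(["/src/a.erl:9: y", "/src/b.erl:2: x"], master)
print(added)

# A branch that only shifts a.erl by two lines: the old warning looks new.
shifted = new_warnings(["/src/a.erl:11: y"], master)
print(shifted)
```

&lt;p&gt;The second call shows the caveat: nothing about the warning changed except its line number, yet it ends up in the list of new warnings.&lt;/p&gt;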
  5620.  
  5621. &lt;h1 id=&quot;caching-plts&quot;&gt;Caching PLTs&lt;/h1&gt;
  5622.  
  5623. &lt;p&gt;Now that we had added dialyzer to our CI, we had not slowed down our development time &lt;em&gt;when working on our computers&lt;/em&gt;, but we had still slowed down our CI times considerably. So, to regain that lost speed, we needed to avoid recompiling the PLTs in each run. Luckily for us, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3&lt;/code&gt; places the PLTs in a very convenient and configurable location. So, we adjusted our Buildkite pipelines as you can see below…&lt;/p&gt;
  5624.  
  5625. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-yaml&quot; data-lang=&quot;yaml&quot;&gt;&lt;span class=&quot;nn&quot;&gt;...&lt;/span&gt;
  5626.    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;aws s3 sync s3://.../ . --exclude &quot;*&quot; --include &quot;*plt&quot; --no-follow-symlinks&lt;/span&gt;
  5627.    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;make dialyzer&lt;/span&gt;
  5628.    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;aws s3 sync . s3://.../ --exclude &quot;*&quot; --include &quot;*plt&quot; --no-follow-symlinks&lt;/span&gt;
  5629. &lt;span class=&quot;nn&quot;&gt;...&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  5630.  
  5631. &lt;p&gt;We basically synced our PLTs with S3 each time. And that was it! CI runs fast as usual, detects new warnings and we all keep improving our code constantly until we eventually get to 0 warnings.&lt;/p&gt;
  5632.  
  5633. &lt;p&gt;Look, this is our progress last week…&lt;/p&gt;
  5634.  
  5635. &lt;p&gt;&lt;img src=&quot;/images/post_images/dialyzer/datadog_week.png&quot; alt=&quot;Datadog Week&quot; /&gt;&lt;/p&gt;
  5636.  
  5637. &lt;p&gt;We went from 300 to &lt;strong&gt;225&lt;/strong&gt; in just a week! And the most important part is that we effortlessly included dialyzer as part of our development process forever, thus increasing the quality of our code significantly.&lt;/p&gt;
  5638.  
  5639. &lt;p&gt;I hope this story inspires you to do the same in your project and be as happy and as proud of your code as we are of ours :)&lt;/p&gt;
  5640.  
  5641. &lt;hr /&gt;
  5642.  
  5643. &lt;p&gt;&lt;strong&gt;Do you enjoy building high-quality large-scale systems? &lt;a href=&quot;https://www.adroll.com/about/careers/&quot;&gt;Roll with Us&lt;/a&gt;!&lt;/strong&gt;&lt;/p&gt;
  5644.  
  5645. </description>
  5646.    </item>
  5647.    
  5648.    
  5649.    
  5650.    <item>
  5651.      <title>
  5652. AdRoll Group Loves the Grace Hopper Conference
  5653. </title>
  5654.      <link>https://tech.nextroll.com/blog/culture/2019/02/14/grace-hopper-recap.html</link>
  5655.      <pubDate>Thu, 14 Feb 2019 00:00:00 -0800</pubDate>
  5656.      <author></author>
  5657.      <guid isPermaLink="false">https://tech.nextroll.com/blog/culture/2019/02/14/grace-hopper-recap</guid>
  5658.      <description>&lt;p&gt;Grace Hopper is happening on October 2-4 2019 and
  5659. the &lt;a href=&quot;https://ghc.anitab.org/calendar/2019-grace-hopper-celebration/&quot;&gt;Call for Participation, Abi Award nominations, and Scholarship applications are now open&lt;/a&gt;!
  5660. Here is why we love Grace Hopper.&lt;/p&gt;
  5661.  
  5662. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;10-15 minute read&lt;/code&gt;&lt;/p&gt;
  5663.  
  5664. &lt;hr /&gt;
  5665.  
  5666. &lt;p&gt;In September 2018, sixteen Rollers joined around 20,000 attendees from all over the
  5667. world for Grace Hopper, the largest gathering of women in computing. Although women
  5668. make up more than half of the U.S. workforce, we hold fewer than 20 percent of
  5669. tech jobs, which makes this conference extra magical to the women in attendance.&lt;/p&gt;
  5670.  
  5671. &lt;p&gt;&lt;img src=&quot;/images/post_images/grace_hopper_recap/booth.png&quot; alt=&quot;Adroll Group recruiting table&quot; /&gt;&lt;/p&gt;
  5672.  
  5673. &lt;p&gt;As Julie Zhou, our Director of Product Management, described it, “I found it inspiring to be amongst
  5674. so many ambitious and talented women in one place…many people have a picture in their mind
  5675. of what a tech conference looks like and that picture tends to have lots of young, white men there.”&lt;/p&gt;
  5676.  
  5677. &lt;p&gt;&lt;img src=&quot;/images/post_images/grace_hopper_recap/wanted.png&quot; alt=&quot;Rubi, Farah, Julie and Andrew&quot; /&gt;&lt;/p&gt;
  5678.  
  5679. &lt;p&gt;Although there are plenty of opportunities to network, Grace Hopper is also an opportunity to learn.
  5680. There were ten technical tracks that covered topics all the way from product to Computer Systems
  5681. Engineering and Computer Science. Jessica Grist, Sr. Software Engineer at AdRoll Group, said she “really
  5682. liked that there was the option of a 20-minute quick introduction to a specific topic or a longer 1-hour
  5683. deep dive and that an in-depth technical talk could coexist with work-life balance talks. More than
  5684. anything, she loved the recruiting and interviewing efforts, focused on attracting and networking
  5685. with women in technology.”&lt;/p&gt;
  5686.  
  5687. &lt;p&gt;&lt;img src=&quot;/images/post_images/grace_hopper_recap/workshop.png&quot; alt=&quot;Miriam in a workshop&quot; /&gt;&lt;/p&gt;
  5688.  
  5689. &lt;p&gt;In the conference’s spirit to get and keep women in the computing workforce, we all happily spent
  5690. countless hours chatting with hundreds of women interested in careers across data science,
  5691. engineering, and product. “It was inspiring to see so much female talent in one place…
  5692. Also special for me was that my mom was there as an engineering lead at her company (UBS
  5693. Financial Services)… Her presence gave me a better appreciation for the women from the previous
  5694. generation,” said Kelly Eng, Sr. Product Manager at AdRoll Group.&lt;/p&gt;
  5695.  
  5696. &lt;p&gt;&lt;img src=&quot;/images/post_images/grace_hopper_recap/team.png&quot; alt=&quot;team2&quot; /&gt;&lt;/p&gt;
  5697.  
  5698. &lt;p&gt;We also appreciated the men who came to represent AdRoll with us! Patrick, our SVP of Engineering,
  5699. “was excited by all their enthusiasm and hoped he could make this a more welcoming industry.”
  5700. Attending a conference where men were the stark minority resulted in many of them empathizing
  5701. with our experience as women at most technology conferences.&lt;/p&gt;
  5702.  
  5703. &lt;p&gt;&lt;img src=&quot;/images/post_images/grace_hopper_recap/team2.png&quot; alt=&quot;team2&quot; /&gt;&lt;/p&gt;
  5704.  
  5705. &lt;p&gt;Overall, the enthusiasm during our recruiting efforts by our entire team throughout the
  5706. conference was a sincere reflection of the commitment to diversity and inclusion that I
  5707. have happily seen continuously nurtured throughout my year here at AdRoll Group.&lt;/p&gt;
  5708.  
  5709. &lt;p&gt;As if that wasn’t exciting enough, we successfully hired Kruti Chauhan, who shared,
  5710. “meeting with the Adroll Group Team, I learned about their support for women in their
  5711. engineering teams… it was indeed a celebration to bag an offer from AdRoll.”&lt;/p&gt;
  5712.  
  5713. &lt;p&gt;I can’t wait to see our investment in diversity continue to thrive this year!&lt;/p&gt;
  5714.  
  5715. &lt;hr /&gt;
  5716.  
  5717. &lt;p&gt;&lt;strong&gt;Do you enjoy working at a company that cares about a diverse and inclusive world? &lt;a href=&quot;https://www.adroll.com/about/careers/&quot;&gt;Roll with Us&lt;/a&gt;!&lt;/strong&gt;&lt;/p&gt;
  5718.  
  5719. </description>
  5720.    </item>
  5721.    
  5722.    
  5723.    
  5724.    <item>
  5725.      <title>
  5726. Managing DynamoDB Autoscaling with Lambda and Cloudwatch
  5727. </title>
  5728.      <link>https://tech.nextroll.com/blog/dev/2019/02/05/dynamodb-managed-autoscaling.html</link>
  5729.      <pubDate>Tue, 05 Feb 2019 00:00:00 -0800</pubDate>
  5730.      <author></author>
  5731.      <guid isPermaLink="false">https://tech.nextroll.com/blog/dev/2019/02/05/dynamodb-managed-autoscaling</guid>
  5732.      <description>&lt;p&gt;Autoscaling was a great addition to DynamoDB and it lets you forget about assigned capacity. Here’s how we implemented our own algorithm to improve on this idea.&lt;/p&gt;
  5733.  
  5734. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;10-15 minute read&lt;/code&gt;&lt;/p&gt;
  5735.  
  5736. &lt;hr /&gt;
  5737.  
  5738. &lt;p&gt;DynamoDB is a NoSQL key-value database service provided by AWS. We use it at the core of many of our systems, since its consistent read latency is well suited to realtime applications. If you are not familiar with it, it might be worth reviewing the basics in the &lt;a href=&quot;https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Introduction.html&quot;&gt;AWS Documentation&lt;/a&gt;.&lt;/p&gt;
  5739.  
  5740. &lt;p&gt;One of the first things you learn as a new DynamoDB user is that you have to manage your table capacity in order to keep both your needs and costs in check, which can be harder than it sounds.
  5741. DynamoDB tables and indexes offer 2 core metrics that you can use to achieve this: provisioned and consumed capacity.&lt;/p&gt;
  5742.  
  5743. &lt;p&gt;Initially, the only way around this problem was to assign the capacity manually, based on experience and traffic. If your traffic varied, you had to keep some margin to absorb the variations, which led to wasted capacity.
  5744. Eventually, DynamoDB introduced an autoscaling feature, which lets you set a target ratio of consumed to provisioned capacity, for example 70%. This means that if you set your table to 70% and you consume 7 units, you’ll get 10 provisioned units.&lt;/p&gt;
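
&lt;p&gt;The target-ratio arithmetic above can be sketched in a few lines of Python. This is only an illustration and not the AWS implementation: the function name and the integer-percent parameter are assumptions made for this example.&lt;/p&gt;

```python
def provisioned_for_target(consumed_units: int, target_percent: int) -> int:
    """Smallest provisioned capacity that keeps consumed / provisioned at or
    below the target ratio. Hypothetical helper, not part of any AWS API."""
    if not (target_percent >= 1 and 100 >= target_percent):
        raise ValueError("target_percent must be between 1 and 100")
    # Integer ceiling division avoids floating point surprises such as 7 / 0.7.
    return (consumed_units * 100 + target_percent - 1) // target_percent


# Consuming 7 units at a 70% target calls for 10 provisioned units.
print(provisioned_for_target(7, 70))
```

&lt;p&gt;Rounding up means the table never ends above the target ratio: 8 consumed units at 70% would need 12 provisioned units rather than 11.&lt;/p&gt;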
  5745.  
  5746. &lt;h2 id=&quot;problems-we-faced-with-the-default-autoscaling-algorithm&quot;&gt;Problems we faced with the default autoscaling algorithm&lt;/h2&gt;
  5747.  
  5748. &lt;p&gt;The default autoscaling algorithm provided by AWS works by setting up a series of alarms that trigger if the capacity is above the defined rate for more than 5 minutes (please visit &lt;a href=&quot;https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/AutoScaling.html&quot;&gt;the docs for autoscaling&lt;/a&gt; if you are interested in the details).&lt;/p&gt;
  5749.  
  5750. &lt;p&gt;&lt;img src=&quot;/images/post_images/dynamodb_autoscaling/autoscaling_graph.png&quot; alt=&quot;Stock Autoscaling Architecture&quot; /&gt;&lt;/p&gt;
  5751.  
  5752. &lt;p&gt;This poses a problem for an application with a varying, bursty workload: the table scales up based only on consumption, triggering these alarms time after time until it reaches the desired level. &lt;strong&gt;Ideally, the table should scale based on the number of requests that we are making, not the number of requests that are successful.&lt;/strong&gt;
  5753. Additionally, at the time of implementing this algorithm, the DynamoDB capacity could not be brought down automatically if the consumption was exactly zero, which can happen if you write to your table in batch instead of realtime, for example.&lt;/p&gt;
  5754.  
  5755. &lt;p&gt;From the AWS docs:&lt;/p&gt;
  5756.  
  5757. &lt;blockquote&gt;&lt;p&gt;
  5758. Currently, Auto Scaling does not scale down your provisioned capacity if your table’s consumed capacity becomes zero.
  5759. As a workaround, you can send requests to the table until Auto Scaling scales down to the minimum capacity,
  5760. or change the policy to reduce the maximum provisioned capacity to be the same as the minimum provisioned capacity.
  5761. &lt;/p&gt;&lt;/blockquote&gt;
  5762.  
  5763. &lt;p&gt;This meant that, when enabling autoscaling, tables that were read in realtime, but written to in batch, still needed manual intervention to bring the write capacity down after our jobs were done writing.
  5764. Fortunately, the DynamoDB team has recently fixed this issue, so at the time of writing, tables downscale on their own.
  5765. DynamoDB tables also have a hidden reserved burst capacity, which can be consumed to absorb traffic spikes, although DynamoDB can also use it for internal operations (more on that &lt;a href=&quot;https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-partition-key-design.html#bp-partition-key-throughput-bursting&quot;&gt;here&lt;/a&gt;).
  5766. Another interesting point that might bite users is that capacity decreases are an expensive operation for AWS, so they’re limited.&lt;/p&gt;
  5767.  
  5768. &lt;p&gt;From the AWS docs:&lt;/p&gt;
  5769.  
  5770. &lt;blockquote&gt;&lt;p&gt;
  5771. For every table and global secondary index in an UpdateTable operation, you can decrease ReadCapacityUnits or WriteCapacityUnits (or both).
  5772. The new settings do not take effect until the UpdateTable operation is complete. A decrease is allowed up to four times any time per day.
  5773. A day is defined according to the GMT time zone. Additionally, if there was no decrease in the past hour, an additional decrease is allowed, effectively bringing the maximum number of decreases in a day to 27 times (4 decreases in the first hour, and 1 decrease for each of the subsequent 1-hour windows in a day)
  5774. &lt;/p&gt;&lt;/blockquote&gt;
  5775.  
  5776. &lt;p&gt;The number of decreases cited in the documentation can only be reached under very special conditions: you need 4 decreases in the first hour of the day plus one in each of the remaining hours, for a total of 4 (first hour) + 23 (1 hourly) = 27. As you can imagine, getting to this number is rare and not entirely efficient.
  5777. You can read more about these limits &lt;a href=&quot;https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Limits.html#limits-decreasing-provisioned-throughput&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
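
&lt;p&gt;To make the 4 + 23 arithmetic concrete, here is a small model of the limits as quoted above. It is a simplified reading of that quote, not AWS code: the helper names are hypothetical, timestamps are assumed to be UTC, and the simulation checks only once per minute.&lt;/p&gt;

```python
from datetime import datetime, timedelta, timezone

DAILY_FREE_DECREASES = 4  # "up to four times any time per day"


def can_decrease(previous_decreases, now):
    """Return True if one more decrease fits the quoted limits.

    previous_decreases: UTC datetimes of earlier decreases.
    now: the current UTC datetime.
    Hypothetical helper based on a simplified reading of the quote.
    """
    today = [t for t in previous_decreases if t.date() == now.date()]
    if DAILY_FREE_DECREASES > len(today):
        return True
    # Past the first four, one more is allowed only after a quiet hour.
    one_hour = timedelta(hours=1)
    return all(now - t >= one_hour for t in today)


def max_decreases_in_a_day(day_start):
    """Greedily decrease as early as possible, checking once per minute."""
    history = []
    for minute in range(24 * 60):
        now = day_start + timedelta(minutes=minute)
        if can_decrease(history, now):
            history.append(now)
    return len(history)


print(max_decreases_in_a_day(datetime(2019, 2, 5, tzinfo=timezone.utc)))
```

&lt;p&gt;Under this model the greedy schedule reaches the documented maximum of 27 only because it spends all four free decreases in the first few minutes of the day and then waits exactly one hour between each of the remaining ones.&lt;/p&gt;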
  5778.  
  5779. &lt;h2 id=&quot;initial-approach&quot;&gt;Initial approach&lt;/h2&gt;
  5780.  
  5781. &lt;p&gt;Our initial approach was to manage capacity as fixed steps for batch loads:&lt;/p&gt;
  5782.  
  5783. &lt;ul&gt;
  5784.  &lt;li&gt;
  5785.    &lt;p&gt;Fixed step for read capacity
  5786. &lt;img src=&quot;/images/post_images/dynamodb_autoscaling/autoscaling_initial_read.png&quot; alt=&quot;Initial Read Approach&quot; /&gt;&lt;/p&gt;
  5787.  &lt;/li&gt;
  5788.  &lt;li&gt;
  5789.    &lt;p&gt;Fixed step for write capacity
  5790. &lt;img src=&quot;/images/post_images/dynamodb_autoscaling/autoscaling_initial_write.png&quot; alt=&quot;Initial Write Approach&quot; /&gt;&lt;/p&gt;
  5791.  &lt;/li&gt;
  5792. &lt;/ul&gt;
  5793.  
  5794. &lt;p&gt;While this is better than keeping the capacity up the whole time, you’ll realize that a lot of the time the assigned capacity is greater than needed and that it doesn’t go down as fast as it should.
  5795. This led to acceptable performance for the batch writes, but at the cost of limiting our options on when to run the jobs. As we added more jobs that either wrote to or read from these tables, we needed capacity to be managed automatically.&lt;/p&gt;
  5796.  
  5797. &lt;h2 id=&quot;helping-the-autoscaling-algorithm&quot;&gt;Helping the autoscaling algorithm&lt;/h2&gt;
  5798.  
  5799. &lt;p&gt;We really wanted to use autoscaling for these tables, as the step increase was too wasteful.
  5800. We came up with the idea of adding a Lambda function that would run every 5 minutes to check whether tables had been in the “0 capacity consumed” state for at least a few minutes, and bring the capacity down if that was the case.
  5801. We were finally able to use the stock algorithm for these tables at this point, which led to a nice cost drop.&lt;/p&gt;
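
&lt;p&gt;The idle check described above might look roughly like the following. This is a minimal sketch under stated assumptions, not our production Lambda: the function name, the 5-minute window and the idea of dropping straight to a configured minimum are all hypothetical, and a real function would read the samples from CloudWatch and apply the result through the DynamoDB API.&lt;/p&gt;

```python
def capacity_after_idle_check(consumed_per_minute, provisioned, minimum,
                              idle_minutes=5):
    """Return the capacity a table should have after the idle check.

    consumed_per_minute: consumed capacity samples, one per minute, most
        recent last (assumed to be zero-filled for minutes without data).
    provisioned: currently provisioned capacity units.
    minimum: capacity to fall back to when the table is idle (assumption).
    """
    recent = consumed_per_minute[-idle_minutes:]
    # Only call the table idle when the whole window is present and all zero.
    idle = len(recent) == idle_minutes and all(v == 0 for v in recent)
    if idle and provisioned > minimum:
        return minimum
    return provisioned


# A table that has consumed nothing for 5 minutes drops to its minimum.
print(capacity_after_idle_check([12, 0, 0, 0, 0, 0], provisioned=500, minimum=5))
```

&lt;p&gt;Requiring a full window of zeros keeps a single quiet minute in the middle of a batch job from triggering a decrease, which matters given how few decreases are allowed per day.&lt;/p&gt;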
  5802.  
  5803. &lt;p&gt;&lt;img src=&quot;/images/post_images/dynamodb_autoscaling/autoscaling_manager_graph.png&quot; alt=&quot;Autoscaling lambda Architecture&quot; /&gt;&lt;/p&gt;
  5804.  
  5805. &lt;p&gt;However, we still had the issue of tables raising capacity very slowly and not really paying attention to the number of requests that were being rejected.
  5806. This caused some of our operations (for example a full daily scan, or millions of requests for a batch job) to take hours of EMR/EC2 time.&lt;/p&gt;
  5807.  
  5808. &lt;h2 id=&quot;replacing-the-autoscaling-algorithm&quot;&gt;Replacing the autoscaling algorithm&lt;/h2&gt;
  5809.  
  5810. &lt;p&gt;Finally, building upon the lambda idea, we decided to replace the autoscaling algorithm entirely.
  5811. We designed a new version of the autoscaling lambda that uses tags and consumed and provisioned metrics like the last one, but that adds a new type of metric: &lt;strong&gt;the throttled requests&lt;/strong&gt;.&lt;/p&gt;
  5812.  
  5813. &lt;p&gt;Under normal conditions the algorithm works exactly the same as the normal autoscaling algorithm.
  5814. When throttling is present, the response of the algorithm is to increase capacity more aggressively, at fixed steps (we plan to make this proportional soon!).
  5815. Additionally, this algorithm works differently in the case of increases and decreases, which plays a little bit better with the limited decreases during the day.
  5816. While 27 decreases seems like a lot, the stock algorithm will lower the capacity as soon as the threshold is met, which can happen several times over the course of a few hours.&lt;/p&gt;
  5817.  
  5818. &lt;ul&gt;
  5819.  &lt;li&gt;Managed autoscaling for read capacity and how it relates to throttling&lt;/li&gt;
  5820. &lt;/ul&gt;
  5821.  
  5822. &lt;p&gt;&lt;img src=&quot;/images/post_images/dynamodb_autoscaling/autoscaling_read.png&quot; alt=&quot;Managed autoscaling read&quot; /&gt;
  5823. &lt;img src=&quot;/images/post_images/dynamodb_autoscaling/autoscaling_read_throttling.png&quot; alt=&quot;Managed autoscaling read throttling&quot; /&gt;&lt;/p&gt;
  5824.  
  5825. &lt;ul&gt;
  5826.  &lt;li&gt;Managed autoscaling for write capacity and how it relates to throttling&lt;/li&gt;
  5827. &lt;/ul&gt;
  5828.  
  5829. &lt;p&gt;&lt;img src=&quot;/images/post_images/dynamodb_autoscaling/autoscaling_write.png&quot; alt=&quot;Managed autoscaling write&quot; /&gt;
  5830. &lt;img src=&quot;/images/post_images/dynamodb_autoscaling/autoscaling_write_throttling.png&quot; alt=&quot;Managed autoscaling write throttling&quot; /&gt;&lt;/p&gt;
  5831.  
  5832. &lt;p&gt;At this point, we saw the cost of our batch tables drop to around 30% of the initial cost.&lt;/p&gt;
  5833.  
  5834. &lt;h2 id=&quot;example-algorithm&quot;&gt;Example algorithm&lt;/h2&gt;
  5835.  
  5836. &lt;p&gt;The algorithm we are implementing for the autoscaling lambda is fairly simple and written in Python 3, using &lt;a href=&quot;https://boto3.amazonaws.com/v1/documentation/api/latest/index.html&quot;&gt;boto3&lt;/a&gt;.&lt;/p&gt;
  5837.  
  5838. &lt;p&gt;First, we use the &lt;a href=&quot;https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/resourcegroupstaggingapi.html&quot;&gt;tagging client&lt;/a&gt; to figure out which tables we want to target:&lt;/p&gt;
  5839.  
  5840. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;n&quot;&gt;response&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;resource_groups_tagging_client&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_resources&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  5841.    &lt;span class=&quot;n&quot;&gt;TagFilters&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;
  5842.        &lt;span class=&quot;s&quot;&gt;&apos;Key&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;autoscaling_manager_enabled&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;Values&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;true&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  5843.    &lt;span class=&quot;p&quot;&gt;}],&lt;/span&gt;
  5844.    &lt;span class=&quot;n&quot;&gt;ResourceTypeFilters&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;dynamodb&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  5845. &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  5846. &lt;span class=&quot;n&quot;&gt;table_names&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;ResourceARN&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;split&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;/&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  5847.               &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;item&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;response&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;ResourceTagMappingList&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;
  5848.               &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;response&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;ResourceTagMappingList&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[])&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  5849.  
  5850. &lt;p&gt;The tag system could be used to manage other parameters as well, for example max or min settings.
  5851. Then, we make sure we are not stepping on a table that is using the AWS algorithm instead, using the &lt;a href=&quot;https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/application-autoscaling.html&quot;&gt;application autoscaling client&lt;/a&gt;:&lt;/p&gt;
  5852.  
  5853. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;n&quot;&gt;response&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;autoscaling_client&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;describe_scaling_policies&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  5854.    &lt;span class=&quot;n&quot;&gt;ServiceNamespace&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;dynamodb&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  5855.    &lt;span class=&quot;n&quot;&gt;ResourceId&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;table/{}&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;table_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  5856. &lt;span class=&quot;n&quot;&gt;is_autoscaling_enabled&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;bool&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;response&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;ScalingPolicies&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  5857.  
  5858. &lt;p&gt;Once we have the list of tables, we retrieve their definitions and current setup from the &lt;a href=&quot;https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/dynamodb.html&quot;&gt;dynamo client&lt;/a&gt;:&lt;/p&gt;
  5859.  
  5860. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;n&quot;&gt;table_definition_and_settings&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dynamo_client&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;describe_table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  5861.    &lt;span class=&quot;n&quot;&gt;TableName&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;table_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Table&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  5862.  
  5863. &lt;p&gt;At this point, we know the structure of the table and its currently provisioned capacity, so we use the &lt;a href=&quot;https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/cloudwatch.html&quot;&gt;cloudwatch client&lt;/a&gt; to retrieve historical data from the past few minutes about how the table is being used and how many requests have been throttled:&lt;/p&gt;
  5864.  
  5865. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;n&quot;&gt;end_date&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;datetime&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;datetime&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;utcnow&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;replace&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;second&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  5866.    &lt;span class=&quot;n&quot;&gt;microsecond&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  5867. &lt;span class=&quot;n&quot;&gt;start_date&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end_date&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;datetime&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;timedelta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;minutes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  5868. &lt;span class=&quot;n&quot;&gt;metric&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cloudwatch_resource&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Metric&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;AWS/DynamoDB&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  5869.    &lt;span class=&quot;n&quot;&gt;metric_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  5870. &lt;span class=&quot;n&quot;&gt;dimensions&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Name&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;TableName&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;Value&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;table_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  5871. &lt;span class=&quot;n&quot;&gt;response&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;metric&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_statistics&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  5872.    &lt;span class=&quot;n&quot;&gt;Dimensions&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dimensions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  5873.    &lt;span class=&quot;n&quot;&gt;StartTime&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start_date&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  5874.    &lt;span class=&quot;n&quot;&gt;EndTime&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;end_date&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  5875.    &lt;span class=&quot;n&quot;&gt;Period&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;60&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  5876.    &lt;span class=&quot;n&quot;&gt;Statistics&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Sum&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;
  5877. &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  5878. &lt;span class=&quot;n&quot;&gt;data_points&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  5879.    &lt;span class=&quot;n&quot;&gt;calendar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;timegm&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data_point&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Timestamp&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;timetuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()):&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data_point&lt;/span&gt;
  5880.    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data_point&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;response&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Datapoints&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]}&lt;/span&gt;
  5881. &lt;span class=&quot;n&quot;&gt;metric_report&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
  5882. &lt;span class=&quot;n&quot;&gt;current_date&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start_date&lt;/span&gt;
  5883. &lt;span class=&quot;k&quot;&gt;while&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;current_date&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end_date&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  5884.    &lt;span class=&quot;n&quot;&gt;current_timestamp&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;calendar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;timegm&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  5885.        &lt;span class=&quot;n&quot;&gt;current_date&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;timetuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
  5886.    &lt;span class=&quot;n&quot;&gt;sum_value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data_points&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  5887.        &lt;span class=&quot;n&quot;&gt;current_timestamp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}).&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Sum&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  5888.    &lt;span class=&quot;n&quot;&gt;metric_report&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;
  5889.        &lt;span class=&quot;s&quot;&gt;&apos;timestamp&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;current_timestamp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  5890.        &lt;span class=&quot;s&quot;&gt;&apos;value&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sum_value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;60&lt;/span&gt;
  5891.            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;metric_name&lt;/span&gt;
  5892.            &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CAPACITY_METRICS&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sum_value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  5893.    &lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  5894.    &lt;span class=&quot;n&quot;&gt;current_date&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;datetime&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;timedelta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;minutes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  5895.  
  5896. &lt;p&gt;Some metrics are averaged per minute while others are absolute values, so it took some experimenting to figure out how to match them to what we see in the DynamoDB UI.
  5897. It’s also worth mentioning that, for consistency, we needed to fill in a 0 value for every minute that has no data point. There’s a chance that this is related to the issue AWS faced when decreasing the capacity for tables without consumption.
  5898. Finally, we run a simple check over the collected metrics to decide if we need to change the table capacity. Keep in mind this is a simplified example that operates only on read capacity for the table:&lt;/p&gt;
  5899.  
  5900. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;n&quot;&gt;currently_provisioned&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;current_settings&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;ReadCapacityUnits&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  5901. &lt;span class=&quot;n&quot;&gt;throttling_metrics&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;usage_metrics&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;ReadThrottleEvents&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  5902. &lt;span class=&quot;n&quot;&gt;average_throttling&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  5903.    &lt;span class=&quot;nb&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;value&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;record&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;throttling_metrics&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;
  5904.    &lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;throttling_metrics&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  5905. &lt;span class=&quot;n&quot;&gt;consumed_metrics&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;usage_metrics&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;ConsumedReadCapacityUnits&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  5906. &lt;span class=&quot;n&quot;&gt;average_consumed&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  5907.    &lt;span class=&quot;nb&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;value&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;record&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;consumed_metrics&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;
  5908.    &lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;consumed_metrics&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  5909. &lt;span class=&quot;n&quot;&gt;utilization&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;average_consumed&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;currently_provisioned&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# Current utilization, the ratio the stock algorithm compares against its target
  5910. &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;can_increase_capacity&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;currently_provisioned&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;MAX_READ_CAPACITY&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# We set a cap to keep the cost in check
  5911. &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;target_provisioning&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;average_consumed&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;TARGET_UTILIZATION&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# Our target utilization is 0.8, higher than the original max value
  5912. &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;min_provisioning&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# Arbitrary minimum
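  5913. &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;new_capacity_settings&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# Stays None when none of the rules below applies&lt;/span&gt;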
  5914. &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;average_throttling&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;and&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;can_increase_capacity&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# While throttling, we do quantized increases, with 2 fixed rates
  5915. &lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;new_capacity_settings&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;currently_provisioned&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;
  5916.        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;HIGH_THROTTLING_CAPACITY_INCREASE&lt;/span&gt;
  5917.         &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;average_throttling&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;THROTTLING_THRESHOLD&lt;/span&gt;
  5918.         &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;LOW_THROTTLING_CAPACITY_INCREASE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  5919. &lt;span class=&quot;k&quot;&gt;elif&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;currently_provisioned&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;min_provisioning&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;or&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;not&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;average_consumed&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  5920.    &lt;span class=&quot;n&quot;&gt;new_capacity_settings&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;min_provisioning&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# If we are below min or nothing is using this table
  5921. &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;elif&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;utilization&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;TARGET_UTILIZATION&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;and&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;can_increase_capacity&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  5922.    &lt;span class=&quot;n&quot;&gt;new_capacity_settings&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;target_provisioning&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# This means we are a bit tight, so we increase to match 0.8
  5923. &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;elif&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;not&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;average_throttling&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;and&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;utilization&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BUFFERED_TARGET_UTILIZATION&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  5924.    &lt;span class=&quot;n&quot;&gt;new_capacity_settings&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;target_provisioning&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# This means we are wasting some capacity, so we decrease it
  5925. &lt;/span&gt;
  5926. &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;new_capacity_settings&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;and&lt;/span&gt;
  5927.    &lt;span class=&quot;n&quot;&gt;new_capacity_settings&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;currently_provisioned&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  5928.    &lt;span class=&quot;c1&quot;&gt;# Proceed to update the table with the new settings
  5929. &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  5930.    &lt;span class=&quot;c1&quot;&gt;# Skip updating this table&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  5931.  
  5932. &lt;p&gt;While this code might be a little long to read, it’s pretty straightforward: it works exactly like the stock DynamoDB algorithm but it will increase the capacity aggressively by a fixed amount when there’s throttling involved.&lt;/p&gt;
  5933.  
  5934. &lt;p&gt;Please check this &lt;a href=&quot;https://gist.github.com/alvarotuso/589ea02dfa5823328e0f1e356563b8a1&quot;&gt;gist&lt;/a&gt; for the complete implementation, including support for write capacity and GSIs!
  5935. Keep in mind that the fixed steps for throttling suited our needs, but a proportional approach could be used with any table workload. Another improvement would be to support multi-region tables (right now, most of our tables are in one region).&lt;/p&gt;
  5936.  
  5937. &lt;p&gt;If you want to give it a try, you’ll have to create a &lt;a href=&quot;https://docs.aws.amazon.com/lambda/latest/dg/welcome.html&quot;&gt;Lambda function&lt;/a&gt; with the code from the gist, and set up a &lt;a href=&quot;https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/Create-CloudWatch-Events-Scheduled-Rule.html&quot;&gt;CloudWatch rule&lt;/a&gt; to trigger the function every 5 minutes.&lt;/p&gt;
  5938.  
  5939. &lt;p&gt;Make sure that the role used by the Lambda function has the following permissions:&lt;/p&gt;
  5940. &lt;ul&gt;
  5941.  &lt;li&gt;Read / Write permissions on the DynamoDB tables you want to test&lt;/li&gt;
  5942.  &lt;li&gt;Read permissions on CloudWatch&lt;/li&gt;
  5943.  &lt;li&gt;Read permissions on Application Autoscaling&lt;/li&gt;
  5944.  &lt;li&gt;Read permissions on Resource Groups Tagging&lt;/li&gt;
  5945. &lt;/ul&gt;
  5946.  
  5947. &lt;p&gt;Once you’ve set everything up, the only missing piece is adding the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;autoscaling_manager_enabled=true&lt;/code&gt; tag to your table.
  5948. You should be able to monitor the execution of your Lambda function and the status of the DynamoDB table through the AWS console.&lt;/p&gt;
  5949.  
  5950. &lt;h2 id=&quot;about-on-demand-mode&quot;&gt;About On-Demand mode&lt;/h2&gt;
  5951.  
  5952. &lt;p&gt;On-Demand mode is a new addition to DynamoDB that might help ease the pain of managing table capacity. The premise is that you pay per request, which suits unknown loads.
  5953. While this sounds like a very good solution to the same problem, the pricing scheme is different and there’s currently no way to reserve this capacity.&lt;/p&gt;
  5954.  
  5955. &lt;hr /&gt;
  5956.  
  5957. &lt;p&gt;&lt;strong&gt;Do you enjoy building high-quality large-scale systems? &lt;a href=&quot;https://www.adroll.com/about/careers/&quot;&gt;Roll with Us&lt;/a&gt;!&lt;/strong&gt;&lt;/p&gt;
  5958.  
  5959. </description>
  5960.    </item>
  5961.    
  5962.    
  5963.    
  5964.    <item>
  5965.      <title>Questioning Second-Price Auctions in Ad Tech</title>
  5966.      <link>https://tech.nextroll.com/blog/data-science/2018/12/21/second-price-auctions.html</link>
  5967.      <pubDate>Fri, 21 Dec 2018 00:00:00 -0800</pubDate>
  5968.      <author></author>
  5969.      <guid isPermaLink="false">https://tech.nextroll.com/blog/data-science/2018/12/21/second-price-auctions</guid>
  5970.      <description>&lt;h1 id=&quot;introduction&quot;&gt;Introduction&lt;/h1&gt;
  5971.  
  5972. &lt;p&gt;A primary responsibility of the data science team at the AdRoll Group is to determine what we should bid in real-time
  5973. auctions for ad inventory. Among other things, our bid price is dependent on the type of auction in which we are
  5974. participating.&lt;/p&gt;
  5975.  
  5976. &lt;p&gt;Second-price auctions, where the highest bidder wins and pays the second highest bid price, are popular in ad tech
  5977. because they incentivize bidders, such as AdRoll, to bid their true value for the ad impression being auctioned
  5978. &lt;sup&gt;&lt;a href=&quot;#footnotes&quot;&gt;(1)&lt;/a&gt;&lt;/sup&gt;. First-price auctions, where the highest bidder wins and pays their bid, are also popular,
  5979. but they incentivize bidders to reduce their bids below their valuation. Exchanges communicate the auction type to bidders,
  5980. like us, in the bid request. From the perspective of ad exchanges, there exists a (shortsighted) incentive to make bidders
  5981. believe they are bidding into second-price auctions, so they bid their true valuation, while charging them something more
  5982. than the second price.&lt;/p&gt;
  5983.  
  5984. &lt;p&gt;In this post, we consider only auctions in which we are explicitly told the auction is second-price. We will examine how our
  5985. clearing prices compare to our bid prices in these second-price auctions, and we will observe some irregular patterns that
  5986. lead us to question whether these auctions really are second-price. We will then discuss the algorithmic defense against
  5987. misspecified auction mechanisms that we’ve developed at the AdRoll Group.&lt;/p&gt;
  5988.  
  5989. &lt;h1 id=&quot;questioning-auction-mechanisms&quot;&gt;Questioning Auction Mechanisms&lt;/h1&gt;
  5990. &lt;p&gt;In this section we will examine how our clearing prices compare to our bid prices in what are supposed to
  5991. be second-price auctions on four exchanges, each displaying a unique relationship between clearing prices and bid
  5992. prices. We will do this in two ways, and in both cases we will observe some questionable patterns. First, we will graphically
  5993. examine how clearing prices are related to the bid prices in second-price auctions. Second, we will compare the empirical
  5994. utility we gain from second-price auctions to theoretical calculations.&lt;/p&gt;
  5995.  
  5996. &lt;h2 id=&quot;graphical-analysis-of-second-price-auctions&quot;&gt;Graphical Analysis of Second-Price Auctions&lt;/h2&gt;
  5997. &lt;p&gt;In this subsection we will look at histograms of the ratios of our clearing price to our bid price, and scatter plots
  5998. of our clearing prices as a function of our bid prices.&lt;/p&gt;
  5999.  
  6000. &lt;h4 id=&quot;exchange-a&quot;&gt;Exchange A&lt;/h4&gt;
  6001. &lt;p&gt;We begin with an exchange that we believe runs true second-price auctions, for the most part. One reason we believe this is
  6002. that we regularly win impressions for a small fraction of our bid price, as you can see in the plots below. The only odd
  6003. feature in these plots is that we sometimes pay what we bid, but compared to other exchanges this effect is very mild.&lt;/p&gt;
  6004.  
  6005. &lt;table&gt;
  6006.  &lt;tbody&gt;
  6007.    &lt;tr&gt;
  6008.      &lt;td&gt;&lt;img src=&quot;/images/post_images/second_price_utility/hist_exchange_A.png&quot; alt=&quot;&quot; /&gt;&lt;/td&gt;
  6009.      &lt;td&gt;&lt;img src=&quot;/images/post_images/second_price_utility/scatter_exchange_A.png&quot; alt=&quot;&quot; /&gt;&lt;/td&gt;
  6010.    &lt;/tr&gt;
  6011.  &lt;/tbody&gt;
  6012. &lt;/table&gt;
  6013.  
  6014. &lt;h4 id=&quot;exchange-b&quot;&gt;Exchange B&lt;/h4&gt;
  6015. &lt;p&gt;Next, we analyze an exchange whose second-price auctions we have reason to question. On this exchange we almost never win
  6016. impressions for a small fraction of our bid price. Most of the time we end up paying exactly what we bid. In fact, these
  6017. auctions look more like first-price auctions than second-price auctions.&lt;/p&gt;
  6018.  
  6019. &lt;table&gt;
  6020.  &lt;tbody&gt;
  6021.    &lt;tr&gt;
  6022.      &lt;td&gt;&lt;img src=&quot;/images/post_images/second_price_utility/hist_exchange_B.png&quot; alt=&quot;&quot; /&gt;&lt;/td&gt;
  6023.      &lt;td&gt;&lt;img src=&quot;/images/post_images/second_price_utility/scatter_exchange_B.png&quot; alt=&quot;&quot; /&gt;&lt;/td&gt;
  6024.    &lt;/tr&gt;
  6025.  &lt;/tbody&gt;
  6026. &lt;/table&gt;
  6027.  
  6028. &lt;h4 id=&quot;exchange-c&quot;&gt;Exchange C&lt;/h4&gt;
  6029. &lt;p&gt;Here is another exchange whose second-price auctions are irregular &lt;sup&gt;&lt;a href=&quot;#footnotes&quot;&gt;(2)&lt;/a&gt;&lt;/sup&gt;. On this exchange, as
  6030. with the previous exchange, we see that we are often charged exactly what we bid. Moreover, we are often charged
  6031. roughly 90% of our bid price. How can the second price in an auction so frequently be a function of our bid price?&lt;/p&gt;
  6032.  
  6033. &lt;table&gt;
  6034.  &lt;tbody&gt;
  6035.    &lt;tr&gt;
  6036.      &lt;td&gt;&lt;img src=&quot;/images/post_images/second_price_utility/hist_exchange_C.png&quot; alt=&quot;&quot; /&gt;&lt;/td&gt;
  6037.      &lt;td&gt;&lt;img src=&quot;/images/post_images/second_price_utility/scatter_exchange_C.png&quot; alt=&quot;&quot; /&gt;&lt;/td&gt;
  6038.    &lt;/tr&gt;
  6039.  &lt;/tbody&gt;
  6040. &lt;/table&gt;
  6041.  
  6042. &lt;h4 id=&quot;exchange-d&quot;&gt;Exchange D&lt;/h4&gt;
  6043. &lt;p&gt;This exchange does not exhibit the same strange patterns as “Exchange B” or “Exchange C”. Nonetheless, we almost never win
  6044. impressions for a small fraction of our bid price. Put another way, the bottom right corner of the scatter plot below is
  6045. conspicuously empty. Can this really arise from second-price auctions?&lt;/p&gt;
  6046.  
  6047. &lt;table&gt;
  6048.  &lt;tbody&gt;
  6049.    &lt;tr&gt;
  6050.      &lt;td&gt;&lt;img src=&quot;/images/post_images/second_price_utility/hist_exchange_D.png&quot; alt=&quot;&quot; /&gt;&lt;/td&gt;
  6051.      &lt;td&gt;&lt;img src=&quot;/images/post_images/second_price_utility/scatter_exchange_D.png&quot; alt=&quot;&quot; /&gt;&lt;/td&gt;
  6052.    &lt;/tr&gt;
  6053.  &lt;/tbody&gt;
  6054. &lt;/table&gt;
  6055.  
  6056. &lt;h2 id=&quot;analysis-of-utility-in-second-price-auctions&quot;&gt;Analysis of Utility in Second-Price Auctions&lt;/h2&gt;
  6057. &lt;p&gt;In this subsection we will analyze second-price auctions by comparing the utility we realize empirically to the utility we
  6058. theoretically expect to realize. Assume we are in a second-price auction, and let \(M\) be a random variable representing
  6059. the highest bid among all other auction participants with PDF \(f_M\) and CDF \(F_M\). The fact that we are in a second-price
  6060. auction implies our bid \(b\) should be our valuation \(v\). We will win the auction when \(M &amp;lt; v\), and the utility we realize
  6061. when we win is \(v-M\). Putting this together, the expected utility in second-price auctions \(U_{SP}\) can be written as&lt;/p&gt;
  6062.  
  6063. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  6064. %&lt;![CDATA[
  6065. U_{SP} = E\left[\mathbb{1}_{\{M&lt;v\}}\left(v-M\right)\right]
  6066. %]]&gt;
  6067. &lt;/script&gt;
  6068.  
  6069. &lt;p&gt;Chris Evans, a Staff Data Science Engineer at the AdRoll Group, worked out a very nice result for this quantity. Following
  6070. the derivation of the tail-sum formula for expectation, we have &lt;sup&gt;&lt;a href=&quot;#footnotes&quot;&gt;(3)&lt;/a&gt;&lt;/sup&gt;:&lt;/p&gt;
  6071.  
  6072. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  6073. %&lt;![CDATA[
  6074. \begin{align*}
  6075.  U_{SP}  &amp;= \int_0^v (v-m) \ f_M(m) \ dm &amp;&amp; \text{Definition of expectation.} \\
  6076.     &amp;= \int_0^v \left( \int_0^\infty \mathbb{1}_{\{(v-m)&gt;x\}}\ dx \right) f_M(m) \ dm &amp;&amp; \text{Rewrite with indicator.}\\
  6077.     &amp;= \int_0^\infty \left( \int_0^v \mathbb{1}_{\{(v-m)&gt;x\}}\ f_M(m) \ dm \right) dx &amp;&amp; \text{Exchange order of integration.}\\
  6078.     &amp;= \int_0^\infty F_M(v-x)\ dx &amp;&amp; \text{Definition of CDF.}\\
  6079.     &amp;= \int_0^v F_M(x)\ dx &amp;&amp; \text{Change of variables.}
  6080. \end{align*}
  6081. %]]&gt;
  6082. &lt;/script&gt;
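
&lt;p&gt;As a quick sanity check of this identity, consider a purely hypothetical case (our assumption for illustration, not a model of real bid data) in which \(M\) is uniform on \([0, 1]\) and \(0 \le v \le 1\), so that \(f_M(m) = 1\) and \(F_M(x) = x\) on that interval. The first and last lines of the derivation then give the same value:&lt;/p&gt;

&lt;script type=&quot;math/tex; mode=display&quot;&gt;
%&lt;![CDATA[
\begin{align*}
 \int_0^v (v-m) \ f_M(m) \ dm &amp;= \left[ vm - \frac{m^2}{2} \right]_0^v = \frac{v^2}{2} \\
 \int_0^v F_M(x)\ dx &amp;= \left[ \frac{x^2}{2} \right]_0^v = \frac{v^2}{2}
\end{align*}
%]]&gt;
&lt;/script&gt;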
  6083.  
  6084. &lt;p&gt;This result enables us to predict the utility we will realize in a second-price auction given \(F_M(x)\), information about
  6085. how the bids of other bidders are distributed. Fortunately, we had already developed an accurate model to predict \(F_M(x)\)
  6086. for other applications, as we will see in the next section. Now that we are able to calculate the predicted utility we can
  6087. compare it to the empirically realized utility. Below we can see a plot of the average utility empirically realized for
  6088. different (binned) values of predicted utility.&lt;/p&gt;
  6089.  
  6090. &lt;p&gt;&lt;img src=&quot;/images/post_images/second_price_utility/utility_in_second_price_auctions.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
  6091.  
  6092. &lt;p&gt;As this figure shows, this calculation predicts we will realize much more utility than we do for the three exchanges where
  6093. we saw irregular auction mechanisms (namely, B, C, and D). On the other hand, this calculation comes fairly close to correctly
  6094. predicting the utility we will realize in the case of “Exchange A”, where the auction mechanisms look truly second-price.
  6095. This plot is another reason to question the auction mechanisms as communicated by some exchanges, and further highlights our
  6096. need for an algorithmic response.&lt;/p&gt;
  6097.  
  6098. &lt;h1 id=&quot;responding-to-auction-mechanisms&quot;&gt;Responding to Auction Mechanisms&lt;/h1&gt;
  6099. &lt;p&gt;In this section we will discuss the algorithmic defenses we have built in response to misspecified auction mechanisms.
  6100. As background, we will first discuss bidding into auctions with well-specified auction mechanisms.&lt;/p&gt;
  6101.  
  6102. &lt;h2 id=&quot;bidding-into-specified-auction-mechanisms&quot;&gt;Bidding into Specified Auction Mechanisms&lt;/h2&gt;
  6103. &lt;p&gt;Ideally, we would know the exact auction mechanism into which we are bidding because it affects our optimal bid. With knowledge
  6104. of the true auction mechanism, we’d have an understanding of how the cost \(C\) of the impression varies as a function of our
  6105. bid. For example, in a first-price auction, where we pay what we bid, \(C(b) = b\). Given \(C\) we can compute the expected utility
  6106. for our bid as&lt;/p&gt;
  6107.  
  6108. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  6109. %&lt;![CDATA[
  6110. U(b) = E\left[ \mathbb{1}_{\{M&lt;b\}}\left(v - C(b) \right)\right] \\
  6111. %]]&gt;
  6112. &lt;/script&gt;
  6113.  
  6114. &lt;p&gt;Ryoma Canastra, a Data Science Engineer at the AdRoll Group, proved that to maximize the value we provide to customers,
  6115. each bid we submit should be chosen to maximize \(U\) &lt;sup&gt;&lt;a href=&quot;#footnotes&quot;&gt;(4)&lt;/a&gt;&lt;/sup&gt;. For example, in a first-price auction
  6116. our bid \(b_{FP}\) for an impression we valued at \(v\) dollars would be &lt;sup&gt;&lt;a href=&quot;#footnotes&quot;&gt;(5)&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
  6117.  
  6118. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  6119. %&lt;![CDATA[
  6120. \begin{align*}
  6121.  b_{FP} &amp;= \text{argmax}_{b} \ E\left[ \mathbb{1}_{\{M&lt;b\}}\left(v - b \right)\right] \\
  6122.         &amp;= \text{argmax}_{b} \ F_M(b) \left(v - b \right)
  6123. \end{align*}
  6124. %]]&gt;
  6125. &lt;/script&gt;
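
&lt;p&gt;For intuition, take a toy example (an assumption for illustration only, not our production model): if \(M\) were uniform on \([0, 1]\) and \(0 \le v \le 1\), then \(F_M(b) = b\), the objective \(b \left(v - b\right)\) is concave in \(b\), and the maximum is found by setting its derivative to zero:&lt;/p&gt;

&lt;script type=&quot;math/tex; mode=display&quot;&gt;
%&lt;![CDATA[
\begin{align*}
 \frac{d}{db} \left[ b \left(v - b \right) \right] &amp;= v - 2b = 0 \\
 b_{FP} &amp;= \frac{v}{2}
\end{align*}
%]]&gt;
&lt;/script&gt;

&lt;p&gt;In this toy case we would shade our bid down to half of our valuation.&lt;/p&gt;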
  6126.  
  6127. &lt;h2 id=&quot;bidding-into-misspecified-auction-mechanisms&quot;&gt;Bidding into Misspecified Auction Mechanisms&lt;/h2&gt;
  6128. &lt;p&gt;Without knowledge of the auction mechanism we cannot bid in the theoretically optimal way. Instead, we have to resort
  6129. to a more empirical approach. The approach we’ve settled on is quite simple. When we aren’t confident in the auction mechanism,
  6130. we predict it.&lt;/p&gt;
  6131.  
  6132. &lt;p&gt;Specifically, we predict \(p_{FP}\), the probability the auction will result in us paying something close to first-price.
  6133. To simplify things, we assume there are only first- and second-price auctions, and train a classifier to differentiate these
  6134. cases. The more confident we are in having to pay a large fraction of our bid price, the more we reduce our bid. In particular,
  6135. we bid&lt;/p&gt;
  6136.  
  6137. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  6138. %&lt;![CDATA[
  6139. \begin{align*}
  6140.  b &amp;= p_{FP} \ b_{FP} + (1 - p_{FP}) \ v
  6141. \end{align*}
  6142. %]]&gt;
  6143. &lt;/script&gt;
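
&lt;p&gt;As a worked example with made-up numbers (all three values are hypothetical), suppose we value an impression at \(v = 1.00\), the corresponding first-price bid is \(b_{FP} = 0.50\), and the classifier predicts \(p_{FP} = 0.8\). Then&lt;/p&gt;

&lt;script type=&quot;math/tex; mode=display&quot;&gt;
%&lt;![CDATA[
\begin{align*}
 b &amp;= 0.8 \times 0.50 + (1 - 0.8) \times 1.00 \\
   &amp;= 0.40 + 0.20 = 0.60
\end{align*}
%]]&gt;
&lt;/script&gt;

&lt;p&gt;so the bid lands closer to the first-price bid than to our valuation, reflecting how likely we think it is that we will pay close to what we bid.&lt;/p&gt;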
  6144.  
  6145. &lt;p&gt;This approach has a few nice properties. First, this approach reduces to optimal behavior when we are confident in the auction
  6146. mechanism. For example, when we are confident that we are in a first-price auction (\(p_{FP} = 1\)) we will bid \(b_{FP}\), which
  6147. is optimal. Similarly, when we are confident that we are in a second-price auction (\(p_{FP} = 0\)) we will bid \(v\), which is
  6148. also optimal. Second, this approach learns and updates our bids as auction mechanisms change. Third, this approach is scalable,
  6149. requiring minimal human intervention. Most important, we like this approach because it has performed well in A/B tests, capturing
  6150. significantly more utility for us on behalf of our customers than submitting the optimal bid for the auction type passed by
  6151. exchanges.&lt;/p&gt;
  6152.  
  6153. &lt;h1 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h1&gt;
  6154.  
  6155. &lt;p&gt;In the first section of this post we saw evidence that second-price auctions in ad tech may not always be purely second-price.
  6156. First, we saw irregularities in plots relating our clearing prices to our bid prices in what were supposed to be second-price
  6157. auctions. Next, we saw that the utility we realize in the questionable second-price auctions is much less than theory would
  6158. predict.&lt;/p&gt;
  6159.  
  6160. &lt;p&gt;That said, in some cases there may be harmless explanations for the observations of this blog post. It’s possible that header
  6161. bidding or different bid densities in the different exchanges could cause qualitatively different clearing price vs. bid price
  6162. behavior. Or, perhaps we are miscommunicating with certain exchanges about the type of auctions in which we are participating.&lt;/p&gt;
  6163.  
  6164. &lt;p&gt;Nonetheless, we have decided to defend against the possibility of misspecified auction mechanisms. In the second section of
  6165. this post we saw an approach for algorithmically defending against opaque auction mechanisms, one of the many ways the
  6166. data science team is working to deliver maximal value for our customers.&lt;/p&gt;
  6167.  
  6168. &lt;h1 id=&quot;-footnotes&quot;&gt;&lt;a name=&quot;footnotes&quot;&gt;&lt;/a&gt; Footnotes&lt;/h1&gt;
  6169. &lt;ol&gt;
  6170.  &lt;li&gt;The &lt;a href=&quot;https://en.wikipedia.org/wiki/Vickrey_auction&quot;&gt;Vickrey auction&lt;/a&gt; Wikipedia page explains why bidding your valuation
  6171. is optimal in a second-price auction.&lt;/li&gt;
  6172.  &lt;li&gt;We only consider a subset of second-price auctions for this exchange. These are all private marketplace deals understood to
  6173. be second-price auctions.&lt;/li&gt;
  6174.  &lt;li&gt;Section 1.2.1 of &lt;a href=&quot;https://inst.eecs.berkeley.edu/~cs70/su16/static/su16/extra_note/sinho_cs_70_notes.pdf&quot;&gt;these course notes&lt;/a&gt;
  6175. has a proof of the discrete tail-sum formula.&lt;/li&gt;
  6176.  &lt;li&gt;The details are beyond the scope of this post.&lt;/li&gt;
  6177.  &lt;li&gt;This was the original reason we developed \(F_M(x)\), which we used in the first section to predict the utility realized
  6178. in a second-price auction.&lt;/li&gt;
  6179. &lt;/ol&gt;
  6180.  
  6181. &lt;hr /&gt;
  6182. </description>
  6183.    </item>
  6184.    
  6185.    
  6186.    
  6187.    <item>
  6188.      <title>
  6189. $ marks the spot: saving a ton with EC2 Spot Fleet
  6190. </title>
  6191.      <link>https://tech.nextroll.com/blog/dev/ops/2018/10/15/x-marks-the-spot.html</link>
  6192.      <pubDate>Mon, 15 Oct 2018 00:00:00 -0700</pubDate>
  6193.      <author></author>
  6194.      <guid isPermaLink="false">https://tech.nextroll.com/blog/dev/ops/2018/10/15/x-marks-the-spot</guid>
  6195.      <description>
  6198.  
  6199. &lt;p&gt;We reduced the EC2 instance cost of our 1000+ node, globally distributed, log-producing application by over 60% by migrating it entirely to EC2 Spot Fleet.  We discuss some of the issues we faced and also present the log data recovery mechanism that helped enable this migration.&lt;/p&gt;
  6200.  
  6201. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;15-20 minute read&lt;/code&gt;&lt;/p&gt;
  6202.  
  6203. &lt;hr /&gt;
  6204.  
  6205. &lt;h1 id=&quot;background&quot;&gt;Background&lt;/h1&gt;
  6206.  
  6207. &lt;p&gt;AdRoll operates a globally-distributed &lt;a href=&quot;/blog/dev/2018/01/08/quaff-that-potion-saving-millions-with-elixir-and-erlang.html#real-time-bidding&quot;&gt;real-time bidding&lt;/a&gt;
  6208.  platform running on &lt;a href=&quot;https://aws.amazon.com/ec2/&quot;&gt;Amazon EC2&lt;/a&gt;.  This platform has historically used &lt;a href=&quot;https://aws.amazon.com/ec2/autoscaling/&quot;&gt;EC2
  6209.  Auto Scaling&lt;/a&gt; in order to scale a fleet of &lt;a href=&quot;http://www.erlang.org/&quot;&gt;Erlang/OTP&lt;/a&gt;
  6210.  application nodes according to load, typically exceeding 1000 nodes (16000 vCPUs)
  6211.  during times of peak volume.  This article describes how we substantially reduced our
  6212.  EC2 costs by migrating this application entirely to &lt;a href=&quot;https://aws.amazon.com/blogs/aws/amazon-ec2-spot-fleet-api-manage-thousands-of-instances-with-one-request/&quot;&gt;EC2 Spot
  6213.  Fleet&lt;/a&gt;.&lt;/p&gt;
  6214.  
  6215. &lt;h2 id=&quot;motivation&quot;&gt;Motivation&lt;/h2&gt;
  6216.  
  6217. &lt;p&gt;A focus of the company has always been operational efficiency; money not spent is money
  6218.  earned, and reducing operating costs by &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;$X&lt;/code&gt; is equivalent to earning &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;$X / margin&lt;/code&gt; in
  6219.  additional revenue (plus the &lt;a href=&quot;https://en.wikipedia.org/wiki/Time_value_of_money&quot;&gt;time value of money&lt;/a&gt; for
  6220.  avoided outlays).&lt;/p&gt;
  6221.  
  6222. &lt;h3 id=&quot;reserved-instances&quot;&gt;Reserved Instances&lt;/h3&gt;
  6223.  
  6224. &lt;p&gt;Historically, we reduced operating costs for this application by
  6225.  purchasing &lt;a href=&quot;https://aws.amazon.com/ec2/pricing/reserved-instances/&quot;&gt;Reserved Instances&lt;/a&gt;.  With a partial up-front
  6226.  reservation, we pay an up-front fee per instance and are then billed a reduced hourly fee for every
  6227.  instance-hour of the term, regardless of usage.  For what had until now been our application’s
  6228.  primary instance type, this would have worked out to savings of about 40% relative to
  6229.  the on-demand price.&lt;/p&gt;
  6230.  
  6231. &lt;p&gt;A problem with this approach becomes apparent when considering autoscaling behavior:&lt;/p&gt;
  6232.  
  6233. &lt;p&gt;&lt;img src=&quot;/images/post_images/spot-fleet/rtb-load-fluctuation.png&quot; alt=&quot;RTB application load fluctuation image&quot; /&gt;&lt;/p&gt;
  6234.  
  6235. &lt;p&gt;This image represents the instance count of our real-time bidding application over
  6236.  a period of about a week: the sinusoidal red line shows instances running,
  6237.  and the horizontal green line shows the median instance count over the larger
  6238.  window of time containing this week.&lt;/p&gt;
  6239.  
  6240. &lt;p&gt;How many instances should we reserve?  Assuming future load will be similar to
  6241.  historical load, we could simply reserve the median multiplied by the expected growth
  6242.  factor; about half the time we’d have more instances than actually needed, and otherwise
  6243.  we’d not have enough and would need to purchase on-demand capacity.  This would give a
  6244.  reasonable amount of savings, although reserving the median count isn’t necessarily
  6245.  optimal (and isn’t in AdRoll’s case; we use a different strategy which is outside the
  6246.  scope of this article).&lt;/p&gt;
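&lt;p&gt;The tradeoff can be sketched numerically.  The following is a minimal illustration, not AdRoll’s reservation strategy: both prices and the synthetic sinusoidal load are hypothetical values, chosen only to show why the median is not necessarily the cheapest count to reserve.&lt;/p&gt;

```python
import math
from statistics import median

# Hypothetical hourly prices, chosen only for illustration (not AdRoll's or AWS's figures).
ON_DEMAND_PRICE = 1.00   # cost of one on-demand instance-hour
RESERVED_PRICE = 0.60    # effective cost of one reserved instance-hour (40% cheaper)


def total_cost(hourly_load, reserved):
    """Cost of serving `hourly_load` with `reserved` reservations plus on-demand overflow."""
    # Reserved instances are billed for every hour, whether or not they are used.
    reserved_cost = reserved * RESERVED_PRICE * len(hourly_load)
    # Any demand above the reserved count is bought at the on-demand price.
    overflow_hours = sum(max(0, needed - reserved) for needed in hourly_load)
    return reserved_cost + overflow_hours * ON_DEMAND_PRICE


# Synthetic week of hourly instance counts: a daily sinusoid between 600 and 1000.
load = [round(800 + 200 * math.sin(2 * math.pi * hour / 24)) for hour in range(24 * 7)]

median_count = int(median(load))
# Brute-force search over every candidate reservation count.
best_count = min(range(min(load), max(load) + 1), key=lambda n: total_cost(load, n))

print("all on-demand: ", total_cost(load, 0))
print("reserve median:", median_count, total_cost(load, median_count))
print("cheapest count:", best_count, total_cost(load, best_count))
```

&lt;p&gt;With these assumed prices, an extra reservation only pays for itself while demand exceeds it more than 60% of the time, so the cheapest count falls below the median.&lt;/p&gt;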
  6247.  
  6248. &lt;p&gt;If our future usage prediction turns out to be non-optimal, we’ll either wind up having
  6249.  spent too much on reservations, or too much on on-demand capacity.  Making reservations
  6250.  for 1-year terms amplifies the cost of any misprediction since we can’t modify them
  6251.  after purchasing (and the &lt;a href=&quot;https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ri-market-general.html&quot;&gt;reserved instance marketplace&lt;/a&gt; is not
  6252.  viable in our case).&lt;/p&gt;
  6253.  
  6254. &lt;h3 id=&quot;spot-instances&quot;&gt;Spot instances&lt;/h3&gt;
  6255.  
  6256. &lt;p&gt;&lt;a href=&quot;https://aws.amazon.com/ec2/spot/&quot;&gt;EC2 Spot Instances&lt;/a&gt; present a compelling savings opportunity.  We
  6257.  bid on market-rate capacity and in exchange often receive a substantial discount:&lt;/p&gt;
  6258.  
  6259. &lt;p&gt;&lt;img src=&quot;/images/post_images/spot-fleet/spot-market-2.png&quot; alt=&quot;EC2 spot market history (c3.4xlarge instance type in us-east-1
  6260.  region)&quot; /&gt;&lt;/p&gt;
  6261.  
  6262. &lt;p&gt;If an application can be made to use spot instances, substantial savings can ensue.  In
  6263.  the above image, in mid-2018 the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;c3.4xlarge&lt;/code&gt; type in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;us-east-1&lt;/code&gt; had a median spot
  6264.  market price of about &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;$0.29/hr&lt;/code&gt; (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;$0.26/hr&lt;/code&gt; excluding outliers) – &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;34%&lt;/code&gt; of the
  6265.  on-demand price, and about &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;57%&lt;/code&gt; of the effective reserved price of a partial up-front
  6266.  reservation!&lt;/p&gt;
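&lt;p&gt;As a rough cross-check of these percentages, the hourly prices they imply can be derived from the quoted spot price.  The on-demand and reserved prices computed below are implied by the ratios in the text; they are not quoted AWS list prices.&lt;/p&gt;

```python
# Figures quoted in the text for c3.4xlarge in us-east-1 (mid-2018).
spot_price = 0.29            # median spot price, $/hr
spot_vs_on_demand = 0.34     # spot price as a fraction of the on-demand price
spot_vs_reserved = 0.57      # spot price as a fraction of the effective reserved price

# Prices implied by those ratios (derived here, not taken from a price list).
implied_on_demand = spot_price / spot_vs_on_demand
implied_reserved = spot_price / spot_vs_reserved

# Discount of each option relative to the on-demand price.
reserved_discount = 1 - implied_reserved / implied_on_demand
spot_discount = 1 - spot_vs_on_demand

print(f"implied on-demand price: ${implied_on_demand:.2f}/hr")
print(f"implied reserved price:  ${implied_reserved:.2f}/hr")
print(f"reserved discount: {reserved_discount:.0%}, spot discount: {spot_discount:.0%}")
```

&lt;p&gt;The implied reservation discount comes out at about 40%, which agrees with the savings quoted for Reserved Instances above.&lt;/p&gt;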
  6267.  
  6268. &lt;p&gt;Unfortunately, these massive savings come with a massive caveat: a spot instance may be
  6269.  evicted at any time based on changes in market demand (i.e., when another customer has
  6270.  placed a higher bid during a period of high demand and low capacity in the same
  6271.  availability zone).&lt;/p&gt;
  6272.  
  6273. &lt;p&gt;This eviction behavior relates to two problems historically preventing us from using
  6274.  spot instances in our application: loss of precious log data and the likelihood of
  6275.  occasional market capacity exhaustion.&lt;/p&gt;
  6276.  
  6277. &lt;blockquote&gt;
  6278.  &lt;p&gt;Author’s note:&lt;/p&gt;
  6279.  
  6280.  &lt;p&gt;At the time this section was written, AWS had already changed its spot market behavior
  6281. by making evictions random instead of based on bid price during periods of decreased
  6282. instance availability.  This means a high spot market bid no longer confers a higher
  6283. chance of avoiding eviction.  Our application is unaffected by this.&lt;/p&gt;
  6284. &lt;/blockquote&gt;
  6285.  
  6286. &lt;h4 id=&quot;problem-precious-log-data&quot;&gt;Problem: precious log data&lt;/h4&gt;
  6287.  
  6288. &lt;p&gt;Our real-time bidding application produces a large volume of log data on a daily basis:
  6289.  about 40TB (compressed) of logs, market data, and data used to bill customers, all
  6290.  written to local storage and periodically uploaded to S3.&lt;/p&gt;
  6291.  
  6292. &lt;p&gt;Typical refrains here include “live with occasional data loss” or “don’t store valuable
  6293.  data on a spot instance”.  These are unacceptable strategies for our application; we
  6294.  can’t afford to routinely lose portions of this dataset (and don’t want to introduce new
  6295.  complexity, failure modes, or cost such as by adopting &lt;a href=&quot;https://aws.amazon.com/kinesis/data-firehose/&quot;&gt;Kinesis Firehose&lt;/a&gt;
  6296.  for log emission), so we can’t use spot instances without a reasonable guarantee that
  6297.  logs won’t be lost if an instance is evicted on short (or no) notice.&lt;/p&gt;
  6298.  
  6299. &lt;h4 id=&quot;problem-market-fluctuations-and-capacity-exhaustion&quot;&gt;Problem: market fluctuations and capacity exhaustion&lt;/h4&gt;
  6300.  
  6301. &lt;p&gt;A second issue arises from the nature of the spot market: it’s possible for the market
  6302.  price of an instance type to rise far beyond historical levels, exceeding the effective
  6303.  reserved price and meeting (and at least historically exceeding) the on-demand price in
  6304.  that availability zone (and possibly all zones in a region).  It’s also possible for
  6305.  capacity of an instance type to be completely exhausted in an availability zone,
  6306.  preventing an application from running if dependent on a specific zone or instance type.&lt;/p&gt;
  6307.  
  6308. &lt;p&gt;A customer in such a situation experiences all of the drawbacks of spot usage with none
  6309.  of the benefits.  We can’t use spot instances absent resistance to market price
  6310.  fluctuations and capacity exhaustion.&lt;/p&gt;
  6311.  
  6312. &lt;h2 id=&quot;a-viable-building-block-ec2-spot-fleet&quot;&gt;A viable building block: EC2 Spot Fleet&lt;/h2&gt;
  6313.  
  6314. &lt;p&gt;&lt;a href=&quot;https://aws.amazon.com/blogs/aws/amazon-ec2-spot-fleet-api-manage-thousands-of-instances-with-one-request/&quot;&gt;EC2 Spot Fleet&lt;/a&gt; appears to meet most of our requirements while
  6315.  granting access to spot market savings.  A spot fleet allows a user to deploy spot
  6316.  instances without having to manage individual spot requests while also allowing use of
  6317.  multiple different instance types at the same time!  In particular:&lt;/p&gt;
  6318.  
  6319. &lt;ol&gt;
  6320.  &lt;li&gt;
  6321.    &lt;p&gt;It supports &lt;a href=&quot;https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-fleet-automatic-scaling.html&quot;&gt;autoscaling&lt;/a&gt;.  As a result, our
  6322. application could continue to scale appropriately according to load.&lt;/p&gt;
  6323.  &lt;/li&gt;
  6324.  &lt;li&gt;
  6325.    &lt;p&gt;It supports &lt;a href=&quot;https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-fleet.html#spot-instance-weighting&quot;&gt;multiple instance types&lt;/a&gt; in the same logical
  6326. grouping (unlike autoscaling groups), with different spot bid prices for each type,
  6327. and different allocation strategies such as “lowest price” and “diversified”.  As a
  6328. result, our application could gain resistance to market fluctuations and insufficient
  6329. market capacity potentially affecting multiple instance types and availability zones.&lt;/p&gt;
  6330.  &lt;/li&gt;
  6331.  &lt;li&gt;
  6332.    &lt;p&gt;It supports both &lt;a href=&quot;https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-fleet.html#on-demand-in-spot&quot;&gt;on-demand (reserved) and spot&lt;/a&gt;
  6333. instances in the same fleet.  As a result, we have the ability to migrate our
  6334. existing deployment from autoscaling groups entirely to spot fleet, simplifying
  6335. operations.&lt;/p&gt;
  6336.  &lt;/li&gt;
  6337. &lt;/ol&gt;
  6338.  
  6339. &lt;p&gt;Most of the time a spot instance will be given a &lt;a href=&quot;https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-interruptions.html#spot-instance-termination-notices&quot;&gt;two-minute
  6340.  warning&lt;/a&gt; before eviction; while the timing and even
  6341.  delivery of that notice isn’t guaranteed, it at least gives us an opportunity to
  6342.  gracefully stop an instance about to be evicted (though possibly without enough time to
  6343.  upload all of its log data, which we address in the next section).&lt;/p&gt;
  6344.  
  6345. &lt;p&gt;Considering the above, it looks like we could deploy our application to EC2 Spot Fleet
  6346.  and treat it much the same way as we treat autoscaling groups, as long as we handle the
  6347.  case of eviction without having enough time to upload logs.&lt;/p&gt;
  6348.  
  6349. &lt;h1 id=&quot;implementation&quot;&gt;Implementation&lt;/h1&gt;
  6350.  
  6351. &lt;p&gt;The devil is in the details…&lt;/p&gt;
  6352.  
  6353. &lt;h2 id=&quot;ec2-spot-fleets-are-not-entirely-unlike-auto-scaling-groups&quot;&gt;EC2 Spot Fleets are not entirely unlike Auto Scaling Groups&lt;/h2&gt;
  6354.  
  6355. &lt;p&gt;Up to now, we’ve been using a robust and featureful product to run our application:
  6356.  &lt;a href=&quot;https://aws.amazon.com/ec2/autoscaling/&quot;&gt;Auto Scaling Groups&lt;/a&gt;.  ASGs include support for instance monitoring
  6357.  and health checking, various means of scaling according to load, instance lifecycle
  6358.  management, upgrading in-place, and discoverability &amp;amp; management via naming and tagging
  6359.  of resources.&lt;/p&gt;
  6360.  
  6361. &lt;p&gt;Unfortunately, spot fleets lack feature parity with ASGs and are missing almost all of the
  6362.  above.  Because we wanted to migrate our existing application with as few externally- or
  6363.  application-visible changes as possible, we wound up implementing our own solutions or
  6364.  finding alternatives for all of the following in order to present a compatible
  6365.  environment for running and managing our application.&lt;/p&gt;
  6366.  
  6367. &lt;h3 id=&quot;name-tags&quot;&gt;Name tags&lt;/h3&gt;
  6368.  
  6369. &lt;h4 id=&quot;problem&quot;&gt;Problem&lt;/h4&gt;
  6370.  
  6371. &lt;p&gt;Unlike ASGs, spot fleets can’t be named or tagged.  Instead, the user is presented with
  6372.  opaque UUIDs for created spot fleets and this user-unfriendly console for managing them:&lt;/p&gt;
  6373.  
  6374. &lt;p&gt;&lt;img src=&quot;/images/post_images/spot-fleet/spot-fleet-console.png&quot; alt=&quot;EC2 spot fleet console&quot; /&gt;&lt;/p&gt;
  6375.  
  6376. &lt;h4 id=&quot;solution&quot;&gt;Solution&lt;/h4&gt;
  6377.  
  6378. &lt;p&gt;This doesn’t fit into our existing workflow, which allows custom scripts (and tools like
  6379.  &lt;a href=&quot;https://github.com/aws/aws-cli&quot;&gt;awscli&lt;/a&gt;) to locate and manage resources using names and tags.  The spot
  6380.  fleet console is also effectively useless for locating resources of interest unless a
  6381.  request ID is already known.&lt;/p&gt;
  6382.  
  6383. &lt;p&gt;We resolved this by introducing a small DynamoDB table mapping resource names to
  6384.  attributes (which we would otherwise have set as tags) and spot fleet request ids, and
  6385.  updating our tools and workflow to reference the table.&lt;/p&gt;
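&lt;p&gt;A minimal sketch of such a lookup follows.  The table schema, attribute names, and request ID are hypothetical, and an in-memory stand-in replaces the real table so the example runs without AWS access; it only mimics the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;get_item(Key=...)&lt;/code&gt; call shape of a boto3 DynamoDB table.&lt;/p&gt;

```python
class FakeTable:
    """In-memory stand-in with the same get_item(Key=...) call shape as a boto3 Table."""

    def __init__(self, items):
        self._items = {item["name"]: item for item in items}

    def get_item(self, Key):
        item = self._items.get(Key["name"])
        # Like DynamoDB, omit "Item" from the response when the key is absent.
        return {"Item": item} if item is not None else {}


def fleet_request_id(table, name):
    """Resolve a human-friendly fleet name to its spot fleet request ID, or None."""
    response = table.get_item(Key={"name": name})
    item = response.get("Item")
    return item["spot_fleet_request_id"] if item else None


# Hypothetical row: attribute names, values, and the request ID are illustrative only.
table = FakeTable([
    {"name": "bidder-us-east-1", "spot_fleet_request_id": "sfr-example-1", "env": "prod"},
])

print(fleet_request_id(table, "bidder-us-east-1"))
```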
  6386.  
  6387. &lt;h3 id=&quot;lifecycle-hooks&quot;&gt;Lifecycle hooks&lt;/h3&gt;
  6388.  
  6389. &lt;h4 id=&quot;problem-1&quot;&gt;Problem&lt;/h4&gt;
  6390.  
  6391. &lt;p&gt;Unlike ASGs, spot fleets don’t support &lt;a href=&quot;https://docs.aws.amazon.com/autoscaling/ec2/userguide/lifecycle-hooks.html&quot;&gt;lifecycle hooks&lt;/a&gt;.  This
  6392.  feature allows instances to be kept in pending states during startup and before
  6393.  termination, allowing for example external resources required for operation to be
  6394.  created before an application starts and to be cleaned up after it exits, and for work
  6395.  currently in progress to be finished before termination (e.g., log uploads or active
  6396.  data processing):&lt;/p&gt;
  6397.  
  6398. &lt;p&gt;&lt;img src=&quot;/images/post_images/spot-fleet/lifecycle_hooks.png&quot; alt=&quot;EC2 lifecycle hooks flow chart&quot; /&gt;&lt;/p&gt;
  6399.  
  6400. &lt;h4 id=&quot;solution-1&quot;&gt;Solution&lt;/h4&gt;
  6401.  
  6402. &lt;p&gt;This is a feature which could perhaps be approximated for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Pending:*&lt;/code&gt; states by an
  6403.  agent running on each host and coordinating with an external resource management
  6404.  component.  There is no substitute for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Terminating:*&lt;/code&gt; states (e.g., allowing an
  6405.  instance to upload its log data before proceeding with termination).&lt;/p&gt;
  6406.  
  6407. &lt;p&gt;We resorted to simply abandoning our (limited) use of lifecycle hooks and moving the
  6408.  associated responsibilities into the application’s instance setup process (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Pending:*&lt;/code&gt;
  6409.  states) and introducing a log recovery system as described later (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Terminating:*&lt;/code&gt;
  6410.  states).&lt;/p&gt;
  6411.  
  6412. &lt;h3 id=&quot;health-checks&quot;&gt;Health checks&lt;/h3&gt;
  6413.  
  6414. &lt;h4 id=&quot;problem-2&quot;&gt;Problem&lt;/h4&gt;
  6415.  
  6416. &lt;p&gt;An instance in an autoscaling group can be either in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;healthy&lt;/code&gt; or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unhealthy&lt;/code&gt; state.
  6417.  An &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unhealthy&lt;/code&gt; instance is terminated by its ASG and replaced, and an end user may set
  6418.  this state (e.g., from a user-operated health checker).  ELBs also &lt;a href=&quot;https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-add-elb-healthcheck.html&quot;&gt;nicely
  6419.  integrate&lt;/a&gt; with this feature: if an application exposes a
  6420.  health endpoint (such as an HTTP URL), an attached ELB will set the instance as
  6421.  &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unhealthy&lt;/code&gt; if that endpoint fails to return a successful response.&lt;/p&gt;
  6422.  
  6423. &lt;p&gt;Unfortunately, spot fleet offers no user-controllable health status; an instance will
  6424.  only be replaced if it fails the basic EC2 health check which has no knowledge of an
  6425.  application’s health.  As ELBs don’t control the basic health check result, the ELB
  6426.  health check feature does not work with spot fleet instances.&lt;/p&gt;
  6427.  
  6428. &lt;h4 id=&quot;solution-2&quot;&gt;Solution&lt;/h4&gt;
  6429.  
  6430. &lt;p&gt;We implemented our own health checking component using the versatile &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;awscli&lt;/code&gt; tool: we
  6431.  run a script in a loop on a utility instance which obtains the health check
  6432.  configuration from each ELB (which would affect ASG instances but not fleet instances)
  6433.  and applies it to each attached spot fleet instance (obtained from the
  6434.  naming/tagging table); fleet instances older than the grace period which fail the number
  6435.  of checks configured in the ELB’s health check are simply terminated (as there is no
  6436.  user-controllable health state).&lt;/p&gt;
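&lt;p&gt;The termination decision can be sketched as a pure function.  This is an illustration under assumptions rather than our script (which drives &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;awscli&lt;/code&gt;): the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;FleetInstance&lt;/code&gt; record and its fields are hypothetical, and the threshold stands for the unhealthy threshold read from the ELB’s health check.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class FleetInstance:
    """Hypothetical per-instance bookkeeping kept by the health checker."""
    instance_id: str
    age_seconds: float          # time since the instance launched
    consecutive_failures: int   # failed health checks in a row


def instances_to_terminate(instances, unhealthy_threshold, grace_period_seconds):
    """Return IDs of instances past the grace period that failed enough checks in a row."""
    return [
        inst.instance_id
        for inst in instances
        if inst.age_seconds >= grace_period_seconds
        and inst.consecutive_failures >= unhealthy_threshold
    ]


# Made-up instances: one healthy, one failing but still starting up, one failing.
fleet = [
    FleetInstance("i-healthy", age_seconds=3600, consecutive_failures=0),
    FleetInstance("i-starting", age_seconds=60, consecutive_failures=5),
    FleetInstance("i-broken", age_seconds=3600, consecutive_failures=3),
]

print(instances_to_terminate(fleet, unhealthy_threshold=3, grace_period_seconds=300))
```

&lt;p&gt;Only the instance that is both older than the grace period and over the failure threshold is selected; the young instance is left alone even though its checks are failing.&lt;/p&gt;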
  6437.  
  6438. &lt;h3 id=&quot;no-in-place-upgrading&quot;&gt;No in-place upgrading&lt;/h3&gt;
  6439.  
  6440. &lt;h4 id=&quot;problem-3&quot;&gt;Problem&lt;/h4&gt;
  6441.  
  6442. &lt;p&gt;Spot fleets use &lt;a href=&quot;https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-launch-templates.html&quot;&gt;launch templates&lt;/a&gt;, a generalization of the
  6443.  &lt;a href=&quot;https://docs.aws.amazon.com/autoscaling/ec2/userguide/LaunchConfiguration.html&quot;&gt;launch configuration&lt;/a&gt; concept used in ASGs.  An ASG includes a
  6444.  reference to its current launch configuration, which can be changed by the user without
  6445.  recreating the ASG.  After changing the ASG’s active launch configuration, instances
  6446.  newly launched in the ASG adopt that new launch configuration; this allows for example
  6447.  the userdata of instances in an ASG to be updated, instance types to be changed, block
  6448.  device mappings to be updated, and so on.&lt;/p&gt;
  6449.  
  6450. &lt;p&gt;Unfortunately, spot fleet has no equivalent behavior: a launch template is specified
  6451.  once at spot fleet request time and can’t later be changed; nor can the launch template
  6452.  version be changed for an existing spot fleet request (launch templates introduced a
  6453.  “version” concept: a launch template may have multiple versions and a “default” version,
  6454.  but for a spot fleet the version must be fixed in the creation request).&lt;/p&gt;
  6455.  
  6456. &lt;h4 id=&quot;solution-3&quot;&gt;Solution&lt;/h4&gt;
  6457.  
  6458. &lt;p&gt;None; we must recreate our spot fleets whenever changing launch template parameters such
  6459.  as userdata, bid prices, or instance type weights and overrides.&lt;/p&gt;
  6460.  
  6461. &lt;h3 id=&quot;unexpected-scaling-behavior-stuck-fleets&quot;&gt;Unexpected scaling behavior: stuck fleets&lt;/h3&gt;
  6462.  
  6463. &lt;h4 id=&quot;problem-4&quot;&gt;Problem&lt;/h4&gt;
  6464.  
  6465. &lt;p&gt;We use &lt;a href=&quot;https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-scaling-simple-step.html&quot;&gt;step scaling&lt;/a&gt; to scale our application according to load.
  6466.  This works well; each of our application instances can report its load status using a
  6467.  simple mechanism, and a scaling action can be taken according to the severity of the
  6468.  aggregate value.&lt;/p&gt;
  6469.  
  6470. &lt;p&gt;In this example image, a deployment could consist of four nodes: two reporting
  6471.  “overload” and two reporting “stable”.  In aggregate, we can say that our application is
  6472.  50% overloaded and take an appropriate action (say, increasing our capacity by 25%):&lt;/p&gt;
  6473.  
  6474. &lt;p&gt;&lt;img src=&quot;/images/post_images/spot-fleet/overload.png&quot; alt=&quot;load reporting mechanism&quot; /&gt;&lt;/p&gt;
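&lt;p&gt;A sketch of this aggregation follows.  The numeric encoding of each status and the step boundaries are hypothetical, chosen so that the four-node example above yields a 25% increase; they are not our production policy.&lt;/p&gt;

```python
# Hypothetical encoding of each node's self-reported status.
STATUS_VALUE = {"overload": 1.0, "stable": 0.0, "underload": -1.0}

# Hypothetical step policy: (lower bound of aggregate load, capacity change in percent),
# listed from most to least severe so the first matching step wins.
STEPS = [(0.75, 50), (0.25, 25), (-0.25, 0), (-1.0, -10)]


def aggregate_load(statuses):
    """Average the per-node status values into a single load figure in [-1, 1]."""
    return sum(STATUS_VALUE[status] for status in statuses) / len(statuses)


def capacity_change_percent(load):
    """Pick the capacity adjustment for the first step whose lower bound is met."""
    for lower_bound, change in STEPS:
        if load >= lower_bound:
            return change
    return 0


nodes = ["overload", "overload", "stable", "stable"]
load = aggregate_load(nodes)
print(load, capacity_change_percent(load))
```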
  6475.  
  6476. &lt;p&gt;Unfortunately, this can interact poorly with how spot fleet scaling works: there are
  6477.  circumstances in which a user fleet can indicate scaling is required, &lt;a href=&quot;https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-fleet-automatic-scaling.html&quot;&gt;but for which the
  6478.  spot fleet scaling service will take no action&lt;/a&gt; (visible in
  6479.  rounding behavior and scaling against fulfilled capacity instead of target capacity).&lt;/p&gt;
  6480.  
  6481. &lt;p&gt;This is bad for our application: if we know a scaling action is required, we must always
  6482.  then execute such an action (or risk being unable to serve traffic and thus not earning
  6483.  money, or spending too much money due to having an excessive number of hosts).&lt;/p&gt;
  6484.  
  6485. &lt;h4 id=&quot;solution-4&quot;&gt;Solution&lt;/h4&gt;
  6486.  
  6487. &lt;p&gt;We opted for a solution not requiring code or instrumentation changes in our
  6488.  application: a “fleet kicker”.&lt;/p&gt;
  6489.  
  6490. &lt;p&gt;This component (a shell script executing &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;awscli&lt;/code&gt; commands) runs in a loop on a utility
  6491.  instance and monitors the cloudwatch alarm state of the scaling policies associated with
  6492.  each of our fleets.  If any alarm remains in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ALARM&lt;/code&gt; state for too long, the
  6493.  component adjusts the target capacity according to the associated scaling policy and
  6494.  scalable target definition.&lt;/p&gt;
  6495.  
  6496. &lt;p&gt;&lt;img src=&quot;/images/post_images/spot-fleet/fleet-kicker.png&quot; alt=&quot;stuck fleet kicker effect&quot; /&gt;&lt;/p&gt;
  6497.  
  6498. &lt;p&gt;This image shows the cloudwatch metric value (multiplied by 100 to be compatible with
  6499.  the spot fleet UI in the AWS console) for one of our application’s fleets (negative
  6500.  values represent “underload”, or having too many instances relative to traffic volume).
  6501.  The first arrow indicates the beginning of a period in which our fleet should have
  6502.  scaled down, but did not.  The second arrow indicates the point at which our “fleet
  6503.  kicker” detected the stuck fleet and activated, forcing a decrement to the fleet’s
  6504.  target capacity.&lt;/p&gt;
  6505.  
  6506. &lt;p&gt;With this component, our fleets avoid becoming stuck and can scale up and down as we’d
  6507.  expect them to, and we can treat them conceptually as if they were ASGs.&lt;/p&gt;
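&lt;p&gt;The core decision made by the fleet kicker can be sketched as follows.  This is a simplified illustration rather than our shell script: the function name, the 15-minute stuck threshold, and the fixed adjustment are assumptions, and the capacity bounds stand for the limits of the fleet’s scalable target.&lt;/p&gt;

```python
def kick_if_stuck(alarm_state, seconds_in_state, target_capacity, adjustment,
                  min_capacity, max_capacity, stuck_after_seconds=900):
    """Return the target capacity to set; unchanged if the fleet is not stuck.

    A fleet counts as stuck when its scaling alarm has stayed in the ALARM
    state for at least `stuck_after_seconds` (a hypothetical threshold).
    """
    if alarm_state != "ALARM" or seconds_in_state < stuck_after_seconds:
        return target_capacity
    # Apply the policy's adjustment, clamped to the scalable target's bounds.
    return max(min_capacity, min(max_capacity, target_capacity + adjustment))


# A scale-down alarm that has been firing for 20 minutes forces a decrement.
print(kick_if_stuck("ALARM", 1200, target_capacity=100, adjustment=-10,
                    min_capacity=50, max_capacity=200))
```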
  6508.  
  6509. &lt;p&gt;[Note: AWS has indicated that a possible alternative is switching to &lt;a href=&quot;https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-fleet-target-tracking.html&quot;&gt;target
  6510.  tracking&lt;/a&gt; scaling policies, which could be useful if compatible
  6511.  with an application.  We have not yet validated use of target tracking policies with our
  6512.  application.]&lt;/p&gt;
  6513.  
  6514. &lt;h2 id=&quot;spot-savior-a-log-data-recovery-system&quot;&gt;Spot savior: a log data recovery system&lt;/h2&gt;
  6515.  
  6516. &lt;p&gt;We have so far described our solutions to problems arising from attempting to treat spot
  6517.  fleets conceptually the same way as autoscaling groups.  Our application also now makes
  6518.  use of many instance types in order to tolerate market fluctuations and makes
  6519.  appropriate bids for each type thanks to &lt;a href=&quot;https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-fleet.html#spot-instance-weighting&quot;&gt;spot fleet’s override and weighting
  6520.  mechanic&lt;/a&gt;.  There’s one critical piece left: what to do
  6521.  about inevitable spot market evictions.&lt;/p&gt;
  6522.  
  6523. &lt;h3 id=&quot;much-ado-about-eviction&quot;&gt;Much ado about eviction&lt;/h3&gt;
  6524.  
  6525. &lt;p&gt;Our application stages log data to a volume attached to each instance.  We periodically
  6526.  convey this data to S3 during normal operation, and until now also at instance
  6527.  termination time (previously relying on lifecycle hooks for ASG instances, which have no
  6528.  spot fleet analog):&lt;/p&gt;
  6529.  
  6530. &lt;p&gt;&lt;img src=&quot;/images/post_images/spot-fleet/app-instance-store.png&quot; alt=&quot;instance using ephemeral instance store volumes&quot; /&gt;&lt;/p&gt;
  6531.  
  6532. &lt;p&gt;To make a long story short, we can’t guarantee that the two-minute eviction warning will
  6533.  be sufficient for our application when running on a spot instance:&lt;/p&gt;
  6534.  
  6535. &lt;ul&gt;
  6536.  &lt;li&gt;the warning might not actually be delivered with two minutes to spare, or at all;&lt;/li&gt;
  6537.  &lt;li&gt;the warning could be misinterpreted due to a bug;&lt;/li&gt;
  6538.  &lt;li&gt;the instance might have staged a disproportionate amount of log data which can’t all
  6539. be uploaded to S3 in two minutes;&lt;/li&gt;
  6540.  &lt;li&gt;adverse network or service conditions may exist, preventing timely or successful
  6541. uploading to S3;&lt;/li&gt;
  6542.  &lt;li&gt;the application might fail to gracefully stop in a timely manner, leaving insufficient
  6543. time for log uploading;&lt;/li&gt;
  6544. &lt;/ul&gt;
  6545.  
  6546. &lt;p&gt;and so on.&lt;/p&gt;
  6547.  
  6548. &lt;p&gt;Having staged our log data, the only guarantee we can make is: our writes were
  6549.  acknowledged by a filesystem on some volume.  What to do?&lt;/p&gt;
  6550.  
  6551. &lt;h3 id=&quot;ebs-to-the-rescue&quot;&gt;EBS to the rescue!&lt;/h3&gt;
  6552.  
  6553. &lt;p&gt;It turns out that &lt;a href=&quot;https://aws.amazon.com/ebs/&quot;&gt;Amazon EBS&lt;/a&gt;, a foundational network block storage
  6554.  product, allows block storage volumes to survive beyond the termination of the host to
  6555.  which they are attached.&lt;/p&gt;
  6556.  
  6557. &lt;p&gt;During normal operation, we can treat an EBS volume largely the same way as an ephemeral
  6558.  instance store volume and stage our (append-only) logs, periodically uploading as usual:&lt;/p&gt;
  6559.  
  6560. &lt;p&gt;&lt;img src=&quot;/images/post_images/spot-fleet/app-ebs.png&quot; alt=&quot;instance using EBS volumes&quot; /&gt;&lt;/p&gt;
  6561.  
  6562. &lt;p&gt;But we can also &lt;a href=&quot;https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/terminating-instances.html#preserving-volumes-on-termination&quot;&gt;configure the volume to survive
  6563.  independently&lt;/a&gt; as an orphaned volume if its instance is
  6564.  terminated:&lt;/p&gt;
  6565.  
  6566. &lt;p&gt;&lt;img src=&quot;/images/post_images/spot-fleet/app-orphaned-ebs.png&quot; alt=&quot;orphaned EBS volume&quot; /&gt;&lt;/p&gt;
  6567.  
  6568. &lt;p&gt;This gives us an opportunity to recover logs from evicted spot instances without having
  6569.  to worry about whether the two-minute warning was enough (or having to introduce a new
  6570.  application logging mechanism).  We created a generic log recovery pipeline external to
  6571.  our application to do just that.&lt;/p&gt;
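&lt;p&gt;As a sketch of the two pieces involved: a block device mapping that keeps the volume after termination, and a filter that picks out detached volumes.  The device name, size, volume type, and volume IDs are hypothetical; the field names follow the shape of EC2 block device mappings and of the volume list returned when describing volumes.&lt;/p&gt;

```python
# Hypothetical block device mapping in the shape used by EC2 launch templates.
# The device name, size, and volume type are illustrative assumptions.
LOG_VOLUME_MAPPING = {
    "DeviceName": "/dev/sdf",
    "Ebs": {
        "VolumeSize": 100,             # GiB
        "VolumeType": "gp2",
        "DeleteOnTermination": False,  # keep the volume when the instance goes away
    },
}


def orphaned_volume_ids(volumes):
    """Pick volumes that are no longer attached to any instance.

    `volumes` is a list shaped like the "Volumes" field of an EC2
    DescribeVolumes response; detached volumes report the state "available".
    """
    return [volume["VolumeId"] for volume in volumes if volume["State"] == "available"]


# Example response fragment with made-up volume IDs.
volumes = [
    {"VolumeId": "vol-aaaa", "State": "in-use"},
    {"VolumeId": "vol-bbbb", "State": "available"},
]
print(orphaned_volume_ids(volumes))
```

&lt;p&gt;A real recovery worker would likely also filter on something identifying the application’s volumes, so that unrelated detached volumes in the account are ignored.&lt;/p&gt;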
  6572.  
  6573. &lt;p&gt;To recover logs from a volume, we need to execute a sequence of steps, each with a set
  6574.  of pre- and post-conditions, an implementation, and a position relative to other steps.
  6575.  For example: in order to copy a log file from a volume, we need to have first mounted
  6576.  the volume; to have mounted a volume, we need it (or a copy of it) to exist in the same
  6577.  availability zone as the current instance; to make a copy of a volume, we need to have
  6578.  created a snapshot of it (possibly using one which already exists); and so on.&lt;/p&gt;
  6579.  
  6580. &lt;p&gt;There’s a ready-made tool for clearly expressing solutions to this kind of problem:
  6581.  &lt;a href=&quot;https://luigi.readthedocs.io/&quot;&gt;Luigi&lt;/a&gt;.&lt;/p&gt;
  6582.  
  6583. &lt;h3 id=&quot;luigi-for-connecting-a-series-of-tubes&quot;&gt;Luigi: for connecting a series of tubes…&lt;/h3&gt;
  6584.  
  6585. &lt;p&gt;&lt;img src=&quot;/images/post_images/spot-fleet/luigi-logo.png&quot; alt=&quot;luigi logo&quot; /&gt;&lt;/p&gt;
  6586.  
  6587. &lt;p&gt;Luigi has been featured several times on AdRoll’s tech
  6588.  blog&lt;a href=&quot;/blog/data/2015/09/22/data-pipelines-docker.html&quot;&gt;[1]&lt;/a&gt;&lt;a href=&quot;/blog/data/2015/10/15/luigi.html&quot;&gt;[2]&lt;/a&gt;&lt;a href=&quot;/blog/data/2018/08/08/running-jobs-with-aws-batch.html&quot;&gt;[3]&lt;/a&gt;: it’s a useful tool for
  6589.  orchestrating jobs consisting of large, dynamic, and explicit hierarchies of
  6590.  parameterized tasks which can depend on other tasks, any of which may fail (making any
  6591.  dependent tasks also fail).  Although our log recovery scenario is much smaller than a
  6592.  typical Luigi job, it is a good fit:&lt;/p&gt;
  6593.  
  6594. &lt;ul&gt;
  6595.  &lt;li&gt;recovering logs from an EBS volume is a batch process and insensitive to small delays
  6596. and retries occurring over a period of several minutes;&lt;/li&gt;
  6597.  &lt;li&gt;we need to recover an unknown number of logs from each volume;&lt;/li&gt;
  6598.  &lt;li&gt;at most a single worker should be attempting recovery of a volume at any time;&lt;/li&gt;
  6599.  &lt;li&gt;jobs are repeatable and may be incremental: each log file of interest on a volume has
  6600. a 1-1 correspondence with an object on S3 (thus, we could dynamically create a Luigi
  6601. task per log file and retry a failing overall job repeatedly until success);&lt;/li&gt;
  6602.  &lt;li&gt;each recovery job has many steps, any of which may fail due to external reasons (e.g.,
  6603. service outages, API throttling/failures in excess of retries, unexpected worker host
  6604. termination, etc).&lt;/li&gt;
  6605. &lt;/ul&gt;
  6606.  
  6607. &lt;p&gt;[Aside: &lt;a href=&quot;https://aws.amazon.com/step-functions/&quot;&gt;AWS Step Functions&lt;/a&gt; could also be a good fit for this
  6608.  scenario if a step function Lambda task could mount and access the contents of an EBS
  6609.  volume.  Using a non-Lambda Activity worker might work, but we decided the overhead of
  6610.  managing a separate group of workers integrated with Step Functions would outweigh any
  6611.  benefit versus simply using Luigi.]&lt;/p&gt;
  6612.  
  6613. &lt;h4 id=&quot;example-luigi-volume-recovery&quot;&gt;Example: Luigi volume recovery&lt;/h4&gt;
  6614.  
  6615. &lt;p&gt;Here’s a simplified version of our Luigi-based recovery pipeline implementation:&lt;/p&gt;
  6616.  
  6617. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;RecoverLogs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;luigi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;WrapperTask&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  6618.  &lt;span class=&quot;n&quot;&gt;volume&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;luigi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Parameter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  6619.  &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
  6620.  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;requires&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  6621.    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;UploadLogs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;volume&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;volume&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;...),&lt;/span&gt;
  6622.            &lt;span class=&quot;n&quot;&gt;Cleanup&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;volume&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;volume&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;...)]&lt;/span&gt;
  6623.  
  6624. &lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;UploadLogs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;luigi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Task&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  6625.  &lt;span class=&quot;n&quot;&gt;volume&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;luigi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Parameter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  6626.  &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
  6627.  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;output&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  6628.    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;luigi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;contrib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;S3Target&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  6629.      &lt;span class=&quot;s&quot;&gt;&apos;s3://bucket/%(volume)s/_SUCCESS&apos;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;dict&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;volume&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;volume&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  6630.  
  6631.  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;requires&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  6632.    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;mount&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;MountVolume&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(...)}&lt;/span&gt;
  6633.  
  6634.  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;run&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  6635.    &lt;span class=&quot;n&quot;&gt;results&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;yield&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;RecoverOneFile&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  6636.                     &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;path&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;find_logs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  6637.                       &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;input&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;mount&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())}&lt;/span&gt;
  6638.    &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;output&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;write&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  6639.      &lt;span class=&quot;n&quot;&gt;json&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dumps&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  6640.                  &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;results&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;items&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()}))&lt;/span&gt;
  6641.  
  6642. &lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;MountVolume&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;luigi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Task&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  6643.  &lt;span class=&quot;n&quot;&gt;volume&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;luigi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Parameter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  6644.  &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
  6645.  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;requires&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  6646.    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;attach&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;AttachAZLocalVolume&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(...)}&lt;/span&gt;
  6647.  
  6648.  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;run&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  6649.    &lt;span class=&quot;n&quot;&gt;device&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;input&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;attach&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  6650.    &lt;span class=&quot;n&quot;&gt;mountpoint&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;calculate_mountpoint&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;volume&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;device&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  6651.    &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;run_command&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;mount&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;device&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mountpoint&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
  6652.    &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;output&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;write&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mountpoint&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  6653.  
  6654.  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;complete&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  6655.    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is_mounted&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;volume&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;output&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
  6656.  
  6657.  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;on_failure&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;exc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  6658.    &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;detach_and_unmount&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  6659.    &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
  6660.  
  6661. &lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Cleanup&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;luigi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Task&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  6662.  &lt;span class=&quot;n&quot;&gt;volume&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;luigi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Parameter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  6663.  &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
  6664.  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;requires&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  6665.    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;UploadLogs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(...),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;UnmountDetachVolume&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(...)]&lt;/span&gt;
  6666.  
  6667.  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;run&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  6668.    &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
  6669.    &lt;span class=&quot;k&quot;&gt;yield&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DeleteVolume&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(...)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  6670.  
  6671. &lt;p&gt;In Luigi, a job is run by starting with the goal: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;luigi.build([RecoverLogs(volume_id,
  6672.  ...)])&lt;/code&gt;.  Luigi builds a graph of task dependencies and runs each runnable task which
  6673.  hasn’t yet completed (each of which can dynamically introduce additional subtasks),
  6674.  starting from the leaves (tasks with no unmet requirements).&lt;/p&gt;
  6675.  
  6676. &lt;p&gt;The end result is well-modularized and easily testable code, much like what we’d
  6677.  eventually arrive at after improving a non-Luigi version (but we just skipped to the end
  6678.  by using Luigi from the start).  Here’s what a successful recovery of logs from an EBS
  6679.  volume looks like in our system:&lt;/p&gt;
  6680.  
  6681. &lt;p&gt;&lt;img src=&quot;/images/post_images/spot-fleet/spot-savior-recovery.png&quot; alt=&quot;luigi volume recovery pipeline result&quot; /&gt;&lt;/p&gt;
  6682.  
  6683. &lt;p&gt;Green vertices denote successful tasks; edges express dependencies.  Any failing task
  6684.  results in the whole job failing and being retried, but only the subtasks which haven’t
  6685.  yet succeeded are re-run.&lt;/p&gt;
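&lt;p&gt;The execution model described above (requirements run first, and a retry re-runs only what has not yet succeeded) can be illustrated with a toy scheduler. This is a minimal sketch in plain Python, not Luigi’s implementation: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;REQUIRES&lt;/code&gt; table, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;build&lt;/code&gt; helper, and the assumption that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;UnmountDetachVolume&lt;/code&gt; has no requirements of its own are all hypothetical, and dynamic subtasks such as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DeleteVolume&lt;/code&gt; are left out.&lt;/p&gt;

```python
# Toy model of dependency-first execution. This is NOT Luigi's scheduler.
# Task names mirror the simplified pipeline; the graph itself is an assumption.
REQUIRES = {
    "RecoverLogs": ["UploadLogs", "Cleanup"],
    "UploadLogs": ["MountVolume"],
    "MountVolume": ["AttachAZLocalVolume"],
    "AttachAZLocalVolume": [],
    "Cleanup": ["UploadLogs", "UnmountDetachVolume"],
    "UnmountDetachVolume": [],  # assumed leaf; its requirements are elided above
}


def build(goal, completed, failing=frozenset()):
    """Run `goal` after its requirements, skipping tasks already completed.

    `completed` is a set that is updated in place, so it survives a failed
    attempt. Returns the tasks executed in this attempt, in order, and raises
    RuntimeError at the first task listed in `failing`.
    """
    executed = []

    def visit(task):
        if task in completed:
            return  # succeeded in this or an earlier attempt: do not re-run
        for dependency in REQUIRES[task]:
            visit(dependency)  # requirements always run before the task
        if task in failing:
            raise RuntimeError("task failed: " + task)
        completed.add(task)
        executed.append(task)

    visit(goal)
    return executed
```

&lt;p&gt;In this sketch, if a first call to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;build(&quot;RecoverLogs&quot;, completed, failing={&quot;UnmountDetachVolume&quot;})&lt;/code&gt; raises, a second call with the same &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;completed&lt;/code&gt; set executes only the unmount, cleanup, and wrapper tasks; the attach, mount, and upload steps are not repeated.&lt;/p&gt;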
  6686.  
  6687. &lt;h4 id=&quot;putting-it-all-together&quot;&gt;Putting it all together&lt;/h4&gt;
  6688.  
  6689. &lt;p&gt;In practice Luigi has been a good fit for this scenario.  Here’s how we put it all
  6690.  together:&lt;/p&gt;
  6691.  
  6692. &lt;ol&gt;
  6693.  &lt;li&gt;First, when running on a spot instance our application records log volume information
  6694.  in a new DynamoDB table and configures its log volume(s) as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;delete-on-terminate=False&lt;/code&gt;,
  6695.  allowing them to become orphans following instance termination (such as due to a spot
  6696.  eviction or spot fleet scale-down event):
  6697.  &lt;img src=&quot;/images/post_images/spot-fleet/recovery-pipeline-1.png&quot; alt=&quot;recovery pipeline: use of log volumes table&quot; /&gt;
  6698.  This allows us to keep track of volumes which could actually have received any
  6699.  interesting application logs.&lt;/li&gt;
  6700.  &lt;li&gt;Second, a recovery component executing on a utility instance periodically scans this
  6701.  volume information table:
  6702.  &lt;img src=&quot;/images/post_images/spot-fleet/recovery-pipeline-2.png&quot; alt=&quot;recovery pipeline: dispatcher activity&quot; /&gt;
  6703.  Any volume in the table in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;available&lt;/code&gt; state is eligible for recovery, and a
  6704.  recovery message is dispatched to a regional queue.&lt;/li&gt;
  6705.  &lt;li&gt;Finally, an EC2 recovery worker (running in an ASG which scales according to the size
  6706.  of the queue) polls its regional queue for recovery messages:
  6707.  &lt;img src=&quot;/images/post_images/spot-fleet/recovery-pipeline-3.png&quot; alt=&quot;recovery pipeline: luigi pipeline activation&quot; /&gt;
  6708.  Having obtained a message, it executes the Luigi recovery pipeline for that volume,
  6709.  recovering the volume’s logs and deleting the volume only upon success.
  6710.  Naturally these workers also use spot instances!&lt;/li&gt;
  6711. &lt;/ol&gt;
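&lt;p&gt;The dispatcher step in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions: the record fields (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;volume_id&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;region&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;state&lt;/code&gt;) and the function name are hypothetical, and reading the DynamoDB table and sending to the regional queues are deliberately left out.&lt;/p&gt;

```python
# Hypothetical sketch of the dispatcher's selection logic. Field names and the
# message shape are assumptions; real code would scan the volume table and
# send each message to the matching regional queue.
from collections import defaultdict


def plan_recovery_messages(volume_records):
    """Group one recovery message per 'available' volume by region."""
    messages_by_region = defaultdict(list)
    for record in volume_records:
        if record["state"] != "available":
            continue  # only volumes in the 'available' state are eligible
        messages_by_region[record["region"]].append(
            {"volume_id": record["volume_id"]}
        )
    return dict(messages_by_region)
```

&lt;p&gt;Keeping the selection logic a pure function of the table contents, as in this sketch, makes it easy to test separately from the AWS calls around it.&lt;/p&gt;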
  6712.  
  6713. &lt;p&gt;With a real workload, this setup can recover logs from about 650 EBS volumes per hour
  6714.  (in each region), largely limited by EC2 API throttling.  Our implementation always
  6715.  makes a copy of each volume before recovery to allow filesystem journals to be replayed
  6716.  without modifying the original volume.&lt;/p&gt;
  6717.  
  6718. &lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
  6719.  
  6720. &lt;p&gt;We modified our real-time bidding application to run on over twenty different EC2
  6721.  instance types, and we’re currently using more than ten of those types in the spot
  6722.  fleets we created to replace our original autoscaling groups, giving a substantial
  6723.  amount of resistance to spot market fluctuations.&lt;/p&gt;
  6724.  
  6725. &lt;p&gt;To solve the problem of precious log data being lost on terminated spot instances, we
  6726.  implemented a Luigi-based log recovery pipeline to recover those logs from the EBS
  6727.  volumes now used by our application and surviving instance termination.&lt;/p&gt;
  6728.  
  6729. &lt;p&gt;As a result we’ve been able to realize substantial (~60%) savings; the effective cost of
  6730.  a compute unit is lower in the spot market relative to even the effective reserved
  6731.  instance price, and we no longer need to make any up-front reservations!&lt;/p&gt;
  6732.  
  6733. &lt;p&gt;Uncovering and overcoming the lack of feature parity between autoscaling groups and spot
  6734.  fleets was time-consuming, but overall the benefits far outweigh the cost (and the
  6735.  savings compound over time).&lt;/p&gt;
  6736.  
  6737. &lt;hr /&gt;
  6738.  
  6739. </description>
  6740.    </item>
  6741.    
  6742.    
  6743.    
  6744.    <item>
  6745.      <title>
  6746. Removing Erlang dead code with Xref
  6747. </title>
  6748.      <link>https://tech.nextroll.com/blog/dev/2018/10/09/remove-erlang-dead-code-xref.html</link>
  6749.      <pubDate>Tue, 09 Oct 2018 00:00:00 -0700</pubDate>
  6750.      <author></author>
  6751.      <guid isPermaLink="false">https://tech.nextroll.com/blog/dev/2018/10/09/remove-erlang-dead-code-xref</guid>
  6752.      <description>&lt;p&gt;Dead code (as in functions that are not used anywhere) tends to pile up in big projects if you leave them unattended. Using one of Xref’s most underrated features, you will be able to detect and remove what you do not need anymore.&lt;/p&gt;
  6753.  
  6754. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;10-15 minute read&lt;/code&gt;&lt;/p&gt;
  6755.  
  6756. &lt;hr /&gt;
  6757.  
  6758. &lt;p&gt;We have already written several articles in this blog explaining how we use &lt;a href=&quot;http://www.erlang.org/&quot;&gt;Erlang/OTP&lt;/a&gt; extensively to build our &lt;a href=&quot;/blog/web/2014/04/29/valentino-presents-adrolls-rtb-infrastructure.html&quot;&gt;real-time bidding platform&lt;/a&gt; servers, among other things.&lt;/p&gt;
  6759.  
  6760. &lt;p&gt;These systems are big and they have been around for a long time by now. Just like any big old system, they contain some pieces of code that are not used anymore. To be clear: they are not broken, they are even properly covered by tests and all, but they’re not in use in production.&lt;/p&gt;
  6761.  
  6762. &lt;p&gt;In Erlang these pieces of &lt;em&gt;dead code&lt;/em&gt; manifest themselves as &lt;em&gt;unused functions&lt;/em&gt;. To be precise: they are &lt;em&gt;unused exports&lt;/em&gt;, since unused functions that are not exported are already detected at compile time.&lt;/p&gt;
  6763.  
  6764. &lt;p&gt;Finding unused exports in a big system can be tough. Luckily, Erlang/OTP already gives us a tool to do just that: &lt;a href=&quot;http://erlang.org/doc/man/xref.html&quot;&gt;&lt;em&gt;Xref&lt;/em&gt;&lt;/a&gt;.&lt;/p&gt;
  6765.  
  6766. &lt;blockquote&gt;
  6767.  &lt;p&gt;Xref is a cross reference tool that can be used for finding dependencies between functions, modules, applications and releases.&lt;/p&gt;
  6768. &lt;/blockquote&gt;
  6769.  
  6770. &lt;p&gt;If you manage your projects with &lt;a href=&quot;https://rebar3.org&quot;&gt;rebar3&lt;/a&gt;, you can use &lt;em&gt;Xref&lt;/em&gt; by simply running the following command:&lt;/p&gt;
  6771.  
  6772. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;rebar3 xref&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  6773.  
  6774. &lt;p&gt;If you have not specified anything about &lt;em&gt;Xref&lt;/em&gt; in your rebar.config, that command will check your entire project and perform all possible checks. Since the list of warnings this generates tends to be long for big projects, people usually have something like this in their configuration:&lt;/p&gt;
  6775.  
  6776. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xref_checks&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  6777.    &lt;span class=&quot;n&quot;&gt;undefined_function_calls&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  6778.    &lt;span class=&quot;n&quot;&gt;locals_not_used&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  6779.    &lt;span class=&quot;n&quot;&gt;deprecated_function_calls&lt;/span&gt;
  6780. &lt;span class=&quot;p&quot;&gt;]}.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  6781.  
  6782. &lt;p&gt;In other words, with that list the report will only include:&lt;/p&gt;
  6783. &lt;ul&gt;
  6784.  &lt;li&gt;calls to functions that don’t exist (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;undefined_function_calls&lt;/code&gt;).&lt;/li&gt;
  6785.  &lt;li&gt;unused not-exported functions (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;locals_not_used&lt;/code&gt;).&lt;/li&gt;
  6786.  &lt;li&gt;calls to deprecated functions (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;deprecated_function_calls&lt;/code&gt;).&lt;/li&gt;
  6787. &lt;/ul&gt;
  6788.  
  6789. &lt;p&gt;You can find the full list of available checks in &lt;a href=&quot;https://www.rebar3.org/docs/configuration#section-xref&quot;&gt;rebar3 docs&lt;/a&gt;, but I want you to notice that since the last 2 are also detected by the compiler (if you have the proper warnings enabled) the only &lt;em&gt;effective&lt;/em&gt; check that’s being performed is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;undefined_function_calls&lt;/code&gt;. It’s a fine check to run, but it won’t help us with our original dead code issue.&lt;/p&gt;
  6790.  
  6791. &lt;p&gt;Let’s take a look at the checks we’re &lt;em&gt;not&lt;/em&gt; performing then. In general, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;undefined_functions&lt;/code&gt; will report the same results as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;undefined_function_calls&lt;/code&gt; but without the reference to the actual function call (not very useful). The same goes for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;deprecated_functions&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;deprecated_function_calls&lt;/code&gt;. But then we have &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;exports_not_used&lt;/code&gt;, which is &lt;em&gt;exactly the check we’re looking for&lt;/em&gt;.&lt;/p&gt;
  6792.  
  6793. &lt;p&gt;Adding &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;exports_not_used&lt;/code&gt; to our list of checks will emit a warning for each function that we have exported but not used anywhere. It’s amazing.&lt;/p&gt;
  6794.  
  6795. &lt;p&gt;But then why is nobody using it? 🤔&lt;/p&gt;
  6796.  
  6797. &lt;p&gt;There are a few caveats when using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;exports_not_used&lt;/code&gt;. I’ll list them now and I’ll tell you how to solve or at least &lt;em&gt;work around&lt;/em&gt; them.&lt;/p&gt;
  6798.  
  6799. &lt;h2 id=&quot;dynamically-evaluated-functions&quot;&gt;Dynamically Evaluated Functions&lt;/h2&gt;
  6800.  
  6801. &lt;p&gt;&lt;em&gt;Xref&lt;/em&gt; will report a warning for each exported function for which it &lt;em&gt;could not find&lt;/em&gt; a use anywhere in the code. But the fact that &lt;em&gt;Xref&lt;/em&gt; couldn’t find a use doesn’t mean that the function is never actually used.
  6802. For instance, &lt;em&gt;Xref&lt;/em&gt; can’t deal with dynamic function calls, even though they’re perfectly valid.
  6803. So, let’s say you have a module that looks like the following one (don’t ask me why):&lt;/p&gt;
  6804.  
  6805. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sample&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  6806. &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;export&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;some_function&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;some_other_function&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]).&lt;/span&gt;
  6807.  
  6808. &lt;span class=&quot;nf&quot;&gt;some_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;M&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  6809.    &lt;span class=&quot;nv&quot;&gt;M&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;some_other_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;an_argument&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  6810.  
  6811. &lt;span class=&quot;nf&quot;&gt;some_other_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;called&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  6812.  
  6813. &lt;p&gt;Now suppose some other module calls &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sample:some_function(sample)&lt;/code&gt;. &lt;em&gt;Xref&lt;/em&gt; will not be &lt;em&gt;smart enough&lt;/em&gt; to detect that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sample:some_other_function/1&lt;/code&gt; is actually used, because it’s only used through dynamic evaluation. The one above is just one way of performing dynamic evaluation. You can see some others here:&lt;/p&gt;
  6814.  
  6815. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;c&quot;&gt;% Classic dynamic evaluation
  6816. &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Argument&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Argument2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  6817.  
  6818. &lt;span class=&quot;c&quot;&gt;% Using erlang:apply/3
  6819. &lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;erlang&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;apply&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Arguments&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  6820.  
  6821. &lt;span class=&quot;c&quot;&gt;% Using spawn[_link]/3
  6822. &lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;erlang&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;spawn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Arguments&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  6823.  
  6824. &lt;span class=&quot;c&quot;&gt;% Using timer:tc/3
  6825. &lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;timer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;tc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Arguments&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  6826.  
  6827. &lt;span class=&quot;c&quot;&gt;% In a supervisor spec
  6828. &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;ChildName&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Arguments&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;permanent&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;worker&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dynamic&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  6829.  
  6830. &lt;p&gt;Side note: if you add &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;{xref_warnings, true}.&lt;/code&gt; to your rebar.config file, &lt;em&gt;Xref&lt;/em&gt; will at least print out warnings for these dynamic calls that it could not parse. Like this one:&lt;/p&gt;
  6831.  
  6832. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;nn&quot;&gt;sample&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;unresolved&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;call&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  6833.  
  6834. &lt;p&gt;In any case, as soon as your system becomes slightly bigger than a prototype, you’ll start having these unused exports everywhere. But don’t panic, there is actually a way to avoid those warnings, and it comes with some extra benefits. You can use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ignore_xref&lt;/code&gt;.&lt;/p&gt;
  6835.  
  6836. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-ignore_xref&lt;/code&gt; is an attribute you can add to your modules to prevent &lt;em&gt;Xref&lt;/em&gt; from emitting warnings about certain functions. It looks like this:&lt;/p&gt;
  6837.  
  6838. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sample&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  6839. &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;export&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;some_function&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;some_other_function&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]).&lt;/span&gt;
  6840.  
  6841. &lt;span class=&quot;c&quot;&gt;%% This function should be dynamically invoked through sample:some_function/1
  6842. &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;ignore_xref&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([{&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;?&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;MODULE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;some_other_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]).&lt;/span&gt;
  6843.  
  6844. &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  6845.  
  6846. &lt;p&gt;Now, if you go check the Xref docs from OTP, you won’t find any mention of it. That’s because it’s not an &lt;em&gt;official&lt;/em&gt; attribute.
  6847. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ignore_xref&lt;/code&gt; is an undocumented feature of &lt;em&gt;rebar3 xref&lt;/em&gt; (and also &lt;a href=&quot;https://hex.pm/packages/xref_runner&quot;&gt;xref_runner&lt;/a&gt;). It’s an attribute you can add to your modules where you can list all the functions you don’t want &lt;em&gt;Xref&lt;/em&gt; to complain about. The syntax is as follows:&lt;/p&gt;
  6848.  
  6849. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;ignore_xref&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([{&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;arity&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()}&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()}]).&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  6850.  
  6851. &lt;p&gt;Using this, you can effectively remove all warnings related to functions that are exported so they can be dynamically evaluated. As a bonus, as you can see in our example above, you can also use this place in the code to add some documentation stating where these functions are expected to be used.&lt;/p&gt;
  6852.  
  6853. &lt;h2 id=&quot;dynamically-generated-code&quot;&gt;Dynamically Generated Code&lt;/h2&gt;
  6854.  
  6855. &lt;p&gt;You might not be a fan of dynamically generated code, but sometimes there is no way around it.&lt;/p&gt;
  6856.  
  6857. &lt;p&gt;For instance, we use protocol buffers in several places and that led us to use &lt;a href=&quot;https://hex.pm/packages/gpb&quot;&gt;gpb&lt;/a&gt; and its rebar3 plugin. When &lt;em&gt;gpb&lt;/em&gt; is writing a module, it doesn’t know how it will be used so it has no way to tell if certain functions should not be exported/are not needed. That means when we run &lt;em&gt;Xref&lt;/em&gt; we get warnings about all the unused exports generated by it.&lt;/p&gt;
  6858.  
  6859. &lt;p&gt;How to avoid those warnings? We can’t use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-ignore_xref&lt;/code&gt; since we’re not writing those modules. Turns out, there is another way. We can use the dyslexically named &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;xref_ignores&lt;/code&gt; attribute in rebar.config. It basically allows you to have a global list of functions to ignore everywhere. It looks like this:&lt;/p&gt;
  6860.  
  6861. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xref_ignores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  6862.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;my_gpb_generated_module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;some_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  6863.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;my_gpb_generated_module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;some_other_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  6864.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;my_gpb_generated_module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a_function_with_various_arities&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  6865.    &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
  6866. &lt;span class=&quot;p&quot;&gt;]}.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  6867.  
  6868. &lt;p&gt;There is no way yet to ignore &lt;em&gt;all&lt;/em&gt; the functions in a module, but I already wrote &lt;a href=&quot;https://github.com/erlang/rebar3/issues/1905&quot;&gt;a ticket&lt;/a&gt; requesting it. Maybe you can tackle that as a &lt;a href=&quot;https://hacktoberfest.digitalocean.com/&quot;&gt;#hacktoberfest&lt;/a&gt; project?&lt;/p&gt;
  6869.  
  6870. &lt;h2 id=&quot;functions-exported-to-be-used-externally&quot;&gt;Functions Exported to be used Externally&lt;/h2&gt;
  6871.  
  6872. &lt;p&gt;What if you have some functions that are only exported because you use them in the shell when you log in remotely to your servers in production? What if they’re used by external scripts that execute &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rpc&lt;/code&gt; calls into your nodes or stuff like that?&lt;/p&gt;
  6873.  
  6874. &lt;p&gt;Well, in that case, I encourage you to use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-ignore_xref&lt;/code&gt; and add a proper comment there stating how/when/where those functions are expected to be used. It will pay off in the future, I promise.&lt;/p&gt;
  6875.  
  6876. &lt;h2 id=&quot;functions-exported-just-for-tests&quot;&gt;Functions Exported just for Tests&lt;/h2&gt;
  6877.  
  6878. &lt;p&gt;A different situation I’ve seen sometimes (particularly when people work &lt;a href=&quot;https://www.goodreads.com/book/show/44919.Working_Effectively_with_Legacy_Code&quot;&gt;with Legacy Code&lt;/a&gt;) involves functions that are exported so they can be used in tests. The idea is to either mock them or to have access to some internal logic that should otherwise be hidden from the system in production.&lt;/p&gt;
  6879.  
  6880. &lt;p&gt;First of all, if you’re using &lt;a href=&quot;http://erlang.org/doc/man/eunit.html&quot;&gt;eunit&lt;/a&gt; you don’t need to export them. You can use not-exported functions in your tests.&lt;/p&gt;
  6881.  
  6882. &lt;p&gt;Now, if you use common test or other frameworks that require tests to be written outside the module under test, that’s a different story. I think it’s important to consider that exporting functions just to use them in tests is something that’s undesirable in general since…&lt;/p&gt;
  6883.  
  6884. &lt;blockquote&gt;
  6885.  &lt;ul&gt;
  6886.    &lt;li&gt;If your functions are not exported and unused they are detected by the compiler, as we stated before, allowing you to find bugs much earlier.&lt;/li&gt;
  6887.    &lt;li&gt;If you are adding functions that should not be available in production and/or doing stuff that should not be done in production, then your test is not simulating the real scenario accurately, which may lead to tests passing with code that still doesn’t work as expected.&lt;/li&gt;
  6888.  &lt;/ul&gt;
  6889. &lt;/blockquote&gt;
  6890.  
  6891. &lt;p&gt;Still, sometimes there is just no way around it: You need to mock some stuff that is otherwise invisible to the external world, you need to verify some data that’s only exposed in very complex formats or detect side effects that are really hard to capture. In those scenarios, once again, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ignore_xref&lt;/code&gt; and a neat comment are wonderful tools to avoid surprises and frustration for future developers that, finding an unused function, decide to remove it.&lt;/p&gt;
  6892.  
  6893. &lt;h2 id=&quot;library-facades&quot;&gt;Library Facades&lt;/h2&gt;
  6894.  
  6895. &lt;p&gt;Finally, there is one other scenario where you do need to export functions that you don’t actually use within your application: When your application is a library (i.e. when you’re building an app to be used as a dependency in other systems). In that case, some functions constitute the facade of your app and they’re not to be used by it. They’re exposed so your users can invoke them in their apps.&lt;/p&gt;
  6896.  
  6897. &lt;p&gt;Those functions will all be reported as unused exports and it’s not nice to have to write &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ignore_xref&lt;/code&gt; / &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;xref_ignores&lt;/code&gt; for all of them. But what is really nice is to cover them in tests. And if you do that, you have a way to avoid the warnings and actually only generate warnings for functions that are exported, unused and &lt;em&gt;not tested&lt;/em&gt;. You can run xref as…&lt;/p&gt;
  6898.  
  6899. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;rebar3 as &lt;span class=&quot;nb&quot;&gt;test &lt;/span&gt;xref&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  6900.  
  6901. &lt;p&gt;Using the test profile, rebar3 will include all your test modules into the analysis and, since your facade functions will be used there, it will not warn you about them.&lt;/p&gt;
  6902.  
  6903. &lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
  6904.  
  6905. &lt;p&gt;While &lt;em&gt;Xref&lt;/em&gt; is a powerful tool, it needs some tweaks to extract its full potential.&lt;/p&gt;
  6906.  
  6907. &lt;p&gt;First of all, you have to use the right checks. Our recommended list is…&lt;/p&gt;
  6908.  
  6909. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xref_checks&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  6910.    &lt;span class=&quot;n&quot;&gt;undefined_function_calls&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  6911.    &lt;span class=&quot;n&quot;&gt;exports_not_used&lt;/span&gt;
  6912. &lt;span class=&quot;p&quot;&gt;]}.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  6913.  
  6914. &lt;p&gt;Then you have to use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-ignore_xref&lt;/code&gt; attributes and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;xref_ignores&lt;/code&gt; configuration param appropriately to identify all the functions that are intentionally exported and unused. If you’re writing a library, you should also consider your tests in the analysis using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rebar3 as test xref&lt;/code&gt;.&lt;/p&gt;
  6915.  
  6916. &lt;p&gt;With all that in place, you should expect to have 0 warnings reported and therefore you can be sure that you don’t have any dead code in your project.&lt;/p&gt;
  6917.  
  6918. &lt;p&gt;Well… actually… You don’t have any &lt;em&gt;dead functions&lt;/em&gt; (unused exports). Which is a lot, but you can still have dead code in the form of unused function clauses, unused case clauses, etc… &lt;em&gt;Xref&lt;/em&gt; will not detect those problems.&lt;/p&gt;
  6919.  
  6920. &lt;p&gt;For that, you need a much more powerful tool: &lt;a href=&quot;http://erlang.org/doc/man/dialyzer.html&quot;&gt;dialyzer&lt;/a&gt;. We won’t cover it in this article, but stay tuned…&lt;/p&gt;
  6921.  
  6922. &lt;hr /&gt;
  6923.  
  6924. &lt;p&gt;&lt;strong&gt;Do you enjoy building high-quality large-scale systems? &lt;a href=&quot;https://www.adroll.com/about/careers/&quot;&gt;Roll with Us&lt;/a&gt;!&lt;/strong&gt;&lt;/p&gt;
  6925.  
  6926. </description>
  6927.    </item>
  6928.    
  6929.    
  6930.    
  6931.    <item>
  6932.      <title>Running large batch processing pipelines on AWS Batch</title>
  6933.      <link>https://tech.nextroll.com/blog/data/2018/08/08/running-jobs-with-aws-batch.html</link>
  6934.      <pubDate>Wed, 08 Aug 2018 00:00:00 -0700</pubDate>
  6935.      <author></author>
  6936.      <guid isPermaLink="false">https://tech.nextroll.com/blog/data/2018/08/08/running-jobs-with-aws-batch</guid>
  6937.      <description>&lt;p&gt;The attribution team at AdRoll computes metrics out of petabytes of data every
  6938. night. This is accomplished using a batch processing pipeline that submits jobs
  6939. to AWS Batch. In this blog post we discuss how this is organized and
  6940. orchestrated with Luigi. We announce Batchiepatchie, a job monitoring tool for
  6941. AWS Batch. Batchiepatchie is an improvement over Amazon’s own monitoring
  6942. solution for AWS Batch and it has saved us countless hours of engineer time.&lt;/p&gt;
  6943.  
  6944. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;10-15 minute read&lt;/code&gt;&lt;/p&gt;
  6945.  
  6946. &lt;hr /&gt;
  6947.  
  6948. &lt;p&gt;To start off, &lt;a href=&quot;https://github.com/AdRoll/batchiepatchie&quot;&gt;we have open sourced Batchiepatchie that is discussed in this
  6949. blog post&lt;/a&gt;. Read on to understand
  6950. what this piece of technology is!&lt;/p&gt;
  6951.  
  6952. &lt;p&gt;Before we delve into the technical details of batch processing, I want to establish some
  6953. real-world context. Our batch pipeline was created for the purpose of doing
  6954. something called &lt;em&gt;attribution&lt;/em&gt; in Internet advertising. What is attribution, in
  6955. the adtech sense? To put it very simply, it is the problem of figuring out which
  6956. advertising events deserve credit when people purchase things. For
  6957. example, if we show an advertisement to a person about a T-shirt and one day
  6958. later they buy said T-shirt, we can say that advertisement impression deserves
  6959. some credit for making that purchase happen. Attribution is used to understand
  6960. how some advertising campaign is working out.&lt;/p&gt;
  6961.  
  6962. &lt;p&gt;The problem of how much credit should be given and in which circumstances and how
  6963. to report on it is a complicated topic and is a story for another time. This
  6964. post is about the batch job infrastructure that can compute metrics around this
  6965. problem by making use of vast amounts of data.&lt;/p&gt;
  6966.  
  6967. &lt;p&gt;To compute attribution metrics, we need to be able to look at the history of
  6968. all &lt;em&gt;event trails&lt;/em&gt; of each user we track. An event trail contains all the
  6969. relevant information of each user for attribution purposes, such as when we
  6970. displayed advertisements from certain advertising campaigns and when purchases
  6971. were made on a customer’s website. From this information, we can compute how
  6972. much credit each advertising campaign deserves.&lt;/p&gt;
  6973.  
  6974. &lt;p&gt;At AdRoll’s scale, this is a non-trivial amount of data. All the attribution
  6975. metrics must be computed on a daily basis. Even a big EC2 instance cannot process
  6976. all of it in a reasonable time; just downloading all the required data in
  6977. compressed format would take too long with a single box. Thus, we must
  6978. distribute the problem to many computers. This is where AWS Batch comes in.&lt;/p&gt;
  6979.  
  6980. &lt;h4 id=&quot;aws-batch&quot;&gt;AWS Batch&lt;/h4&gt;
  6981.  
  6982. &lt;p&gt;So what is AWS Batch anyway? You can read the &lt;a href=&quot;https://aws.amazon.com/batch/&quot;&gt;official description at Amazon’s
  6983. website&lt;/a&gt; but for our purposes it can be described as follows.&lt;/p&gt;
  6984.  
  6985. &lt;p&gt;AWS Batch is a system that you submit jobs into. AWS Batch runs these jobs on
  6986. EC2 instances. AWS Batch scales up a bunch of instances as needed so that the
  6987. jobs can run. Once all the jobs are done, the instances are shut down. This way,
  6988. you will pay for instances only when you actually have some jobs running. These days
  6989. &lt;a href=&quot;https://aws.amazon.com/blogs/aws/new-per-second-billing-for-ec2-instances-and-ebs-volumes/&quot;&gt;AWS has per-second billing of EC2 instances&lt;/a&gt;,
  6990. which makes fast scaling a critical feature in terms of minimizing your AWS bill.&lt;/p&gt;
  6991.  
  6992. &lt;p&gt;Another way to describe this process is that you give AWS Batch a Docker image URL, some
  6993. command line arguments and CPU and memory requirements and AWS Batch will
  6994. figure out how to run your job in some way.&lt;/p&gt;
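&lt;p&gt;As a rough illustration of that description, here is a minimal Python sketch of the information a job boils down to. The image URL, job name, queue name, script name and resource numbers are made-up placeholders, and both helper functions are hypothetical. The dictionaries are shaped like the keyword arguments that, as far as we know, boto3’s Batch client accepts for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;register_job_definition&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;submit_job&lt;/code&gt;; check them against the current boto3 documentation before relying on them. The sketch never calls AWS.&lt;/p&gt;

```python
# Hypothetical helpers: they only build plain dictionaries and never call AWS.

def build_job_definition(name, image, vcpus, memory_mib):
    """Describe what to run: a Docker image plus default CPU/memory needs."""
    return {
        "jobDefinitionName": name,
        "type": "container",
        "containerProperties": {
            "image": image,        # Docker image URL (placeholder value below)
            "vcpus": vcpus,        # number of vCPUs the job asks for
            "memory": memory_mib,  # memory in MiB
        },
    }


def build_job_submission(job_name, queue, definition_name, command):
    """Describe one run: which definition to use and its command line."""
    return {
        "jobName": job_name,
        "jobQueue": queue,
        "jobDefinition": definition_name,
        "containerOverrides": {"command": list(command)},
    }


# All names and values below are placeholders, not real resources.
definition = build_job_definition(
    name="example-lastclick",
    image="example.invalid/attribution/lastclick:latest",
    vcpus=4,
    memory_mib=8192,
)
submission = build_job_submission(
    job_name="lastclick-2018-07-26-shard-0",
    queue="example-queue",
    definition_name=definition["jobDefinitionName"],
    command=["python", "lastclick_shard.py", "--shard", "0"],
)
print(submission["jobDefinition"], submission["containerOverrides"]["command"])
```

&lt;p&gt;Keeping the description as plain data makes it easy to inspect or log a job before anything is submitted.&lt;/p&gt;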
  6995.  
  6996. &lt;p&gt;AWS Batch is a relatively simple way to distribute large numbers of batch jobs
  6997. onto lots of EC2 instances, in a way that you only pay when you actually
  6998. have jobs running.&lt;/p&gt;
  6999.  
  7000. &lt;h3 id=&quot;example-of-a-batch-job-pipeline&quot;&gt;Example of a batch job pipeline&lt;/h3&gt;
  7001.  
  7002. &lt;p&gt;So, how do you use the AWS Batch system to split off your massive batch
  7003. processing so that a bunch of boxes will run your job instead of just one box? I
  7004. thought one of the most illustrative ways to describe how our system works is
  7005. to show a representative example of actual technologies we use to do this.&lt;/p&gt;
  7006.  
  7007. &lt;p&gt;The most important technologies involved in this process are Amazon S3, Docker
  7008. and &lt;a href=&quot;https://github.com/spotify/luigi&quot;&gt;Luigi&lt;/a&gt; and our internal (&lt;a href=&quot;https://github.com/AdRoll/batchiepatchie/&quot;&gt;but now open
  7009. sourced!&lt;/a&gt;) AWS Batch dashboard
  7010. called “Batchiepatchie”.&lt;/p&gt;
  7011.  
  7012. &lt;h4 id=&quot;luigi&quot;&gt;Luigi&lt;/h4&gt;
  7013.  
  7014. &lt;p&gt;At AdRoll’s attribution team we use a Python library called Luigi to orchestrate
  7015. tasks. &lt;a href=&quot;/blog/data/2015/10/15/luigi.html&quot;&gt;We have talked about Luigi before in our blog&lt;/a&gt; so
  7016. you can read our older blog post for some more background on that topic; most
  7017. of it is still relevant for this Luigi section, although we use a slightly
  7018. different set of technologies than we did back in 2015.&lt;/p&gt;
  7019.  
  7020. &lt;p&gt;Let’s say that hypothetically (or not so hypothetically, this example is a
  7021. simplified version of a real-world job) we want to compute the last click
  7022. credit on advertising campaigns using
  7023. &lt;a href=&quot;http://traildb.io&quot;&gt;TrailDBs&lt;/a&gt; as our source of data.&lt;/p&gt;
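&lt;p&gt;This post does not spell out the last-click rule, so the sketch below assumes the common reading: each purchase gives all of its credit to the campaign of the most recent click that preceded it. The event tuples and the function name are hypothetical, and a real job would read events from TrailDB files rather than from a Python list.&lt;/p&gt;

```python
from collections import Counter


def last_click_credit(trail):
    """Count purchases per campaign using a last-click rule.

    `trail` is one user's events as (timestamp, kind, campaign) tuples,
    where kind is "click" or "purchase" and campaign is None for purchases.
    """
    credit = Counter()
    last_clicked_campaign = None
    # Sort on the timestamp only, so None campaigns are never compared.
    for _timestamp, kind, campaign in sorted(trail, key=lambda event: event[0]):
        if kind == "click":
            last_clicked_campaign = campaign
        elif kind == "purchase" and last_clicked_campaign is not None:
            credit[last_clicked_campaign] += 1
    return credit


# Made-up event trail for a single user.
example_trail = [
    (100, "click", "campaign-a"),
    (200, "click", "campaign-b"),
    (300, "purchase", None),
    (400, "click", "campaign-a"),
    (500, "purchase", None),
]
print(last_click_credit(example_trail))
```

&lt;p&gt;Each user’s trail is handled independently of every other trail, which is why this kind of computation splits well across files.&lt;/p&gt;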
  7024.  
  7025. &lt;p&gt;We have TrailDB files in S3 in an organized directory structure, where each day
  7026. has its own set of TrailDBs. For example, we have a type of attribution TrailDB
  7027. that contains data relevant for attribution purposes.&lt;/p&gt;
  7028.  
  7029. &lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;aws s3 &lt;span class=&quot;nb&quot;&gt;ls &lt;/span&gt;s3://example-bucket/traildbs/attributiondb/2018-07-26/
  7030. 2018-07-27 18:24:03  attributiondb-0.tdb
  7031. 2018-07-27 18:24:39  attributiondb-1.tdb
  7032. 2018-07-27 18:25:07  attributiondb-2.tdb
  7033. 2018-07-27 18:23:38  attributiondb-3.tdb
  7034. 2018-07-27 18:18:41  attributiondb-4.tdb
  7035. 2018-07-27 18:55:22  attributiondb-5.tdb
  7036. 2018-07-27 18:36:03  attributiondb-6.tdb
  7037. 2018-07-27 18:10:10  attributiondb-7.tdb
  7038. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  7039.  
  7040. &lt;p&gt;In the above example, we have 8 files. These files contain all data for that
  7041. particular day; in this example, that day is 2018-07-26.&lt;/p&gt;
  7042.  
  7043. &lt;p&gt;This naturally distributes to at least 8 workers: just submit one batch job per
  7044. file. One worker processes one file.&lt;/p&gt;
  7045.  
  7046. &lt;p&gt;We will look at how to do this with some Luigi, S3 and AWS Batch.&lt;/p&gt;
  7047.  
  7048. &lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# lastclick.py
  7049. &lt;/span&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;luigi.s3&lt;/span&gt;
  7050. &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;luigi&lt;/span&gt;
  7051.  
  7052. &lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;AttributionLastClickJob&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;luigi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;WrapperTask&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  7053.    &lt;span class=&quot;n&quot;&gt;date&lt;/span&gt;             &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;luigi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DateParameter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  7054.    &lt;span class=&quot;n&quot;&gt;s3_input_prefix&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;luigi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Parameter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  7055.    &lt;span class=&quot;n&quot;&gt;s3_output_prefix&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;luigi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Parameter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  7056.    &lt;span class=&quot;n&quot;&gt;shards&lt;/span&gt;           &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;luigi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IntParameter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  7057.  
  7058.    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;requires&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  7059.        &lt;span class=&quot;n&quot;&gt;sub_jobs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
  7060.        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;shard&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shards&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  7061.            &lt;span class=&quot;n&quot;&gt;sub_jobs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;AttributionLastClickShard&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  7062.                &lt;span class=&quot;n&quot;&gt;date&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;date&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  7063.                &lt;span class=&quot;n&quot;&gt;s3_input_prefix&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s3_input_prefix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  7064.                &lt;span class=&quot;n&quot;&gt;s3_output_prefix&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s3_output_prefix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  7065.                &lt;span class=&quot;n&quot;&gt;shard&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shard&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  7066.        &lt;span class=&quot;k&quot;&gt;yield&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sub_jobs&lt;/span&gt;
  7067. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  7068.  
  7069. &lt;p&gt;Let us first go through what is happening in this file. We have four arguments
  7070. to our task, &lt;em&gt;date&lt;/em&gt;, &lt;em&gt;input prefix&lt;/em&gt;, &lt;em&gt;output prefix&lt;/em&gt; and &lt;em&gt;shards&lt;/em&gt;. These
  7071. specify the date we are processing, an S3 input prefix where our input data is
  7072. located, an S3 output prefix where we should put output data and the number of
  7073. shards we have (in this example, shards is 8).&lt;/p&gt;
  7074.  
  7075. &lt;p&gt;For this example, we could have, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;date=2018-07-26&lt;/code&gt;,
  7076. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;s3_input_prefix=s3://example-bucket/traildbs/attributiondb&lt;/code&gt;,
  7077. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;s3_output_prefix=s3://example-bucket/attribution-results&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;shards=8&lt;/code&gt;.&lt;/p&gt;
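&lt;p&gt;To make the sharding concrete, the following sketch derives the per-shard input and output locations from those four parameters. The input file names follow the S3 listing shown earlier and the output names follow the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;output.N.sqlite3&lt;/code&gt; pattern used later in this post; the helper function itself is a hypothetical illustration, not part of our pipeline.&lt;/p&gt;

```python
import datetime
import posixpath


def shard_paths(date, s3_input_prefix, s3_output_prefix, shards):
    """Return one (input, output) S3 path pair per shard."""
    day = date.strftime("%Y-%m-%d")
    pairs = []
    for shard in range(shards):
        # posixpath.join always uses "/" as the separator, on any platform.
        source = posixpath.join(
            s3_input_prefix, day, "attributiondb-{}.tdb".format(shard))
        target = posixpath.join(
            s3_output_prefix, day, "output.{}.sqlite3".format(shard))
        pairs.append((source, target))
    return pairs


pairs = shard_paths(
    date=datetime.date(2018, 7, 26),
    s3_input_prefix="s3://example-bucket/traildbs/attributiondb",
    s3_output_prefix="s3://example-bucket/attribution-results",
    shards=8,
)
for source, target in pairs:
    print(source, "to", target)
```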
  7078.  
  7079. &lt;p&gt;This particular Luigi task is a &lt;em&gt;wrapper&lt;/em&gt; task, which means it does not run jobs
  7080. by itself; it only requires other tasks and those other tasks do the actual
  7081. work. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AttributionLastClickJob&lt;/code&gt; wants to distribute the 8 input shards into 8
  7082. batch jobs. We collect one task per shard into the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sub_jobs&lt;/code&gt; list. When we do &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;yield
  7083. sub_jobs&lt;/code&gt;, Luigi will &lt;em&gt;concurrently&lt;/em&gt; run all the jobs listed in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sub_jobs&lt;/code&gt;
  7084. list.&lt;/p&gt;
  7085.  
  7086. &lt;p&gt;The task class above refers to something called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AttributionLastClickShard&lt;/code&gt;. Here is the source code for that:&lt;/p&gt;
  7087.  
  7088. &lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# lastclick.py - continued
  7089. &lt;/span&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;os&lt;/span&gt;
  7090. &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;pybatch&lt;/span&gt;
  7091.  
  7092. &lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;AttributionLastClickShard&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;luigi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Task&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  7093.    &lt;span class=&quot;n&quot;&gt;date&lt;/span&gt;             &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;luigi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DateParameter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  7094.    &lt;span class=&quot;n&quot;&gt;s3_input_prefix&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;luigi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Parameter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  7095.    &lt;span class=&quot;n&quot;&gt;s3_output_prefix&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;luigi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Parameter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  7096.    &lt;span class=&quot;n&quot;&gt;shard&lt;/span&gt;            &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;luigi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IntParameter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  7097.  
  7098.    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;output&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  7099.        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;luigi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;S3Target&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  7100.            &lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s3_output_prefix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  7101.                         &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;date&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;strftime&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;%Y-%m-%d&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  7102.                         &lt;span class=&quot;s&quot;&gt;&quot;output.{}.sqlite3&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shard&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;
  7103.  
  7104.    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;run&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  7105.        &lt;span class=&quot;n&quot;&gt;cmdline&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  7106.            &lt;span class=&quot;s&quot;&gt;&quot;do_lastclick_attribution.py&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  7107.            &lt;span class=&quot;s&quot;&gt;&quot;--input-prefix&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s3_input_prefix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  7108.            &lt;span class=&quot;s&quot;&gt;&quot;--output-prefix&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s3_output_prefix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
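  7109.            &lt;span class=&quot;s&quot;&gt;&quot;--shard&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shard&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;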
  7110.        &lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  7111.        &lt;span class=&quot;n&quot;&gt;pybatch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;run_on_awsbatch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;container&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;attribution:1.0&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  7112.                                &lt;span class=&quot;n&quot;&gt;cmdline&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cmdline&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  7113.                                &lt;span class=&quot;n&quot;&gt;cpus&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  7114.                                &lt;span class=&quot;n&quot;&gt;memory&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;110000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  7115.                                &lt;span class=&quot;n&quot;&gt;jobqueue&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;attribution-job-queue&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  7116.  
  7117. &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;__name__&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;__main__&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  7118.    &lt;span class=&quot;n&quot;&gt;luigi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;run&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  7119. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  7120.  
  7121. &lt;p&gt;This is where submitting the job to AWS Batch happens. In the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;output()&lt;/code&gt;
  7122. method, we define what result this job will ultimately create; in this example,
  7123. it is an SQLite file in S3.&lt;/p&gt;
  7124.  
  7125. &lt;p&gt;In the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run()&lt;/code&gt; method, we define a command line that runs a Python script
  7126. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;do_lastclick_attribution.py&lt;/code&gt;. This command line is passed to AWS Batch and it
  7127. describes the command line arguments we pass to a Docker container when AWS
  7128. Batch runs it.&lt;/p&gt;
  7129.  
  7130. &lt;p&gt;The call &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pybatch.run_on_awsbatch&lt;/code&gt; actually submits the job to AWS Batch. This
  7131. is an AdRoll-internal job submitting function that knows about AdRoll
  7132. infrastructure and simplifies the submission process. In your application, dear
  7133. reader, you would likely use &lt;a href=&quot;https://boto3.readthedocs.io/en/latest/reference/services/batch.html&quot;&gt;boto Python libraries to do
  7134. this&lt;/a&gt;.&lt;/p&gt;
  7135.  
  7136. &lt;p&gt;This function will only return when the job has completed successfully. If the
  7137. job fails, then &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run_on_awsbatch&lt;/code&gt; will throw an exception.&lt;/p&gt;
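&lt;p&gt;For readers without an internal helper, the blocking behaviour described above can be approximated with boto3. The following is a minimal sketch and not AdRoll’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pybatch&lt;/code&gt;: the function name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run_and_wait&lt;/code&gt;, the exception class and the polling interval are assumptions made for illustration, while &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;submit_job&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;describe_jobs&lt;/code&gt; are the boto3 Batch client calls. It also assumes a job definition already exists.&lt;/p&gt;

```python
import time


class BatchJobFailed(Exception):
    """Hypothetical exception raised when a job ends in the FAILED state."""


def run_and_wait(batch, name, queue, definition, command, poll_seconds=30):
    """Submit a job to AWS Batch and block until it succeeds or fails.

    `batch` is expected to be a boto3 Batch client, for example
    boto3.client("batch"). All other names here are illustrative.
    """
    submitted = batch.submit_job(
        jobName=name,
        jobQueue=queue,
        jobDefinition=definition,
        # AWS Batch expects every element of the command to be a string.
        containerOverrides={"command": [str(arg) for arg in command]},
    )
    job_id = submitted["jobId"]

    # Poll the job status until it reaches a terminal state.
    while True:
        job = batch.describe_jobs(jobs=[job_id])["jobs"][0]
        if job["status"] == "SUCCEEDED":
            return job_id
        if job["status"] == "FAILED":
            raise BatchJobFailed(job.get("statusReason", "no reason given"))
        time.sleep(poll_seconds)
```

&lt;p&gt;Because a Luigi task only counts as complete when its &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;output()&lt;/code&gt; exists, raising on failure is enough for Luigi to treat the task as failed and leave it eligible for a later retry.&lt;/p&gt;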
  7138.  
  7139. &lt;p&gt;Running this job is not difficult. Just invoke the script with some arguments:&lt;/p&gt;
  7140.  
  7141. &lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;chmod&lt;/span&gt; +x lastclick.py
  7142. &lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;./lastclick.py &lt;span class=&quot;nt&quot;&gt;--date&lt;/span&gt;             2018-07-26 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  7143.                 &lt;span class=&quot;nt&quot;&gt;--s3-input-prefix&lt;/span&gt;  s3://example-bucket/traildbs/attributiondbs &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  7144.                 &lt;span class=&quot;nt&quot;&gt;--s3-output-prefix&lt;/span&gt; s3://example-bucket/attributions &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  7145.                 &lt;span class=&quot;nt&quot;&gt;--shards&lt;/span&gt;           8 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  7146.                 &lt;span class=&quot;nt&quot;&gt;--workers&lt;/span&gt;          8
  7147. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  7148.  
  7149. &lt;p&gt;This will invoke our distributed job, with Luigi handling concurrent submission to AWS Batch.&lt;/p&gt;
  7150.  
  7151. &lt;p&gt;Most of our real batch jobs have been set up in this manner. There are some additional complexities
  7152. with our real systems I did not elaborate on in this example, such as:&lt;/p&gt;
  7153.  
  7154. &lt;ul&gt;
  7155.  &lt;li&gt;
  7156.    &lt;p&gt;We have an automatic crontab generator that will set up a scheduling box
  7157. based on Luigi task files in a git repository. It will (try to) run all
  7158. jobs every 5 minutes. We rely on Luigi to stop duplicate jobs from running and to skip
  7159. any jobs that have already completed.&lt;/p&gt;
  7160.  &lt;/li&gt;
  7161.  &lt;li&gt;
  7162.    &lt;p&gt;All jobs have been designed to be idempotent. If some job fails then we can safely resubmit it.&lt;/p&gt;
  7163.  &lt;/li&gt;
  7164.  &lt;li&gt;
  7165.    &lt;p&gt;AWS Batch wants you to specify something called “job definitions”. If you
  7166. have worked with Amazon ECS, then these are AWS Batch equivalents to ECS
  7167. Task Definitions. The internal function &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pybatch.run_on_awsbatch&lt;/code&gt; creates job definitions
  7168. on the fly for given arguments.&lt;/p&gt;
  7169.  &lt;/li&gt;
  7170.  &lt;li&gt;
  7171.    &lt;p&gt;Almost all batch jobs we have use S3 as the data store; inputs are pulled
  7172. from S3 and outputs are pushed to some other location in S3. All data is immutable. In
  7173. some cases outputs are pushed to a PostgreSQL database instead.
  7174. &lt;a href=&quot;/blog/data/2016/05/24/traildb-open-sourced.html&quot;&gt;We have written about immutable data on this blog before.&lt;/a&gt;&lt;/p&gt;
  7175.  &lt;/li&gt;
  7176.  &lt;li&gt;
  7177.    &lt;p&gt;The actual amount of data we have is &lt;em&gt;much&lt;/em&gt; more than just one day with 8
  7178. shard files ;) We also have &lt;em&gt;hundreds&lt;/em&gt; of different batch jobs, using a
  7179. diverse set of technologies.&lt;/p&gt;
  7180.  &lt;/li&gt;
  7181. &lt;/ul&gt;
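&lt;p&gt;As an illustration of creating job definitions on the fly, here is a minimal sketch using the boto3 Batch client call &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;register_job_definition&lt;/code&gt;. It is not how &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pybatch&lt;/code&gt; works internally; the helper names and the hash-based naming scheme are assumptions chosen so that the same arguments always map to the same definition name.&lt;/p&gt;

```python
import hashlib
import json


def job_definition_name(image, vcpus, memory):
    """Derive a stable, hypothetical name from the job's resource arguments."""
    spec = json.dumps(
        {"image": image, "vcpus": vcpus, "memory": memory}, sort_keys=True
    )
    digest = hashlib.sha1(spec.encode("utf-8")).hexdigest()[:12]
    return "auto-" + digest


def register_definition(batch, image, vcpus, memory):
    """Register a container job definition and return its ARN.

    `batch` is expected to be a boto3 Batch client. Registering the same
    name again creates a new revision rather than failing.
    """
    response = batch.register_job_definition(
        jobDefinitionName=job_definition_name(image, vcpus, memory),
        type="container",
        containerProperties={"image": image, "vcpus": vcpus, "memory": memory},
    )
    return response["jobDefinitionArn"]
```

&lt;p&gt;The returned ARN can then be passed as the job definition when submitting the job.&lt;/p&gt;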
  7182.  
  7183. &lt;h4 id=&quot;batchiepatchie&quot;&gt;Batchiepatchie&lt;/h4&gt;
  7184.  
  7185. &lt;p&gt;Once jobs have been submitted, they will eventually run and (hopefully) will do
  7186. their job, but sometimes batch jobs will fail. Perhaps someone introduced a bug
  7187. in the code or the EC2 instance running the jobs failed for some reason. Maybe
  7188. some job became really slow because of quirks in the data.&lt;/p&gt;
  7189.  
  7190. &lt;p&gt;To investigate and debug issues like this, you need monitoring.&lt;/p&gt;
  7191.  
  7192. &lt;p&gt;Unfortunately, we feel the AWS Batch console in AWS Management Console leaves a
  7193. lot to be desired, especially if you really scale up your batch job use.&lt;/p&gt;
  7194.  
  7195. &lt;ul&gt;
  7196.  &lt;li&gt;
  7197.    &lt;p&gt;It is difficult to get a holistic view of all jobs in the system. The UI in the management console
  7198. will not show all jobs neatly in a single view.&lt;/p&gt;
  7199.  &lt;/li&gt;
  7200.  &lt;li&gt;
  7201.    &lt;p&gt;The management console forgets about jobs after some time.&lt;/p&gt;
  7202.  &lt;/li&gt;
  7203.  &lt;li&gt;
  7204.    &lt;p&gt;Searching for jobs based on image name or command line arguments is hard.
  7205. In fact, searching for any particular job at all is hard.&lt;/p&gt;
  7206.  &lt;/li&gt;
  7207.  &lt;li&gt;
  7208.    &lt;p&gt;Logs for any batch job are in CloudWatch Logs; you have to painstakingly navigate
  7209. around the management console just to see job logs.&lt;/p&gt;
  7210.  &lt;/li&gt;
  7211. &lt;/ul&gt;
  7212.  
  7213. &lt;p&gt;If we only had a few jobs per day it would not be that bad. However, AdRoll
  7214. submits tens of thousands of batch jobs per day through AWS Batch. It was very clear we
  7215. needed a much better monitoring solution as we started to scale up.&lt;/p&gt;
  7216.  
  7217. &lt;p&gt;And this is why we created
  7218. &lt;a href=&quot;https://github.com/AdRoll/batchiepatchie&quot;&gt;Batchiepatchie&lt;/a&gt;. Batchiepatchie is a
  7219. monitoring tool for AWS Batch.&lt;/p&gt;
  7220.  
  7221. &lt;p&gt;&lt;img alt=&quot;Batchiepatchie screenshot&quot; src=&quot;/images/post_images/batchiepatchie_screenshot.png&quot; /&gt;&lt;/p&gt;
  7222.  
  7223. &lt;p&gt;So what does Batchiepatchie do? Pedantically speaking, not much more than AWS
  7224. Management Console. It just makes certain use cases much faster.&lt;/p&gt;
  7225.  
  7226. &lt;ul&gt;
  7227.  &lt;li&gt;
  7228.    &lt;p&gt;Batchiepatchie has a little search box that takes freeform text to find
  7229. jobs. We spent some time making this feature fast even with millions of
  7230. jobs in job history. This is by far the most important feature in
  7231. Batchiepatchie.&lt;/p&gt;
  7232.  &lt;/li&gt;
  7233.  &lt;li&gt;
  7234.    &lt;p&gt;Batchiepatchie does not forget about jobs. It does not forget about your mistakes either. Ever.&lt;/p&gt;
  7235.  &lt;/li&gt;
  7236.  &lt;li&gt;
  7237.    &lt;p&gt;Clicking on a job gives you an instant view of its standard output and error.&lt;/p&gt;
  7238.  &lt;/li&gt;
  7239.  &lt;li&gt;
  7240.    &lt;p&gt;Terminating jobs is easy; select them on the dashboard and terminate them.&lt;/p&gt;
  7241.  &lt;/li&gt;
  7242.  &lt;li&gt;
  7243.    &lt;p&gt;Batchiepatchie figures out the IP addresses of the instances where batch jobs
  7244. run, making it easier to SSH inside a worker to monitor a job in more
  7245. detail when necessary.&lt;/p&gt;
  7246.  &lt;/li&gt;
  7247.  &lt;li&gt;
  7248.    &lt;p&gt;It is easy to link a job page to a coworker.&lt;/p&gt;
  7249.  &lt;/li&gt;
  7250.  &lt;li&gt;
  7251.    &lt;p&gt;Batchiepatchie implements timeouts for AWS Batch, something that is not
  7252. supported out of the box. This requires some special logic from the code that
  7253. submits the batch jobs.&lt;/p&gt;
  7254.  &lt;/li&gt;
  7255. &lt;/ul&gt;
  7256.  
  7257. &lt;p&gt;When you have tens of thousands of jobs per day, these features can
  7258. (figuratively) save your life. Finding some specific failed job out of
  7259. thousands of succeeded jobs is much easier than trying to find it with AWS
  7260. Batch’s own user interface. Batchiepatchie became so useful at AdRoll that I
  7261. often see my coworkers refer to AWS Batch as “Batchiepatchie” even though our
  7262. monitoring tool is just, you know, a monitoring tool.&lt;/p&gt;
  7263.  
  7264. &lt;p&gt;Batchiepatchie has more features than are listed here; I encourage you to look
  7265. inside the git repository and read through its documentation if you are
  7266. interested.&lt;/p&gt;
  7267.  
  7268. &lt;h4 id=&quot;concluding-notes&quot;&gt;Concluding notes&lt;/h4&gt;
  7269.  
  7270. &lt;p&gt;AdRoll is having a good experience with AWS Batch. Submitting thousands of jobs
  7271. per day is not a problem in itself; AWS Batch scales up quickly but Amazon’s
  7272. own monitoring solution is not great, which is why we decided to create a
  7273. custom monitoring tool designed to be great at finding specific jobs.&lt;/p&gt;
  7274.  
  7275. &lt;p&gt;As we said, we use Luigi to organize our batch pipeline. AWS Batch does support
  7276. a simple form of jobs depending on other jobs but we almost always depend on
  7277. Luigi to handle our dependencies for us instead.&lt;/p&gt;
  7278.  
  7279. &lt;p&gt;Compared to some other tools like Hadoop or Spark, AWS Batch is Docker-only.
  7280. This lets you use some pretty exotic technologies inside your batch jobs if you
  7281. wish to do so. We have Python-written jobs, C and C++ jobs, R jobs, Java jobs
  7282. and even some Rust and Haskell jobs, sometimes mixed together in one Docker
  7283. container. You can also select which EC2 instance types you want and get huge
  7284. boxes to do your computation.  If you also use spot instances, you can get
  7285. these boxes cheap. At AdRoll, we even modified our system images to allow us to
  7286. memory map more files in a batch job than the Linux kernel allows by default for
  7287. some truly intensive batch jobs that memory map lots and lots of files.&lt;/p&gt;
  7288.  
  7289. &lt;p&gt;If you need to build a batch processing pipeline and you like Docker a lot, AWS
  7290. Batch could be for you. It is a cost-effective and flexible batch job system.
  7291. We have saved a lot of engineering time by having a reliable batch job system
  7292. that can be used with a diverse set of technologies. Maybe you can save some of
  7293. your time as well!&lt;/p&gt;
  7294.  
  7295. &lt;h4 id=&quot;links&quot;&gt;Links&lt;/h4&gt;
  7296.  
  7297. &lt;ul&gt;
  7298.  &lt;li&gt;&lt;a href=&quot;https://aws.amazon.com/batch/&quot;&gt;AWS Batch&lt;/a&gt;&lt;/li&gt;
  7299.  &lt;li&gt;&lt;a href=&quot;https://github.com/spotify/luigi&quot;&gt;Luigi&lt;/a&gt;&lt;/li&gt;
  7300.  &lt;li&gt;&lt;a href=&quot;https://github.com/adroll/batchiepatchie&quot;&gt;Batchiepatchie&lt;/a&gt;&lt;/li&gt;
  7301.  &lt;li&gt;&lt;a href=&quot;http://traildb.io&quot;&gt;TrailDB&lt;/a&gt;&lt;/li&gt;
  7302. &lt;/ul&gt;
  7303.  
  7304. &lt;hr /&gt;
  7305.  
  7306. &lt;p&gt;&lt;strong&gt;Is your goal in life to build the ultimate batch job pipeline? &lt;a href=&quot;https://www.adroll.com/about/careers/&quot;&gt;Roll with Us&lt;/a&gt;!&lt;/strong&gt;&lt;/p&gt;
  7307.  
  7308. </description>
  7309.    </item>
  7310.    
  7311.    
  7312.    
  7313.    <item>
  7314.      <title>If you build it, they will come.</title>
  7315.      <link>https://tech.nextroll.com/blog/dev/2018/07/25/python-meetup-2018.html</link>
  7316.      <pubDate>Wed, 25 Jul 2018 00:00:00 -0700</pubDate>
  7317.      <author></author>
  7318.      <guid isPermaLink="false">https://tech.nextroll.com/blog/dev/2018/07/25/python-meetup-2018</guid>
  7319.      <description>&lt;p&gt;When the leadership team decided to move AdRoll HQ to the spacious new location in the Mission,
  7320. one has to wonder whether they had events such as last Wednesday’s evening gathering in mind.
  7321. A crowd of 100+ Python enthusiasts filled the all-hands area to share in some delicious catered
  7322. food and drinks as well as a bit of Python knowledge to boot.
  7323. The gathering was part of the &lt;a href=&quot;https://www.meetup.com/sfpython/&quot;&gt;SF Python group’s monthly Project Night&lt;/a&gt; event,
  7324. which is held every 3rd Wednesday of the month.&lt;/p&gt;
  7325.  
  7326. &lt;p&gt;&lt;img src=&quot;/images/post_images/sfpython-patrick.jpg&quot; alt=&quot;sfpythonpatrick&quot; /&gt;&lt;/p&gt;
  7327.  
  7328. &lt;p&gt;The guests started trickling in around 6:00 pm just as the freshly baked empanadas were
  7329. being served. The food and ice cold beverages were provided by AdRoll as gracious hosts.
  7330. The proceedings were kicked off by &lt;a href=&quot;https://www.meetup.com/sfpython/members/6993348/&quot;&gt;Grace Law&lt;/a&gt;, who’s one of the original founders of the
  7331. &lt;a href=&quot;https://www.meetup.com/sfpython/&quot;&gt;SF Python meetup group&lt;/a&gt;. Grace introduced &lt;a href=&quot;https://www.linkedin.com/in/patrickmee/&quot;&gt;Patrick Mee&lt;/a&gt;,
  7332. Head of Engineering at AdRoll,
  7333. who welcomed the growing crowd that had been networking while enjoying the delightful
  7334. food and drinks. Patrick highlighted the tech stacks used by the engineering teams at
  7335. AdRoll along with the reliance on Python as the glue to our various infrastructures.
  7336. The tutorials for the evening along with the AdRoll mentors were also introduced.&lt;/p&gt;
  7337.  
  7338. &lt;p&gt;Next came the lightning round where the floor was opened up to the audience for announcements.
  7339. Whether one was seeking a new job opportunity or looking for mentorship and collaborators for
  7340.  a private project, this was an opportunity for individuals from the crowd to come up and
  7341.  introduce themselves to the gathering. The AdRoll HR team also had a presence with a booth
  7342.  filled with cool AdRoll swag to give away.&lt;/p&gt;
  7343.  
  7344. &lt;p&gt;After the announcements had concluded, the guests dispersed to various parts of the all
  7345. hands area to attend the tutorials for the evening. Some decided to just hang around to
  7346. work on their projects and get help from the mentors. In order to accommodate the large
  7347. group of people interested in attending the two AdRoll employee led tutorials, our amazing
  7348. IT team made available the training rooms where everyone could comfortably sit.&lt;/p&gt;
  7349.  
  7350. &lt;p&gt;&lt;a href=&quot;https://www.linkedin.com/in/brianandrewweiser/&quot;&gt;Brian Weiser&lt;/a&gt; led an engaging and informative tutorial on server-side rendering of React.
  7351. This tutorial was geared toward attendees with intermediate-level competency in Python.
  7352. Brian’s demo walked the audience step by step through how to convert a todo application
  7353.  written with React and a Flask backend to support server-side rendering.
  7354. &lt;img src=&quot;/images/post_images/sfpython-brian.jpg&quot; alt=&quot;sfpythonbrian&quot; /&gt;&lt;/p&gt;
  7355.  
  7356. &lt;p&gt;For those in the audience who were new to Python, &lt;a href=&quot;https://www.linkedin.com/in/aasif-shabbir-84a08643/&quot;&gt;I led&lt;/a&gt; an introductory tutorial on Python
  7357.  and GraphQL. In the tutorial, I guided the audience on how to use the AdRoll GraphQL
  7358.  Reporting API endpoints to write their very first GraphQL query and execute it.
  7359. &lt;img src=&quot;/images/post_images/sfpython-aasif.jpg&quot; alt=&quot;sfpythonaasif&quot; /&gt;&lt;/p&gt;
  7360.  
  7361. &lt;p&gt;On a personal level, I felt quite honored to be given an opportunity to host a tutorial for
  7362. the SF Python meetup event. It’s been 4 years, though it felt like it was just yesterday,
  7363. that I was in the audience attending my first Python meetup with the SF Python meetup group.
  7364. Being able to share the knowledge I’ve gained over the years with the crowd was an amazing
  7365. feeling and I’m truly grateful to AdRoll for being generous in providing a venue for
  7366. the &lt;a href=&quot;https://www.meetup.com/sfpython/events/fmtsvnyxkbxb/&quot;&gt;Project Night event&lt;/a&gt;, but more importantly, a place for people of varying skills
  7367. and knowledge to come together, share ideas and learn from each other.&lt;/p&gt;
  7368.  
  7369. &lt;p&gt;Are you interested in working with Python and joining in on all the fun? Come &lt;a href=&quot;https://www.adroll.com/about/careers/&quot;&gt;Roll with Us&lt;/a&gt;!&lt;/p&gt;
  7370.  
  7371. </description>
  7372.    </item>
  7373.    
  7374.    
  7375.    
  7376.    <item>
  7377.      <title>
  7378. TRecs: Declutter Recommendations by going Real-time
  7379. </title>
  7380.      <link>https://tech.nextroll.com/blog/adtech/2018/06/26/trecs-api.html</link>
  7381.      <pubDate>Tue, 26 Jun 2018 00:00:00 -0700</pubDate>
  7382.      <author></author>
  7383.      <guid isPermaLink="false">https://tech.nextroll.com/blog/adtech/2018/06/26/trecs-api</guid>
  7384.      <description>&lt;p&gt;As a performance-oriented growth platform, AdRoll is always looking for ways to improve our users’ ad experience. Our Dynamic ads are the highest performing ad type, but some of the underlying systems were becoming increasingly hard to scale and change. In this post, we will talk about how moving those systems to a real-time service made our product more nimble, scalable and easy to A/B test.&lt;/p&gt;
  7385.  
  7386. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;15-20 minute read&lt;/code&gt;&lt;/p&gt;
  7387.  
  7388. &lt;hr /&gt;
  7389.  
  7390. &lt;h1 id=&quot;high-level-problem&quot;&gt;High Level Problem&lt;/h1&gt;
  7391.  
  7392. &lt;p&gt;Our Dynamic Ads product is designed to show a user ads with specific, recommended products based on their browsing history. Each advertiser gives us a list of all of its products beforehand so that we can generate a “product catalog” for the advertiser. Using this product catalog and &lt;a href=&quot;http://tech.adroll.com/blog/adroll/2015/05/04/aws-summit-keynote.html&quot;&gt;AdRoll’s raw logs&lt;/a&gt;, we use algorithms to generate both per-product and per-advertiser recommendations daily via Amazon’s EMR service. Each ad is associated not with a single algorithm but with a combination of algorithms and weights called the &lt;em&gt;product stream&lt;/em&gt;.&lt;br /&gt;
  7393. So, in order to serve relevant recommendations to our users based on their browsing history, we had an offline processing job that precomputed recommendations daily for each permutation of every product stream, for all the products in every advertiser’s product catalog (and advertisers can have millions of products). These pre-generated recommendations were stored in a DynamoDB table and served via an Akamai-fronted endpoint.&lt;br /&gt;
  7394. While this process worked for a while, we ran into both &lt;strong&gt;scaling&lt;/strong&gt; and &lt;strong&gt;ad performance&lt;/strong&gt; issues:&lt;/p&gt;
  7395.  
  7396. &lt;ol&gt;
  7397.  &lt;li&gt;Only 1% of the 600 million (3TB of data) precomputed recommendations stored in the DynamoDB table were being used on a daily basis.&lt;/li&gt;
  7398.  &lt;li&gt;The JavaScript that loads at ad render time was responsible for getting recommendations from the DynamoDB table for every product that the user had interacted with. Because of the strict response time constraints, if the recommendations didn’t return in time and there weren’t enough products to animate the ad, a failsafe (static) ad was shown. Obviously, our customers who expect to see dynamic ads were not pleased that a percentage of their traffic was seeing static ads.&lt;/li&gt;
  7399.  &lt;li&gt;Our product recommendation A/B testing framework tests variations at a product stream level. Every time a product stream was created, new recommendations were generated for every product in our system for that product stream. Thus, every new experiment almost duplicated the entire data set, doubling DynamoDB writes every day, thereby increasing costs and also degrading the table’s performance.&lt;/li&gt;
  7400. &lt;/ol&gt;
  7401.  
  7402. &lt;p&gt;Given these challenges, we determined that a real-time system that would generate &lt;em&gt;only&lt;/em&gt; the needed recommendations for each displayed ad made the most sense. We had to build a &lt;strong&gt;multi-region&lt;/strong&gt;, &lt;strong&gt;lightweight&lt;/strong&gt;, &lt;strong&gt;highly available&lt;/strong&gt;, &lt;strong&gt;easily scalable service&lt;/strong&gt; with a &lt;strong&gt;fast response time&lt;/strong&gt; (under 50ms) and a &lt;strong&gt;robust fallback mechanism&lt;/strong&gt;.&lt;/p&gt;
  7403.  
  7404. &lt;p&gt;In order to accomplish this task, we needed to do the following:&lt;/p&gt;
  7405. &lt;ul&gt;
  7406.  &lt;li&gt;Centralize all the recommendation metadata.&lt;/li&gt;
  7407.  &lt;li&gt;Replicate underlying data to all the regions to support a multi-region service.&lt;/li&gt;
  7408.  &lt;li&gt;Translate the offline processing job into a real-time API.&lt;/li&gt;
  7409.  &lt;li&gt;Deploy the new service in multiple regions.&lt;/li&gt;
  7410.  &lt;li&gt;Implement an effective load balancing solution.&lt;/li&gt;
  7411.  &lt;li&gt;Come up with a rollout plan to smoothly switch the production traffic to the new system and shut down the old system.&lt;/li&gt;
  7412. &lt;/ul&gt;
  7413.  
  7414. &lt;h1 id=&quot;design&quot;&gt;Design&lt;/h1&gt;
  7415.  
  7416. &lt;p&gt;This diagram shows the real-time recommendation system in one AWS region.&lt;/p&gt;
  7417.  
  7418. &lt;p&gt;&lt;img src=&quot;/images/post_images/trecs/trecs_system.png&quot; alt=&quot;recommendation system&quot; /&gt;&lt;/p&gt;
  7419.  
  7420. &lt;h3 id=&quot;data-sources&quot;&gt;Data Sources&lt;/h3&gt;
  7421.  
  7422. &lt;p&gt;Our real-time recommendation service, a.k.a. &lt;strong&gt;TRecs&lt;/strong&gt;, has a stateless retrieval API that fetches information from the underlying data sources in real time to generate recommendations. In this section, we will go over the data sources needed for building recommendations and the modifications we made to each of them to make this data easily consumable by TRecs.&lt;/p&gt;
  7423.  
  7424. &lt;ol&gt;
  7425.  &lt;li&gt;
  7426.    &lt;p&gt;&lt;strong&gt;Recommendation metadata&lt;/strong&gt;&lt;br /&gt;
  7427. We have several EMR jobs in our system which are responsible for generating product recommendations for each of our recommendation algorithms. For legacy reasons, these EMR jobs wrote data to various data stores such as Postgres and HBase. In order for the real-time API to work, we needed to aggregate all the algorithm data with a standardized scoring method in a low-latency data store. &lt;br /&gt;
  7428. We were looking for a fast, scalable, consistent and cost-effective data store. ElastiCache and DynamoDB were our top contenders. Given that the recommendation metadata is generated by batch jobs, the writes were expected to be non-uniform. Since DynamoDB costs depend on throughput rather than storage, using DynamoDB with autoscaling was a more cost-effective option than ElastiCache. Thus we chose DynamoDB as our data store.&lt;br /&gt;
  7429. We updated each EMR job to write the data to a new DynamoDB table where the key would be &lt;em&gt;advertiser_id/algorithm/source_product_id&lt;/em&gt; and the value would be the recommended products, represented as a comma-separated string of &lt;em&gt;product_id&lt;/em&gt;’s with the first product listed being the most relevant recommendation. We leveraged DynamoDB’s &lt;em&gt;BatchWrite&lt;/em&gt; interface to make the writes as fast and efficient as possible.&lt;/p&gt;
  7430.  &lt;/li&gt;
  7431.  &lt;li&gt;
  7432.    &lt;p&gt;&lt;strong&gt;Product metadata&lt;/strong&gt;&lt;br /&gt;
  7433. Our product data pipeline already parsed our advertisers’ product catalogs and stored the product metadata in a DynamoDB table. Since the nature of writes for this data source was also non-uniform, fetching data from DynamoDB (with autoscaling enabled) was a highly performant and cost-effective option for TRecs. So, we didn’t have to tweak this data source.&lt;/p&gt;
  7434.  &lt;/li&gt;
  7435.  &lt;li&gt;
  7436.    &lt;p&gt;&lt;strong&gt;Cookie Interactions data&lt;/strong&gt;&lt;br /&gt;
  7437. Our ad servers record cookie interactions in every AWS region in a DynamoDB table. Thus, this data was readily available for use.&lt;/p&gt;
  7438.  &lt;/li&gt;
  7439.  &lt;li&gt;
  7440.    &lt;p&gt;&lt;strong&gt;Advertiser metadata&lt;/strong&gt;&lt;br /&gt;
  7441. TRecs needs advertiser metadata in real time to filter out stale and blacklisted products. This data lives in RDS (Postgres) and, as you can imagine, it doesn’t change very frequently. So, we created a task that runs every 15 minutes and dumps the relevant data from Postgres into a file in S3. This file is then loaded into the TRecs server memory once every 15 minutes. In order to fetch this file with minimum latency from all TRecs regions, we configured Akamai to front the S3 bucket.&lt;/p&gt;
  7442.  &lt;/li&gt;
  7443. &lt;/ol&gt;
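
&lt;p&gt;As a minimal sketch of the recommendation metadata layout described above, the key and value could be built and parsed as follows. The helper names and sample identifiers are hypothetical; only the &lt;em&gt;advertiser_id/algorithm/source_product_id&lt;/em&gt; key layout and the comma-separated, most-relevant-first value come from this post. It also assumes that identifiers never contain a slash or a comma.&lt;/p&gt;

```python
# Hypothetical helpers: only the key layout and the comma-separated value
# format come from the post; names and sample ids are illustrative.

def make_key(advertiser_id, algorithm, source_product_id):
    """Build the advertiser_id/algorithm/source_product_id key."""
    return "/".join([advertiser_id, algorithm, source_product_id])


def encode_recommendations(product_ids):
    """Serialize product ids, most relevant first, as a comma-separated string."""
    return ",".join(product_ids)


def decode_recommendations(value):
    """Return the product ids in relevance order; an empty string yields []."""
    return [product_id for product_id in value.split(",") if product_id]


key = make_key("adv42", "co_viewed", "sku-1")
value = encode_recommendations(["sku-9", "sku-3", "sku-7"])
```

&lt;p&gt;Keeping the value as a single ordered string means the relevance ranking travels with the item and no extra sort key is needed at read time.&lt;/p&gt;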
  7444.  
  7445. &lt;h3 id=&quot;data-replication&quot;&gt;Data Replication&lt;/h3&gt;
  7446.  
  7447. &lt;p&gt;Before the TRecs service came into existence, most of our dynamic ad infrastructure (pipelines and data sources) was located only in the us-west AWS region and relied on Akamai’s CDN to serve product recommendations to our ads globally. To achieve fast response times with a multi-region real-time recommendation service, the underlying data had to be available in every AWS region where TRecs was going to be hosted. Thus, we had to replicate &lt;em&gt;recommendation metadata&lt;/em&gt; and &lt;em&gt;product metadata&lt;/em&gt; to three other AWS regions. Since DynamoDB was the underlying data store, we had to solve the DynamoDB cross-region replication problem.&lt;/p&gt;
  7448.  
  7449. &lt;p&gt;Replicating DynamoDB data is a two-part process:&lt;/p&gt;
  7450. &lt;ol&gt;
  7451.  &lt;li&gt;
  7452.    &lt;p&gt;&lt;strong&gt;One-time table copy&lt;/strong&gt;&lt;br /&gt;
  7453. We used &lt;a href=&quot;https://aws.amazon.com/about-aws/whats-new/2013/09/12/announcing-dynamodb-cross-region-copy-feature-in-aws-data-pipeline/&quot;&gt;AWS Data Pipeline’s DynamoDB Cross-Region Copy&lt;/a&gt; feature for the one-time table copy. It uses S3 as intermediate storage while copying the data. Launching the EMR job in the same region as the destination table and increasing the provisioned read and write capacities on both the source and destination tables helped us complete the one-time copy faster.&lt;/p&gt;
  7454.  &lt;/li&gt;
  7455.  &lt;li&gt;
  7456.    &lt;p&gt;&lt;strong&gt;Setting up real-time updates&lt;/strong&gt;&lt;br /&gt;
  7457. Setting up real-time updates was non-trivial. The common approach is to apply live DynamoDB stream records from the source table to the destination table. We had two options for consuming DynamoDB streams: an in-house solution or the &lt;a href=&quot;https://github.com/awslabs/dynamodb-cross-region-library&quot;&gt;AWSLabs DynamoDB cross-region library&lt;/a&gt;. The in-house solution needed a few tweaks to be usable, but the AWS solution almost worked out of the box for us, so we decided to go with that.
  7458. While this system worked well with small test traffic, we saw a lot of write throttles on the destination tables when we rolled the &lt;a href=&quot;https://github.com/awslabs/dynamodb-cross-region-library&quot;&gt;AWSLabs DynamoDB cross-region library&lt;/a&gt; out to production, so, ironically, it failed to replicate updates in real time. It turns out that DynamoDB streams don’t copy data at a uniform rate. Since both the &lt;em&gt;product metadata&lt;/em&gt; and &lt;em&gt;recommendation metadata&lt;/em&gt; source tables have bursty traffic and rely on autoscaling, the destination tables weren’t able to scale at the same rate as the source tables, causing records to be dropped.
  7459. Alas, we had to scrap the &lt;a href=&quot;https://github.com/awslabs/dynamodb-cross-region-library&quot;&gt;AWSLabs DynamoDB cross-region library&lt;/a&gt; solution and instead modified our product pipelines and EMR jobs to write to the DynamoDB tables in all regions. Since our offline systems exist only in the us-west region, we had to scale them up to absorb the additional network latency incurred while writing to DynamoDB tables in three other AWS regions.
  7460. AWS recently announced &lt;a href=&quot;https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GlobalTables.html&quot;&gt;global tables&lt;/a&gt; to solve the DynamoDB data replication problem. We are waiting for global tables to become available in the Asia Pacific regions before trying them.&lt;/p&gt;
  7461.  &lt;/li&gt;
  7462. &lt;/ol&gt;
  7463.  
  7464. &lt;h3 id=&quot;real-time-recommendation-service-trecs&quot;&gt;Real-time recommendation service (TRecs)&lt;/h3&gt;
  7465.  
  7466. &lt;p&gt;Our real-time recommendation service is built using &lt;a href=&quot;http://www.dropwizard.io/1.2.2/docs/&quot;&gt;Dropwizard&lt;/a&gt;, a Java framework for RESTful web services. It runs within a Docker container deployed on a standalone AWS ECS cluster. We use Terraform templates to launch the underlying AWS infrastructure, and we deploy with Jenkins 2. We use Datadog for monitoring and Logentries to collect logs. TRecs is integrated with JMX to collect JVM and Jetty stats for monitoring and optimizing our environments.&lt;/p&gt;
  7467.  
  7468. &lt;h4 id=&quot;the-trecs-api&quot;&gt;The TRecs API&lt;/h4&gt;
  7469. &lt;p&gt;It’s an HTTP GET API that every dynamic ad calls from JavaScript at ad render time to fetch recommendations for a given user (cookie). The API does the following in real time to generate recommendations:&lt;/p&gt;
  7470. &lt;ol&gt;
  7471.  &lt;li&gt;Fetches cookie data&lt;br /&gt;
  7472. The API gets the list of products the cookie has interacted with from the &lt;em&gt;cookie metadata&lt;/em&gt; table.&lt;/li&gt;
  7473.  &lt;li&gt;Fetches recommendation metadata&lt;br /&gt;
  7474. As mentioned earlier, every dynamic ad is associated with a product stream. Each product stream has multiple recommendation sources associated with it and each recommendation source has a weight. The weight is used to determine how we should weigh various sources with respect to each other. Both per-product recommendations and per-advertiser recommendations are fetched from &lt;em&gt;recommendation metadata&lt;/em&gt; and a confidence (or score) is assigned to each recommendation.&lt;/li&gt;
  7475.  &lt;li&gt;Processes recommendation metadata&lt;br /&gt;
  7476. It’s very likely that the same product gets recommended by multiple sources, so we need to “blend” those recommendations into a single normalized recommendation. Recommendations are then sorted by confidence and deduplicated.&lt;/li&gt;
  7477.  &lt;li&gt;Fetches product metadata  &lt;br /&gt;
  7478. For each of the recommended products, &lt;em&gt;product metadata&lt;/em&gt; is fetched from the DynamoDB table.&lt;/li&gt;
  7479.  &lt;li&gt;Filters products &lt;br /&gt;
  7480. We run a final check to filter out stale and blacklisted products before returning the list of recommendations to the dynamic ad.&lt;/li&gt;
  7481. &lt;/ol&gt;
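
&lt;p&gt;The blending step above can be sketched roughly as follows. This post does not give the exact formula, so the sketch assumes that a product’s blended confidence is the weighted sum of its per-source scores; the function name, source names, weights, and scores are all hypothetical.&lt;/p&gt;

```python
from collections import defaultdict


def blend(recommendations_by_source, source_weights):
    """Blend per-source recommendations into one ranked, duplicate-free list.

    recommendations_by_source: {source_name: [(product_id, score), ...]}
    source_weights: {source_name: weight}
    Assumption (not from the post): the blended confidence of a product is
    the weighted sum of the scores it received from each source.
    """
    blended = defaultdict(float)
    for source, recs in recommendations_by_source.items():
        weight = source_weights.get(source, 0.0)
        for product_id, score in recs:
            # Keying by product_id merges duplicates across sources.
            blended[product_id] += weight * score
    # Highest confidence first; ties are broken by product_id for stable output.
    ranked = sorted(blended.items(), key=lambda item: (-item[1], item[0]))
    return [product_id for product_id, _ in ranked]


ranked = blend(
    {
        "per_product": [("sku-1", 0.9), ("sku-2", 0.4)],
        "per_advertiser": [("sku-2", 0.8), ("sku-3", 0.5)],
    },
    {"per_product": 0.5, "per_advertiser": 1.0},
)
```

&lt;p&gt;In this example sku-2 is recommended by both sources, so its two weighted scores add up and it ranks first.&lt;/p&gt;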
  7482.  
  7483. &lt;p&gt;To reduce network latency, the API uses DynamoDB’s &lt;a href=&quot;https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_BatchGetItem.html&quot;&gt;BatchGetItem&lt;/a&gt; at every step. Since each DynamoDB item is only a few bytes, a single batch call can fetch up to 100 items. This vastly improved our API’s response time.&lt;/p&gt;
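
&lt;p&gt;A minimal sketch of the batching idea, assuming Python: keys are split into groups of at most 100, the BatchGetItem limit, and each group is fetched with one call. Here &lt;em&gt;batch_get&lt;/em&gt; is a hypothetical callable standing in for the real DynamoDB request, not part of any SDK.&lt;/p&gt;

```python
MAX_BATCH_GET_ITEMS = 100  # BatchGetItem accepts at most 100 keys per request


def chunk_keys(keys, size=MAX_BATCH_GET_ITEMS):
    """Split keys into consecutive batches of at most `size` items."""
    return [keys[start:start + size] for start in range(0, len(keys), size)]


def fetch_all(keys, batch_get):
    """Fetch items for all keys, issuing one `batch_get` call per batch.

    `batch_get` is a hypothetical stand-in for a BatchGetItem request: it
    takes a list of keys and returns a dict of the items it found. A real
    client would also need to retry any keys reported as unprocessed.
    """
    items = {}
    for batch in chunk_keys(keys):
        items.update(batch_get(batch))
    return items
```

&lt;p&gt;With 250 keys, this issues three calls (100, 100, and 50 keys) instead of 250 single-item reads.&lt;/p&gt;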
  7484.  
  7485. &lt;h4 id=&quot;load-balancer&quot;&gt;Load balancer&lt;/h4&gt;
  7486. &lt;p&gt;The TRecs API is fronted by an endpoint in the ad server. Since cookie values are only accessible from the domain they are set on, we had to route the dynamic ad traffic to TRecs via the ad server to get the AdRoll cookie. &lt;br /&gt;
  7487. To scale the service horizontally, we were looking for a fast, reliable, and cost-effective load balancer for TRecs. The choice was mostly between the AWS ALB service and an in-house Erlang load-balancing library. While AWS ALB worked well, we realized that AWS charges not only for keeping the ALB running but also per unit of traffic it serves. Since the recommendation service handles high throughput, AWS ALB would have been a very pricey option.&lt;br /&gt;
  7488. Given that we were already forwarding the traffic to TRecs via the Erlang web server, it was fairly easy to integrate the in-house load-balancing library into the same endpoint.&lt;/p&gt;
  7489.  
  7490. &lt;h4 id=&quot;fallback-mechanism&quot;&gt;Fallback mechanism&lt;/h4&gt;
  7491. &lt;p&gt;Coming from a system where all the recommendations were precomputed, dynamic ads never experienced missing recommendations due to system outages. To ensure that dynamic ad performance remains unaffected during request failures and system outages, it was very important to have a robust fallback mechanism with reasonable response times. We deployed a “slim” version of the precomputed recommendations offline job that generates only per-advertiser recommendations and stores them in S3. We fetch these recommendations through an Akamai-fronted S3 endpoint, which saves us from replicating them across AWS regions while still allowing fast response times.&lt;/p&gt;
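
&lt;p&gt;The fallback idea can be illustrated with a small sketch, assuming Python for brevity. The function and parameter names are hypothetical: &lt;em&gt;fetch_primary&lt;/em&gt; stands in for the TRecs request and &lt;em&gt;fetch_fallback&lt;/em&gt; for the read of the precomputed per-advertiser recommendations.&lt;/p&gt;

```python
def recommendations_with_fallback(fetch_primary, fetch_fallback):
    """Return the primary recommendations, or the fallback list if the primary call fails.

    Both arguments are hypothetical zero-argument callables: `fetch_primary`
    stands in for the real-time service call and `fetch_fallback` for the
    precomputed per-advertiser recommendations.
    """
    try:
        return fetch_primary()
    except Exception:
        # Any request failure or outage falls back to the precomputed list.
        return fetch_fallback()
```

&lt;p&gt;A production version would usually also put a timeout on the primary call, so that a slow response triggers the fallback as well.&lt;/p&gt;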
  7492.  
  7493. &lt;h1 id=&quot;rollout-plan&quot;&gt;Rollout plan&lt;/h1&gt;
  7494.  
  7495. &lt;p&gt;We did the following to ensure a smooth switch to the new recommendation system:&lt;/p&gt;
  7496. &lt;ul&gt;
  7497.  &lt;li&gt;Performed multiple load tests to make sure the TRecs service meets its throughput and response time SLAs.&lt;/li&gt;
  7498.  &lt;li&gt;Turned on data replication to replicate underlying data sources to three other AWS regions&lt;/li&gt;
  7499.  &lt;li&gt;Deployed tasks to dump the needed data in S3&lt;/li&gt;
  7500.  &lt;li&gt;Used our recommendation A/B testing framework to roll out traffic to the new recommendation system incrementally.&lt;/li&gt;
  7501.  &lt;li&gt;While scaling up the traffic to TRecs, used Presto to query our &lt;a href=&quot;http://tech.adroll.com/blog/adroll/2015/05/04/aws-summit-keynote.html&quot;&gt;raw logs&lt;/a&gt; to validate the ad recommendation performance.&lt;/li&gt;
  7502.  &lt;li&gt;Refactored the JavaScript code that runs at ad render time to use TRecs as the primary recommendation service&lt;/li&gt;
  7503.  &lt;li&gt;Migrated other systems at AdRoll to use TRecs&lt;/li&gt;
  7504.  &lt;li&gt;Sunset the old recommendation system&lt;/li&gt;
  7505. &lt;/ul&gt;
  7506.  
  7507. &lt;h1 id=&quot;performance&quot;&gt;Performance&lt;/h1&gt;
  7508. &lt;p&gt;We are very excited to share the improvements our real-time system has delivered.&lt;/p&gt;
  7509.  
  7510. &lt;p&gt;&lt;strong&gt;Fast response time&lt;/strong&gt;&lt;br /&gt;
  7511. The TRecs API has a median response time of ~30ms and can support over 1 million requests per host per hour.
  7512. &lt;img src=&quot;/images/post_images/trecs/trecs_stats.png&quot; alt=&quot;API response time&quot; /&gt;&lt;/p&gt;
  7513.  
  7514. &lt;p&gt;&lt;strong&gt;Easy to scale&lt;/strong&gt;&lt;br /&gt;
  7515. It is deployed as an ECS service, which is easy to scale horizontally.&lt;/p&gt;
  7516.  
  7517. &lt;p&gt;&lt;strong&gt;Better product recommendations&lt;/strong&gt;&lt;br /&gt;
  7518. The recommendations shown in a dynamic ad are a combination of recommendations based on the user’s product interactions and the top products for that advertiser. Since TRecs filters stale products in real time, our dynamic ads saw a 10% increase in user-interacted recommendations. Showing more relevant recommendations leads to better ad performance.&lt;/p&gt;
  7519.  
  7520. &lt;p&gt;&lt;strong&gt;No additional costs to A/B test new recommendation algorithms&lt;/strong&gt;&lt;br /&gt;
  7521. We don’t have to precompute recommendations to test new algorithms. This has unlocked the opportunity to test and roll out new algorithms that will help improve dynamic ad recommendations.&lt;/p&gt;
  7522.  
  7523. &lt;p&gt;&lt;strong&gt;Fewer network calls to fetch recommendations&lt;/strong&gt;&lt;br /&gt;
  7524. The dynamic ad makes only one network call to fetch all the recommendations, so ads respond faster.&lt;/p&gt;
  7525.  
  7526. &lt;p&gt;&lt;strong&gt;Easy to troubleshoot&lt;/strong&gt;&lt;br /&gt;
  7527. Previously, the recommendation functionality was split between the offline recommendation job and the JavaScript ad render component, making it difficult to troubleshoot the recommendations served for a given advertiser and cookie combination. With TRecs, debugging such issues has become a lot easier, as all the functionality is wrapped in a single API.&lt;/p&gt;
  7528.  
  7529. &lt;p&gt;It has been fun working on this project. We hope this post is helpful for people who are looking to transform their offline systems into real-time services using AWS. This project is a good illustration of the type of work we do on the Dynamic Ads team. If this piques your interest, &lt;a href=&quot;https://www.adroll.com/about/careers/&quot;&gt;Roll with Us&lt;/a&gt;!&lt;/p&gt;
  7530.  
  7531. </description>
  7532.    </item>
  7533.    
  7534.    
  7535.    
  7536.    <item>
  7537.      <title>
  7538. Data Driven Women… and Allies!
  7539. </title>
  7540.      <link>https://tech.nextroll.com/blog/culture/2018/05/17/ddw.html</link>
  7541.      <pubDate>Thu, 17 May 2018 00:00:00 -0700</pubDate>
  7542.      <author></author>
  7543.      <guid isPermaLink="false">https://tech.nextroll.com/blog/culture/2018/05/17/ddw</guid>
  7544.      <description>&lt;p&gt;Recently I attended the &lt;a href=&quot;https://www.meetup.com/Data-Driven-Women/events/249711184/&quot;&gt;Data Driven Women Meetup event&lt;/a&gt; hosted by AdRoll Group, and felt inspired to share this blog post,
  7545. from a man’s perspective. I heard of the event through my work (AdRoll Group) and after reading up on the event details,
  7546. I realized four of my women colleagues were the featured speakers. It was at that time I remember thinking,
  7547.  “I should really go to support them.”&lt;/p&gt;
  7548.  
  7549. &lt;p&gt;Admittedly, I wasn’t even sure if men were allowed to attend, thinking it might just be a women-only thing… wrong! I
  7550. reached out to one of my colleagues just to make sure and she quickly debunked my ill-conceived thought process by
  7551. letting me know that men were not only invited, but encouraged to attend.&lt;/p&gt;
  7552.  
  7553. &lt;p&gt;The event started out like most meetups with music, drinks, snacks, and networking. There was a very inviting feel to
  7554. the room, with couches and tables set up to create a comfortable environment where people could sit, talk and listen
  7555. to the featured speakers. The space filled up quickly and I noticed that I was not the only man in the room; there were
  7556. also quite a few of our male coworkers in attendance. During the
  7557. networking portion of the evening, there were working
  7558. sessions where attendees could get tips on crafting their resumes, interviewing, and other career advice from a wide
  7559.  range of women (and men) in varying levels of seniority from different companies. These sessions were sponsored by
  7560.  &lt;a href=&quot;http://albertslist.org/&quot;&gt;Alberts List&lt;/a&gt;, &lt;a href=&quot;http://cybersn.com&quot;&gt;cybersn&lt;/a&gt; and &lt;a href=&quot;https://hirepool.io/&quot;&gt;hirepool&lt;/a&gt;.&lt;/p&gt;
  7561.  
  7562. &lt;p&gt;The featured speakers for the evening included a few of the female leaders on the Product and Engineering teams here
  7563. at AdRoll Group:&lt;/p&gt;
  7564.  
  7565. &lt;h3 id=&quot;kelly-eng-senior-product-manager&quot;&gt;Kelly Eng, Senior Product Manager&lt;/h3&gt;
  7566. &lt;p&gt;&lt;img alt=&quot;Kelly&quot; src=&quot;/images/post_images/post_ddw/Kelly_Eng.png&quot; style=&quot;max-width: 300px&quot; /&gt;&lt;/p&gt;
  7567.  
  7568. &lt;h3 id=&quot;julie-zhou-director-of-product&quot;&gt;Julie Zhou, Director of Product&lt;/h3&gt;
  7569. &lt;p&gt;&lt;img alt=&quot;Julie&quot; src=&quot;/images/post_images/post_ddw/Julie.png&quot; style=&quot;max-width: 300px&quot; /&gt;&lt;/p&gt;
  7570.  
  7571. &lt;h3 id=&quot;jessica-grist-senior-software-engineer&quot;&gt;Jessica Grist, Senior Software Engineer&lt;/h3&gt;
  7572. &lt;p&gt;&lt;img alt=&quot;Jessica&quot; src=&quot;/images/post_images/post_ddw/Jessica.jpg&quot; style=&quot;max-width: 300px&quot; /&gt;&lt;/p&gt;
  7573.  
  7574. &lt;h3 id=&quot;miriam-pena-staff-engineer&quot;&gt;Miriam Pena, Staff Engineer&lt;/h3&gt;
  7575. &lt;p&gt;&lt;img alt=&quot;Miriam&quot; src=&quot;/images/post_images/post_ddw/MiriamPena.jpg&quot; style=&quot;max-width: 300px&quot; /&gt;&lt;/p&gt;
  7576.  
  7577. &lt;p&gt;Throughout the evening each speaker shared her own unique story and experience of how she broke into her career and current role, amidst what is still today a male-dominated industry. As each of my coworkers told their story, it helped me to understand what motivated them to pursue their own career in tech, the approaches and paths they each took to navigate the course, as well as the personal and political challenges they faced along the way. Although each experience and path was different, I noticed a few themes emerge.&lt;/p&gt;
  7578.  
  7579. &lt;p&gt;&lt;strong&gt;1.) Overcoming fear&lt;/strong&gt; - Whether it was fear of changing careers later in life, stepping into a lead role for the first time, applying for a position with non-traditional experience, or even public speaking, it was inspiring to hear each speaker talk openly about her own fears, which as a society we often consider to be a vulnerability or weakness. It’s only after we expose our fears for what they are that we can start to make strides towards overcoming them and become stronger and more confident.&lt;/p&gt;
  7580.  
  7581. &lt;p&gt;&lt;strong&gt;2.) The importance of aligning yourself with the right mentors and/or allies&lt;/strong&gt; - This could come in the shape of an all-female coding bootcamp, a strength training coach, a male coworker willing to push you, or retracing the history of early programming languages to find that strong women were a big part of the early history of tech.&lt;/p&gt;
  7582.  
  7583. &lt;p&gt;&lt;strong&gt;3.) The importance of using your voice to help other women&lt;/strong&gt; - Not only women currently or soon to be in the job market, but also the young women of the world who are dreaming about what they want to be when they grow up. We all know the power and encouragement that comes with being able to identify with someone who is accomplished in a particular field of interest. It’s infectious! The point is, we need more female leaders in tech to help show future generations of ambitious women that tech is a career they, too, can pursue.&lt;/p&gt;
  7584.  
  7585. &lt;p&gt;I’ve always had deep-rooted respect and support for women, but after attending this event I have gained a newfound appreciation not only for my coworkers, but also for the need for further support from male allies and industry leaders to make sure the tech industry is a place where people from all backgrounds feel welcome.&lt;/p&gt;
  7586.  
  7587. &lt;p&gt;You can watch the recording here:&lt;/p&gt;
  7588. &lt;figure&gt;&lt;div class=&quot;youtube-wrapper&quot;&gt;&lt;iframe class=&quot;youtube-embed&quot; width=&quot;574&quot; height=&quot;323&quot; src=&quot;http://www.youtube.com/embed/wQsNWxoS92s&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;&lt;/div&gt;&lt;/figure&gt;
  7589.  
  7590. &lt;p&gt;&lt;strong&gt;Are you interested in working with a team of enthusiastic developers in a remote-friendly globally distributed team? &lt;a href=&quot;https://www.adroll.com/about/careers/&quot;&gt;Roll with Us&lt;/a&gt;!&lt;/strong&gt;&lt;/p&gt;
  7591.  
  7592. </description>
  7593.    </item>
  7594.    
  7595.    
  7596.    
  7597.    <item>
  7598.      <title>
  7599. BuzzConf BA 2018: A Conference for Developers by Developers
  7600. </title>
  7601.      <link>https://tech.nextroll.com/blog/dev/2018/05/08/buzzconf-2018.html</link>
  7602.      <pubDate>Tue, 08 May 2018 00:00:00 -0700</pubDate>
  7603.      <author></author>
  7604.      <guid isPermaLink="false">https://tech.nextroll.com/blog/dev/2018/05/08/buzzconf-2018</guid>
  7605.      <description>&lt;p&gt;Highlights from the BuzzConf BA 2018 conference. April 26th, 2018&lt;/p&gt;
  7606.  
  7607. &lt;hr /&gt;
  7608.  
  7609. &lt;h1 id=&quot;buzzwords-as-the-motivation-for-a-conference&quot;&gt;Buzzwords as the Motivation for a Conference&lt;/h1&gt;
  7610. &lt;p&gt;As Google promptly helps us define, a &lt;em&gt;buzzword&lt;/em&gt; is…&lt;/p&gt;
  7611.  
  7612. &lt;blockquote&gt;
  7613.  &lt;p&gt;a word or phrase, often an item of jargon, that is fashionable at a particular time or in a particular context.&lt;/p&gt;
  7614. &lt;/blockquote&gt;
  7615.  
  7616. &lt;p&gt;In the world of software, these words come and go all the time. While they’re &lt;em&gt;in the spotlight&lt;/em&gt;, everybody talks about them.
  7617. Yet with everyone talking about them, the amount of real information being spread is minimal. You get lost in so many articles, videos, tweets, etc. that it’s hard for anybody to really understand the subject and learn about it.&lt;/p&gt;
  7618.  
  7619. &lt;p&gt;That’s where &lt;a href=&quot;http://buzzconf.org/&quot;&gt;BuzzConf&lt;/a&gt; comes in. The organizers (including &lt;a href=&quot;https://twitter.com/iraunkortasuna&quot;&gt;Iñaki Garay&lt;/a&gt; and &lt;a href=&quot;http://twitter.com/folano_&quot;&gt;Facundo Olano&lt;/a&gt; from our Supply Engineer Team) created a conference that, in one day, covered most of the buzzword-worthy topics that are around today. We had talks about Space, Cryptocurrencies, Diversity &amp;amp; Inclusion, Functional Programming, Machine Learning, Video Games, AI, Distributed Systems, Rust, Elixir and others. Each talk was delivered by someone who is actually an expert in the field, and they were all anything but superficial.&lt;/p&gt;
  7620.  
  7621. &lt;h2 id=&quot;the-minimalistic-conference&quot;&gt;The Minimalistic Conference&lt;/h2&gt;
  7622. &lt;p&gt;BuzzConf was a one-track conference, with 12 talks in quick succession. There were no fancy introductions, no conference t-shirts, barely one or two banners with sponsor logos, no contests, no social media promotion, etc. The organizers specifically avoided all the superfluous stuff that abounds in other venues to focus on the most important parts:&lt;/p&gt;
  7623.  
  7624. &lt;ul&gt;
  7625.  &lt;li&gt;Great speakers, &lt;em&gt;checked!&lt;/em&gt;&lt;/li&gt;
  7626.  &lt;li&gt;High quality content, &lt;em&gt;checked!&lt;/em&gt;&lt;/li&gt;
  7627.  &lt;li&gt;Space and time for socializing and networking, &lt;em&gt;checked!&lt;/em&gt;&lt;/li&gt;
  7628.  &lt;li&gt;Cheap tickets so that anybody can join, &lt;em&gt;checked!&lt;/em&gt;&lt;/li&gt;
  7629.  &lt;li&gt;As much learning as you can fit in a day, &lt;em&gt;checked checked checked!&lt;/em&gt;&lt;/li&gt;
  7630. &lt;/ul&gt;
  7631.  
  7632. &lt;p&gt;For a conference with BuzzWords as its main subject, I think they did an amazing job trimming away all the noise and providing as much signal as they could.&lt;/p&gt;
  7633.  
  7634. &lt;h2 id=&quot;highlights&quot;&gt;Highlights&lt;/h2&gt;
  7635. &lt;p&gt;Of the 12 talks, I’ll briefly describe my favorites below. I encourage you to check out the rest of them in &lt;a href=&quot;https://github.com/lambdaclass/buzzconf/tree/master/slides&quot;&gt;BuzzConf’s GitHub Repo&lt;/a&gt;.&lt;/p&gt;
  7636.  
  7637. &lt;h3 id=&quot;cryptocurrencies--blockchain&quot;&gt;Cryptocurrencies / BlockChain&lt;/h3&gt;
  7638. &lt;p&gt;We had two talks on the subject of cryptocurrencies, blockchain and decentralized apps.
  7639. I particularly enjoyed the talk by &lt;a href=&quot;http://twitter.com/smpalladino&quot;&gt;Santiago Palladino&lt;/a&gt; (from &lt;a href=&quot;http://zpl.in&quot;&gt;Zeppelin&lt;/a&gt;) in which he presented many applications of blockchain mechanisms (with examples from Ethereum) in a very objective way, providing pros and cons and paths to move forward. He managed to pack a lot of knowledge in a relatively short time and left us all wondering what else is out there for us to get our hands on.
  7640. &lt;a href=&quot;https://github.com/lambdaclass/buzzconf/blob/master/slides/02.%20Santiago%20Palladino/BuzzConf%202018%20-%20Crypto%20beyond%20currencies%20%5BShared%5D.pdf&quot;&gt;&lt;img src=&quot;/images/post_images/buzzconf-thumbnail-santi.png&quot; alt=&quot;santi-thumbnail&quot; /&gt;&lt;/a&gt;&lt;/p&gt;
  7641.  
  7642. &lt;h3 id=&quot;diversity--inclusion&quot;&gt;Diversity &amp;amp; Inclusion&lt;/h3&gt;
  7643. &lt;p&gt;&lt;a href=&quot;www.dianamaffia.com.ar&quot;&gt;Diana Maffía&lt;/a&gt; gave a very enthusiastic talk about the role of women in science both in Argentina and the world, highlighting the work that &lt;a href=&quot;http://www.ragcyt.org.ar/&quot;&gt;ragcyt&lt;/a&gt; (Argentinian Network about Gender, Science and Technology) has been doing over the last decades (they started in 1999). One of their goals is to &lt;em&gt;objectively&lt;/em&gt; understand the current situation and challenges that women face when attempting to pursue a career in science (particularly in Argentina) and provide advice and guidance for everyone trying to build a more inclusive world.
  7644. &lt;a href=&quot;https://github.com/lambdaclass/buzzconf/blob/master/slides/04.%20Diana%20Maff%C3%ADa/buzz%20conf-%20abril%202018%20%5BModo%20de%20compatibilidad%5D.pdf&quot;&gt;&lt;img src=&quot;/images/post_images/buzzconf-thumbnail-diana.png&quot; alt=&quot;diana-thumbnail&quot; /&gt;&lt;/a&gt;&lt;/p&gt;
  7645.  
  7646. &lt;h3 id=&quot;video-games&quot;&gt;Video Games&lt;/h3&gt;
  7647. &lt;p&gt;Even at a conference focused on new and emerging topics, we still had time to talk about the usual suspects. We had a talk about the Linux kernel and also an unexpectedly great talk about video games. In this talk, &lt;a href=&quot;https://twitter.com/reduzio&quot;&gt;Juan Linietsky&lt;/a&gt; presented his framework: &lt;a href=&quot;https://godotengine.org/&quot;&gt;Godot Engine&lt;/a&gt;. &lt;em&gt;Godot&lt;/em&gt; is a very advanced video game framework that allows you to develop 2D and 3D games with a lot of flexibility. It’s focused on teamwork and, most importantly: &lt;strong&gt;It’s 100% open-source!&lt;/strong&gt;
  7648. &lt;a href=&quot;https://github.com/lambdaclass/buzzconf/blob/master/slides/08.%20Juan%20Linietsky/Godot%20Engine%20-%20Introducci%C3%B3n%20Informativa.pdf&quot;&gt;&lt;img src=&quot;/images/post_images/buzzconf-thumbnail-juan.png&quot; alt=&quot;juan-thumbnail&quot; /&gt;&lt;/a&gt;&lt;/p&gt;
  7649.  
  7650. &lt;h3 id=&quot;property-based-testing&quot;&gt;Property-Based Testing&lt;/h3&gt;
  7651. &lt;p&gt;With examples in &lt;a href=&quot;https://elixir-lang.org/&quot;&gt;Elixir&lt;/a&gt;, but a presentation that spoke to everyone regardless of their favorite language, &lt;a href=&quot;https://twitter.com/whatyouhide&quot;&gt;Andrea Leopardi&lt;/a&gt; introduced the audience to &lt;a href=&quot;https://elixir-lang.org/blog/2017/10/31/stream-data-property-based-testing-and-data-generation-for-elixir/&quot;&gt;Property-Based Testing&lt;/a&gt; and promoted its use. This is not a new concept for some of us, but the amount of work in this area in the last few years is astonishing. In his talk, Andrea presented the basics of this technique and also showed many of the latest developments and future plans for Elixir in particular and property-based testing in general. It was &lt;em&gt;inspiring&lt;/em&gt;.
  7652. &lt;script async=&quot;&quot; class=&quot;speakerdeck-embed&quot; data-id=&quot;2be1ee1eb1c449da997ac515622bcbde&quot; data-ratio=&quot;1.77777777777778&quot; src=&quot;//speakerdeck.com/assets/embed.js&quot;&gt;&lt;/script&gt;&lt;/p&gt;
  7653.  
  7654. &lt;h3 id=&quot;distributed-systems&quot;&gt;Distributed Systems&lt;/h3&gt;
  7655. &lt;p&gt;After the last break, everyone was ready to enjoy the closing keynotes. And we were not disappointed! In the first one, &lt;a href=&quot;http://christophermeiklejohn.com&quot;&gt;Christopher Meiklejohn&lt;/a&gt; gave us a crash course in distributed systems and showed some of the most important challenges in this area today. Then he presented some proposed solutions to those challenges, involving tools like &lt;a href=&quot;https://github.com/lasp-lang/partisan&quot;&gt;Partisan&lt;/a&gt; and his work on CRDTs. It was a very interesting talk that left us with a bright vision of the future of distributed systems, particularly those written in &lt;a href=&quot;http://www.erlang.org&quot;&gt;Erlang&lt;/a&gt;.
  7656. &lt;a href=&quot;https://github.com/lambdaclass/buzzconf/blob/master/slides/11.%20Christopher%20Meiklejohn/Keynote%20-%20Buzzconf.pptx?raw=true&quot;&gt;&lt;img src=&quot;/images/post_images/buzzconf-thumbnail-chris.png&quot; alt=&quot;chris-thumbnail&quot; /&gt;&lt;/a&gt;&lt;/p&gt;
  7657.  
  7658. &lt;h3 id=&quot;rust&quot;&gt;Rust&lt;/h3&gt;
  7659. &lt;p&gt;Finally, the last talk of the day was delivered by the always amazing &lt;a href=&quot;steveklabnik.com&quot;&gt;Steve Klabnik&lt;/a&gt;. Of course, his talk was about &lt;a href=&quot;https://www.rust-lang.org/&quot;&gt;Rust&lt;/a&gt;, but it was certainly not your usual introductory &lt;em&gt;hello-world&lt;/em&gt; talk. While showcasing the main features of the language, he also showed some advanced concepts that left many of us looking for excuses to use this fantastic language in some of our upcoming projects.
  7660. &lt;a href=&quot;https://github.com/lambdaclass/buzzconf/tree/master/slides/12.%20Steve%20Klabnik&quot;&gt;&lt;img src=&quot;/images/post_images/buzzconf-thumbnail-steve.png&quot; alt=&quot;steve-thumbnail&quot; /&gt;&lt;/a&gt;&lt;/p&gt;
  7661.  
  7662. &lt;h2 id=&quot;last-thoughts&quot;&gt;Last thoughts&lt;/h2&gt;
  7663. &lt;p&gt;For people in remote parts of the southern hemisphere, away from the places where most of the greatest conferences are organized every year, being able to attend a conference with high-quality speakers giving great talks on relevant subjects is something unheard of.&lt;/p&gt;
  7664.  
  7665. &lt;p&gt;Some of us have the good fortune of working remotely for a company like AdRoll that allows us to travel up north and enjoy conferences like &lt;a href=&quot;/blog/dev/2018/04/12/codebeam-sf-2018.html&quot;&gt;CodeBEAM&lt;/a&gt;. Yet everybody here would love to have more conferences and events in our home towns. That’s why we decided to take matters into our own hands and organize our own events, either &lt;a href=&quot;http://www.erlang-factory.com/eflba2017&quot;&gt;in person&lt;/a&gt; or &lt;a href=&quot;http://spawnfest.github.io&quot;&gt;online&lt;/a&gt;. This movement is growing and the community is growing with it. With accessible conferences like BuzzConf that give you a lot of great content for a more than fair price, everybody wins!&lt;/p&gt;
  7666.  
  7667. &lt;p&gt;&lt;strong&gt;Are you interested in working with a team of enthusiastic developers in a remote-friendly globally distributed team? &lt;a href=&quot;https://www.adroll.com/about/careers/&quot;&gt;Roll with Us&lt;/a&gt;!&lt;/strong&gt;&lt;/p&gt;
  7668.  
  7669. </description>
  7670.    </item>
  7671.    
  7672.    
  7673.    
  7674.    <item>
  7675.      <title>
  7676. Making 1M Click Predictions per Second using AWS
  7677. </title>
  7678.      <link>https://tech.nextroll.com/blog/data-science/2018/04/26/just-binary-classifier.html</link>
  7679.      <pubDate>Thu, 26 Apr 2018 00:00:00 -0700</pubDate>
  7680.      <author></author>
  7681.      <guid isPermaLink="false">https://tech.nextroll.com/blog/data-science/2018/04/26/just-binary-classifier</guid>
  7682.      <description>&lt;p&gt;Click prediction may be a simple binary classification problem, but it requires a robust system architecture to
  7683. function in production at scale. At AdRoll, we leverage the AWS ecosystem along with a suite of third party tools to build the predictors that
  7684. power our pricing engine, BidIQ.
  7685. This is a tour of the production pipelines and monitoring systems that keep BidIQ running.&lt;/p&gt;
  7686.  
  7687. &lt;hr /&gt;
  7688.  
  7689. &lt;h1 id=&quot;introduction&quot;&gt;Introduction&lt;/h1&gt;
  7690.  
  7691. &lt;p&gt;AdRoll participates in &lt;a href=&quot;https://en.wikipedia.org/w/index.php?title=Real-time_bidding&amp;amp;oldid=836020656&quot;&gt;Real Time Bidding (RTB)&lt;/a&gt; exchanges to display ads for our advertisers. When a user visits a website,
  7692. a real-time auction takes place while the page loads. AdRoll submits an ad and a bid price, and if our bid wins, our ad is displayed.&lt;/p&gt;
  7693.  
  7694. &lt;p&gt;In more detail, AdRoll has a number of machines we call “bidders,” which are integrated with the various ad exchanges that run RTB auctions.
  7695. These exchanges send “bid requests” to our bidders. For each bid request, our bidders determine a list of ads we might show and
  7696. for each ad, our bidders query &lt;a href=&quot;https://blog.adroll.com/product/reintroducing-bidiq-the-intelligence-that-powers-adroll-campaigns&quot;&gt;BidIQ&lt;/a&gt;, our pricing engine, to determine what price we are willing to bid for it. Then the bidders take the ad with the highest bid price and send it to the exchange.&lt;/p&gt;
  7697.  
  7698. &lt;p&gt;&lt;img src=&quot;/images/post_images/just_binary_classifier/auction.png&quot; alt=&quot;Auction Flow&quot; /&gt;&lt;/p&gt;
  7699.  
  7700. &lt;p&gt;The Data Science Engineering team works on &lt;a href=&quot;https://blog.adroll.com/product/reintroducing-bidiq-the-intelligence-that-powers-adroll-campaigns&quot;&gt;BidIQ&lt;/a&gt; and the loose coupling between the bidders and BidIQ gives us complete flexibility in how we price each ad
  7701. and what infrastructure we use. BidIQ determines the bid price for an ad based on the advertiser’s campaign goals and a series of models
  7702. that predict the probabilities that the ad will be viewable, that the user will click our ad, and that the user
  7703. will go on to take an “action” (e.g. buy something), among others.&lt;/p&gt;
  7704.  
  7705. &lt;p&gt;In this blog post I’ll focus on BidIQ’s click predictor, which we built first and have used the longest. When I tell people familiar with machine learning
  7706. about it, they are often skeptical that there is much to do: “Isn’t that just a binary classifier?”&lt;/p&gt;
  7707.  
  7708. &lt;p&gt;Indeed, click prediction is a classic binary classification problem.
  7709. We can build a dataset with one line for each impression (displayed advertisement), labeled &lt;strong&gt;1&lt;/strong&gt; or &lt;strong&gt;0&lt;/strong&gt; according to whether it was
  7710. clicked or not, along with a list of features (time of day, geographic location, etc.) for that impression. Then for a fresh impression, our problem is to predict
  7711. how likely it is to be &lt;strong&gt;1&lt;/strong&gt; based on its features.&lt;/p&gt;
  7712.  
  7713. &lt;p&gt;&lt;img src=&quot;/images/post_images/just_binary_classifier/dataset.png&quot; alt=&quot;Dataset&quot; /&gt;&lt;/p&gt;
  7714.  
  7715. &lt;p&gt;Now standard machine learning approaches apply. We can split the dataset into a train and test set, choose a subset of features, train a logistic
  7716. regression on the train set, evaluate it on the test set, and repeat with different subsets of features until we find the subset
  7717. which minimizes the average log-loss on the test set. Then, after training a model with the selected subset of features
  7718. over the whole set of data, we can use it to make click predictions for future ads.&lt;/p&gt;
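&lt;p&gt;As a rough illustration of the selection criterion described above, the sketch below computes average log-loss on a test set and picks the feature subset whose predictions score lowest. The subset names and probabilities are made up for the example, and the model training step itself is left out; this is not the production code.&lt;/p&gt;

```python
import math


def average_log_loss(labels, probabilities):
    """Mean negative log-likelihood of 0/1 labels under predicted click probabilities."""
    eps = 1e-15  # clamp so log() never sees exactly 0 or 1
    total = 0.0
    for y, p in zip(labels, probabilities):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def pick_best_subset(test_labels, predictions_by_subset):
    """Return the name of the feature subset with the lowest test-set log-loss."""
    return min(
        predictions_by_subset,
        key=lambda name: average_log_loss(test_labels, predictions_by_subset[name]),
    )


# Hypothetical test-set predictions from models trained on two feature subsets.
test_labels = [1, 0, 0, 0]
predictions_by_subset = {
    "time_of_day": [0.5, 0.5, 0.5, 0.5],
    "time_of_day+geo": [0.8, 0.1, 0.2, 0.1],
}
best = pick_best_subset(test_labels, predictions_by_subset)
```

&lt;p&gt;Here the second subset wins because its predictions sit closer to the observed labels.&lt;/p&gt;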
  7719.  
  7720. &lt;p&gt;Were this a textbook exercise, we’d be done, yet we are just getting started. This is where the “Engineering” in Data Science Engineering comes in.&lt;/p&gt;
  7721.  
  7722. &lt;p&gt;For our production system we need to:&lt;/p&gt;
  7723. &lt;ul&gt;
  7724.  &lt;li&gt;Automate everything: training set generation, model training, model deployment, and updates&lt;/li&gt;
  7725.  &lt;li&gt;Update models frequently, and with &lt;em&gt;zero&lt;/em&gt; downtime; we need to price ads 24/7&lt;/li&gt;
  7726.  &lt;li&gt;Handle a throughput of one million queries per second, responding to each with less than 20 milliseconds of latency&lt;/li&gt;
  7727.  &lt;li&gt;Facilitate model improvements and feature discovery by our engineers in parallel&lt;/li&gt;
  7728.  &lt;li&gt;Allow multiple live A/B tests of different models&lt;/li&gt;
  7729.  &lt;li&gt;Monitor our production pipeline carefully as well as the inputs and outputs of the model&lt;/li&gt;
  7730. &lt;/ul&gt;
  7731.  
  7732. &lt;p&gt;To build our system architecture satisfying these requirements, we use the AWS ecosystem heavily as well as a variety
  7733. of third party and open source tools. A list of these tools and their descriptions is in the &lt;a href=&quot;#appendix&quot;&gt;Appendix&lt;/a&gt;.&lt;/p&gt;
  7734.  
  7735. &lt;h1 id=&quot;training-the-model&quot;&gt;Training The Model&lt;/h1&gt;
  7736.  
  7737. &lt;p&gt;There are three steps in our training pipeline:&lt;/p&gt;
  7738. &lt;ol&gt;
  7739.  &lt;li&gt;Generate the training dataset&lt;/li&gt;
  7740.  &lt;li&gt;Train a model against the training dataset&lt;/li&gt;
  7741.  &lt;li&gt;Wrap the model with serving code in a Docker image for easy deployment&lt;/li&gt;
  7742. &lt;/ol&gt;
  7743.  
  7744. &lt;p&gt;We handle these steps with &lt;a href=&quot;https://www.docker.com/&quot;&gt;Docker&lt;/a&gt;, &lt;a href=&quot;https://aws.amazon.com/batch/&quot;&gt;AWS Batch&lt;/a&gt;, and &lt;a href=&quot;https://github.com/spotify/luigi&quot;&gt;Luigi&lt;/a&gt;.&lt;/p&gt;
  7745.  
  7746. &lt;p&gt;&lt;img src=&quot;/images/post_images/just_binary_classifier/pipeline.png&quot; alt=&quot;Pipeline&quot; /&gt;&lt;/p&gt;
  7747.  
  7748. &lt;p&gt;Our starting point is the impression and click logs our bidders generate and store on &lt;a href=&quot;https://aws.amazon.com/s3/&quot;&gt;S3&lt;/a&gt;.&lt;/p&gt;
  7749.  
  7750. &lt;p&gt;For step one, we’ve built a Docker image which takes a time range, downloads
  7751. the impression and click logs in that time range, joins them, and uploads the resultant dataset to S3.&lt;/p&gt;
  7752.  
  7753. &lt;p&gt;To generate the model, we’ve built a Docker image which takes a time range
  7754. and model configuration, downloads the training dataset in that time range, trains a model against the dataset, and uploads it to S3.&lt;/p&gt;
  7755.  
  7756. &lt;p&gt;Finally, we download the model, combine it
  7757. with serving code in a Docker image, and upload it to our &lt;a href=&quot;https://aws.amazon.com/ecr/&quot;&gt;ECR&lt;/a&gt; repository. This final image can then be easily deployed to any &lt;a href=&quot;https://aws.amazon.com/ec2/&quot;&gt;EC2&lt;/a&gt; instance where it can price queries on a
  7758. specified port.&lt;/p&gt;
  7759.  
  7760. &lt;p&gt;These tasks are placed in a Luigi pipeline and kicked off via AWS Batch when the upstream requirements are satisfied. This ensures that
  7761. freshly generated logs are joined and incorporated into our most recent model with minimal delay and computing resources.&lt;/p&gt;
  7762.  
  7763. &lt;h1 id=&quot;deploying-and-updating-the-model&quot;&gt;Deploying and Updating the Model&lt;/h1&gt;
  7764.  
  7765. &lt;h2 id=&quot;model-deployment&quot;&gt;Model Deployment&lt;/h2&gt;
  7766.  
  7767. &lt;p&gt;AdRoll displays ads to users all over the world. Given the tight latency restrictions of RTB, we’ve placed our bidders all over the world as well.
  7768. More concretely, our bidders run on EC2 instances in multiple AWS regions. Thus, in order to keep the latency between the
  7769. bidders and BidIQ under 20ms, we also need to deploy BidIQ servers in each AWS region. Furthermore, in each region we may need BidIQ servers
  7770. for multiple versions of the model (e.g. for A/B tests).&lt;/p&gt;
  7771.  
  7772. &lt;p&gt;Architecturally, we handle this by having one &lt;a href=&quot;https://aws.amazon.com/autoscaling/&quot;&gt;Auto Scaling Group (ASG)&lt;/a&gt; per (BidIQ Version, AWS Region) pair. Then, by changing the size of the ASG, we
  7773. can trivially scale up or down the throughput of ad auctions that BidIQ can price. For example, we can scale down BidIQ versions which are only tested on a small percentage
  7774. of traffic or scale up heavily trafficked AWS regions.&lt;/p&gt;
  7775.  
  7776. &lt;p&gt;We use &lt;a href=&quot;https://www.terraform.io/&quot;&gt;Terraform&lt;/a&gt; to manage the deployment of the BidIQ ASGs and store the generated &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.tfstate&lt;/code&gt; files on S3, sharded by BidIQ version and AWS region. In this way, any engineer
  7777. can easily deploy or modify the ASG for a given (BidIQ Version, AWS Region) with a simple &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;terraform apply&lt;/code&gt;.&lt;/p&gt;
  7778.  
  7779. &lt;h2 id=&quot;service-discovery&quot;&gt;Service Discovery&lt;/h2&gt;
  7780.  
  7781. &lt;p&gt;So now we have BidIQ ASGs serving the model, and bidder instances which need to query the model. How can the bidders know which host and port
  7782. to query to reach a given BidIQ version?&lt;/p&gt;
  7783.  
  7784. &lt;p&gt;For this we use a &lt;a href=&quot;https://aws.amazon.com/dynamodb/&quot;&gt;DynamoDB&lt;/a&gt; table called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Services&lt;/code&gt;. We have one such table per AWS region, and each BidIQ instance in that region
  7785. announces itself on that table. The BidIQ instance periodically heartbeats an entry to the table consisting of:&lt;/p&gt;
  7786. &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;HOST, PORT, BIDIQ_VERSION, EXPIRY_TS
  7787. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  7788.  
  7789. &lt;p&gt;BidIQ upholds the contract:&lt;/p&gt;
  7790. &lt;ol&gt;
  7791.  &lt;li&gt;Each &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(HOST, PORT)&lt;/code&gt; can be queried for that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;BIDIQ_VERSION&lt;/code&gt; until the current time exceeds &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;EXPIRY_TS&lt;/code&gt;.&lt;/li&gt;
  7792.  &lt;li&gt;For each &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;BIDIQ_VERSION&lt;/code&gt; there will be sufficient &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(HOST, PORT)&lt;/code&gt; tuples to meet throughput.
  7793. To guarantee this, BidIQ ensures both:
  7794.    &lt;ul&gt;
  7795.      &lt;li&gt;For each &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;HOST&lt;/code&gt;, at least one &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;PORT&lt;/code&gt; is available&lt;/li&gt;
  7796.      &lt;li&gt;We have sufficiently many &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;HOST&lt;/code&gt;s for the given &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;BIDIQ_VERSION&lt;/code&gt;&lt;/li&gt;
  7797.    &lt;/ul&gt;
  7798.  &lt;/li&gt;
  7799. &lt;/ol&gt;
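&lt;p&gt;The first part of the contract can be sketched with an in-memory stand-in for the table. The class and method names below are hypothetical and no DynamoDB calls are made; the point is only that a lookup returns the endpoints of a given version whose &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;EXPIRY_TS&lt;/code&gt; has not yet passed.&lt;/p&gt;

```python
import time


class ServicesTable:
    """In-memory stand-in for the per-region Services table (hypothetical API)."""

    def __init__(self):
        self._rows = {}  # (host, port) -> (bidiq_version, expiry_ts)

    def heartbeat(self, host, port, bidiq_version, ttl_seconds, now=None):
        """Write or refresh an entry, pushing its expiry ttl_seconds into the future."""
        now = time.time() if now is None else now
        self._rows[(host, port)] = (bidiq_version, now + ttl_seconds)

    def lookup(self, bidiq_version, now=None):
        """Return (host, port) pairs for a version whose EXPIRY_TS has not passed."""
        now = time.time() if now is None else now
        return sorted(
            endpoint
            for endpoint, (version, expiry_ts) in self._rows.items()
            if version == bidiq_version and expiry_ts >= now
        )


table = ServicesTable()
table.heartbeat("10.0.0.1", 9000, "v42", ttl_seconds=30, now=100)
table.heartbeat("10.0.0.2", 9000, "v43", ttl_seconds=30, now=100)
live = table.lookup("v42", now=120)     # entry expires at 130, so still listed
expired = table.lookup("v42", now=131)  # past EXPIRY_TS, so no longer listed
```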
  7800.  
  7801. &lt;h2 id=&quot;model-updates&quot;&gt;Model Updates&lt;/h2&gt;
  7802.  
  7803. &lt;p&gt;Our pipeline is constantly pushing updates to the model as more recent data come in. We need to swap in these fresh models
  7804. while upholding the contract above.&lt;/p&gt;
  7805.  
  7806. &lt;p&gt;To do this, each instance in the BidIQ ASG uses the following logic, orchestrated by a Python script:&lt;/p&gt;
  7807. &lt;ul&gt;
  7808.  &lt;li&gt;Every five seconds, query the locally running model as a health check. If this health check fails, restart the model.&lt;/li&gt;
  7809.  &lt;li&gt;Every minute, check for an updated model. If found:
  7810.    &lt;ol&gt;
  7811.      &lt;li&gt;Download the new model and run it on a new port&lt;/li&gt;
  7812.      &lt;li&gt;Start heartbeating the new port to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Services&lt;/code&gt; table&lt;/li&gt;
  7813.      &lt;li&gt;Stop heartbeating the old port to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Services&lt;/code&gt; table&lt;/li&gt;
  7814.      &lt;li&gt;After a grace period for draining, shut down the model on the old port&lt;/li&gt;
  7815.    &lt;/ol&gt;
  7816.  &lt;/li&gt;
  7817. &lt;/ul&gt;
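&lt;p&gt;The ordering of the four update steps is what keeps the contract intact. The sketch below is a simplified simulation, not the orchestration script itself: ports are plain integers, heartbeating is modeled as membership in a set, and the function name is hypothetical. It records the advertised ports after each step so the invariant (at least one advertised port at all times) can be checked.&lt;/p&gt;

```python
def swap_model(advertised_ports, running_ports, old_port, new_port):
    """Apply the four update steps in order; return the advertised ports after each step."""
    history = []
    running_ports.add(new_port)         # 1. run the new model on a new port
    history.append(sorted(advertised_ports))
    advertised_ports.add(new_port)      # 2. start heartbeating the new port
    history.append(sorted(advertised_ports))
    advertised_ports.discard(old_port)  # 3. stop heartbeating the old port
    history.append(sorted(advertised_ports))
    running_ports.discard(old_port)     # 4. after the drain grace period, shut down the old model
    history.append(sorted(advertised_ports))
    return history


advertised, running = {9000}, {9000}
history = swap_model(advertised, running, old_port=9000, new_port=9001)
```

&lt;p&gt;Because the new port is advertised before the old one is withdrawn, the advertised set is never empty during the swap. The grace period in step 4 covers bidders that looked up the old entry shortly before it expired.&lt;/p&gt;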
  7818.  
  7819. &lt;p&gt;This logic is simple, allows for smooth model updates with no downtime, and requires no coordination between individual instances in a BidIQ ASG. And
  7820. our bidders don’t need to know anything about the internals of BidIQ. They simply look up a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(HOST, PORT)&lt;/code&gt; in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Services&lt;/code&gt; table for a given &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;BIDIQ_VERSION&lt;/code&gt; and
  7821. query it to get a bid price.&lt;/p&gt;
  7822.  
  7823. &lt;h1 id=&quot;improving-the-model&quot;&gt;Improving the Model&lt;/h1&gt;
  7824.  
  7825. &lt;p&gt;I’ve explained how we handle model updates for a given version of BidIQ, but how do we improve the model to make new versions of BidIQ?
  7826. Model improvement is broken into two phases: backtesting and live A/B testing.&lt;/p&gt;
  7827.  
  7828. &lt;h2 id=&quot;backtesting&quot;&gt;Backtesting&lt;/h2&gt;
  7829.  
  7830. &lt;p&gt;For backtesting new models, we have a simple web application, “Juxtaposer”. A user can define an experiment via a YAML configuration consisting of which versions of our code to use,
  7831. what features to use in the model, which train and test sets to use, and even the Docker image to run the experiment in.&lt;/p&gt;
  7832.  
  7833. &lt;p&gt;We then kick off this dockerized experiment via AWS Batch.
  7834. When finished, it reports metrics back to the web app, where we can compare them side-by-side (hence the name “Juxtaposer”). With Docker and AWS Batch it is trivial to kick off
  7835. an arbitrarily large number of experiments in parallel, and our engineers can focus on improving the model without worrying about infrastructure.&lt;/p&gt;
  7836.  
  7837. &lt;figure&gt;
  7838.  &lt;img src=&quot;/images/post_images/just_binary_classifier/juxtaposer.png&quot; alt=&quot;Juxtaposer experiment comparison&quot; /&gt;
  7839.  &lt;figcaption&gt;Juxtaposer comparison of two experiments. Some metrics have been redacted&lt;/figcaption&gt;
  7840. &lt;/figure&gt;
  7841.  
  7842. &lt;p&gt;For feature engineering, we typically look at standard binary classification metrics (average log-loss, AUC, etc.) on the test set, whereas for changes to our code, we look out for
  7843. regressions—backtesting gives us a dry run of how the model will behave in production.&lt;/p&gt;
  7844.  
  7845. &lt;h2 id=&quot;live-ab-testing&quot;&gt;Live A/B Testing&lt;/h2&gt;
  7846.  
  7847. &lt;p&gt;Once we have a new model that looks good in backtesting, we A/B test it live. For this we use &lt;a href=&quot;/blog/rtb/2016/04/29/collider.html&quot;&gt;Collider&lt;/a&gt;. In a nutshell, Collider instructs
  7848. the bidders to apply custom logic to a fixed percentage of live traffic. For our purposes, we simply tell the bidders to use a specified &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;BIDIQ_VERSION&lt;/code&gt; rather than the
  7849. production version.&lt;/p&gt;
  7850.  
  7851. &lt;p&gt;For A/B testing, in addition to binary classification metrics, we look at broader ad-tech metrics such as our observed cost-per-click (CPC) or cost-per-action (CPA). If the test version of BidIQ
  7852. looks good we roll it out as the new production version.&lt;/p&gt;
  7853.  
  7854. &lt;h1 id=&quot;monitoring-our-production-model&quot;&gt;Monitoring our Production Model&lt;/h1&gt;
  7855.  
  7856. &lt;p&gt;With the great power of BidIQ (pricing one million ads per second, each within 20ms, 24/7) comes great responsibility: BidIQ is an automated system which controls how we spend money so it
  7857. is imperative that we monitor it closely, especially as much of the data it ingests comes from ad exchanges, which may introduce changes outside of our control.&lt;/p&gt;
  7858.  
  7859. &lt;p&gt;To start, we use &lt;a href=&quot;https://www.datadoghq.com/&quot;&gt;Datadog&lt;/a&gt; for real-time monitoring. Each BidIQ server outputs simple metrics in UDP packets, using the &lt;a href=&quot;https://github.com/etsy/statsd&quot;&gt;StatsD&lt;/a&gt; protocol, which are picked up by
  7860. a local Datadog agent, aggregated, and displayed in a real-time dashboard. In this way, we can see our live average predictions and average bids across a number of dimensions
  7861. and send a &lt;a href=&quot;https://www.pagerduty.com/&quot;&gt;PagerDuty&lt;/a&gt; alert if something suddenly changes.&lt;/p&gt;
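&lt;p&gt;For readers unfamiliar with StatsD, each metric travels as a short plain-text datagram of the form &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;name:value|type&lt;/code&gt;. The sketch below formats one such line and shows how it could be sent over UDP. The metric name is invented for the example, and port 8125 is the conventional StatsD port, assumed here rather than taken from our configuration.&lt;/p&gt;

```python
import socket


def statsd_line(name, value, metric_type="g"):
    """Format one StatsD datagram payload, e.g. a gauge ('g') or a counter ('c')."""
    return f"{name}:{value}|{metric_type}"


def send_metric(line, host="127.0.0.1", port=8125):
    """Fire-and-forget a single metric over UDP to a local agent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


# Hypothetical metric name; call send_metric(payload) to emit it to a local agent.
payload = statsd_line("bidiq.avg_prediction", 0.0123)
```

&lt;p&gt;UDP suits this use because the sender never blocks on the monitoring system; a dropped packet costs one data point, not a delayed bid.&lt;/p&gt;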
  7862.  
  7863. &lt;figure&gt;
  7864.  &lt;img src=&quot;/images/post_images/just_binary_classifier/datadog.png&quot; alt=&quot;Datadog stacked area chart&quot; /&gt;
  7865.  &lt;figcaption&gt;Real-time stacked area chart via Datadog of queries/sec by AWS region&lt;/figcaption&gt;
  7866. &lt;/figure&gt;
  7867.  
  7868. &lt;p&gt;Next, our bidders log the bid prices and click predictions that BidIQ returns in our impression and click logs on S3. Using &lt;a href=&quot;https://prestodb.io/&quot;&gt;Presto&lt;/a&gt;, we can
  7869. easily query these logs with SQL. With some simple SQL queries we can check, on various subsets of traffic, whether the ratio of expected clicks (sum of click-probabilities over impressions)
  7870. to actual clicks is close to one, as it should be. With slightly more complicated queries we can build &lt;a href=&quot;http://scikit-learn.org/stable/auto_examples/calibration/plot_calibration_curve.html&quot;&gt;calibration curves&lt;/a&gt; to verify that our predictions
  7871. have been accurate.&lt;/p&gt;
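&lt;p&gt;The two checks above are simple to state in code. We run them as SQL in Presto; the sketch below expresses the same arithmetic in Python over small in-memory lists, with made-up data and hypothetical function names, purely to show what is being computed.&lt;/p&gt;

```python
def expected_to_actual_ratio(predictions, clicks):
    """Sum of predicted click probabilities divided by the number of observed clicks."""
    return sum(predictions) / sum(clicks)


def calibration_curve(predictions, clicks, n_bins=5):
    """Bucket impressions by predicted probability.

    Returns one (mean prediction, observed click rate) pair per non-empty bucket.
    """
    buckets = [[] for _ in range(n_bins)]
    for p, y in zip(predictions, clicks):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bucket
        buckets[index].append((p, y))
    curve = []
    for bucket in buckets:
        if bucket:
            mean_p = sum(p for p, _ in bucket) / len(bucket)
            click_rate = sum(y for _, y in bucket) / len(bucket)
            curve.append((mean_p, click_rate))
    return curve


# Made-up impressions: predicted click probability and whether a click happened.
predictions = [0.1, 0.1, 0.1, 0.1, 0.9, 0.9]
clicks = [0, 0, 0, 1, 1, 1]
ratio = expected_to_actual_ratio(predictions, clicks)
curve = calibration_curve(predictions, clicks, n_bins=2)
```

&lt;p&gt;A well-calibrated model gives a ratio near one and curve points near the diagonal, where mean prediction equals observed click rate.&lt;/p&gt;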
  7872.  
  7873. &lt;figure&gt;
  7874.  &lt;img src=&quot;/images/post_images/just_binary_classifier/calibration.png&quot; alt=&quot;Calibration chart&quot; /&gt;
  7875.  &lt;figcaption&gt;Calibration chart -- points on the dotted line are perfect.&lt;/figcaption&gt;
  7876. &lt;/figure&gt;
  7877.  
  7878. &lt;p&gt;Next, we expose our joined training logs to Presto so we can directly query the data BidIQ is ingesting. We also have a script that taps our live BidIQ servers (using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tcpdump&lt;/code&gt;)
  7879. to see exactly what data our bidders are sending them. These have proven invaluable in investigating changes in upstream data.&lt;/p&gt;
  7880.  
  7881. &lt;p&gt;Next, most of our tasks deliver email messages on failure, and our task to train the model runs additional sanity checks to ensure that parameters such as average prediction over
  7882. the test set are within reasonable bounds. We also have a catch-all monitoring repository, “Night’s Watch,” which contains a series of Python scripts which send us warning
  7883. emails if anything is off (e.g. a BidIQ model hasn’t been updated recently).&lt;/p&gt;
  7884.  
  7885. &lt;p&gt;Finally, each of our code repos has a thorough set of unit and integration tests.&lt;/p&gt;
  7886.  
  7887. &lt;h1 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h1&gt;
  7888.  
  7889. &lt;p&gt;Hopefully the above gives you a sense of the architecture and infrastructure involved in bringing even a simple binary classifier into stable production at scale.&lt;/p&gt;
  7890.  
  7891. &lt;p&gt;And that’s just our high-level architecture. With our predictors being so critical to our business, we’ve written custom code to train and serve our models in a
  7892. low-level language, &lt;a href=&quot;/blog/data/2014/11/17/d-is-for-data-science.html&quot;&gt;D&lt;/a&gt;. We’ve optimized our D code for our use cases: we’ve disabled garbage collection (violates our latency requirement), removed any allocations on the heap,
  7893. rewritten performance bottlenecks using &lt;a href=&quot;https://en.wikipedia.org/w/index.php?title=Intrinsic_function&amp;amp;oldid=827077730&quot;&gt;intrinsics&lt;/a&gt;, added logic to price a batch of ads at a time with multithreading, and more. And on the algorithm side, we now use an extension of logistic regression, &lt;a href=&quot;/blog/data-science/2015/08/25/factorization-machines.html&quot;&gt;factorization machines&lt;/a&gt;.&lt;/p&gt;
  7894.  
  7895. &lt;p&gt;So far we’ve only discussed our click predictor. We have a number of other predictors with their own quirks as well as an ever-evolving set of logic to choose a bid price in the face
  7896. of evolving auction dynamics, notably the recent rise in &lt;a href=&quot;http://adprofs.co/beginners-guide-to-header-bidding/&quot;&gt;header bidding&lt;/a&gt;.&lt;/p&gt;
  7897.  
  7898. &lt;p&gt;The best thing about working on Data Science Engineering at AdRoll is that you work at the intersection of so many interesting fields: math, statistics, computer science,
  7899. machine learning, computer networking, economics, game theory, and even web development. Most of our projects overlap with a number of these fields, and our team has a diverse group of engineers with backgrounds
  7900. across these fields so there is always something new to learn and someone to learn from. We are regularly looking for new engineers so if these areas interest you, &lt;a href=&quot;https://www.adroll.com/about/careers/&quot;&gt;let us know&lt;/a&gt;!&lt;/p&gt;
  7901.  
  7902. &lt;h1 id=&quot;-appendix&quot;&gt;&lt;a name=&quot;appendix&quot;&gt;&lt;/a&gt; Appendix&lt;/h1&gt;
  7903.  
  7904. &lt;p&gt;List of AWS tools we use&lt;/p&gt;
  7905. &lt;ul&gt;
  7906.  &lt;li&gt;&lt;a href=&quot;https://aws.amazon.com/autoscaling/&quot;&gt;ASG&lt;/a&gt; – scalable group of EC2 instances&lt;/li&gt;
  7907.  &lt;li&gt;&lt;a href=&quot;https://aws.amazon.com/batch/&quot;&gt;Batch&lt;/a&gt; – easy way to run a Docker image&lt;/li&gt;
  7908.  &lt;li&gt;&lt;a href=&quot;https://aws.amazon.com/dynamodb/&quot;&gt;DynamoDB&lt;/a&gt; – low latency NoSQL database&lt;/li&gt;
  7909.  &lt;li&gt;&lt;a href=&quot;https://aws.amazon.com/ec2/&quot;&gt;EC2&lt;/a&gt; – on-demand machines to run our code&lt;/li&gt;
  7910.  &lt;li&gt;&lt;a href=&quot;https://aws.amazon.com/ecr/&quot;&gt;ECR&lt;/a&gt; – repo for Docker images&lt;/li&gt;
  7911.  &lt;li&gt;&lt;a href=&quot;https://aws.amazon.com/s3/&quot;&gt;S3&lt;/a&gt; – cloud storage for flat files&lt;/li&gt;
  7912. &lt;/ul&gt;
  7913.  
  7914. &lt;p&gt;List of third party / open source tools we use&lt;/p&gt;
  7915. &lt;ul&gt;
  7916.  &lt;li&gt;&lt;a href=&quot;https://www.datadoghq.com/&quot;&gt;Datadog&lt;/a&gt; – reporting for live metrics&lt;/li&gt;
  7917.  &lt;li&gt;&lt;a href=&quot;https://www.docker.com/&quot;&gt;Docker&lt;/a&gt; – simple containerization&lt;/li&gt;
  7918.  &lt;li&gt;&lt;a href=&quot;https://github.com/spotify/luigi&quot;&gt;Luigi&lt;/a&gt; – simple pipeline logic to kick off jobs once upstream ones finish&lt;/li&gt;
  7919.  &lt;li&gt;&lt;a href=&quot;https://www.pagerduty.com/&quot;&gt;PagerDuty&lt;/a&gt; – alarm system to alert engineers when something goes wrong&lt;/li&gt;
  7920.  &lt;li&gt;&lt;a href=&quot;https://prestodb.io/&quot;&gt;Presto&lt;/a&gt; – fast SQL query engine which can also query large CSV files on S3&lt;/li&gt;
  7921.  &lt;li&gt;&lt;a href=&quot;https://www.terraform.io/&quot;&gt;Terraform&lt;/a&gt; – simple declarative language to create and modify AWS infrastructure&lt;/li&gt;
  7922. &lt;/ul&gt;
  7923.  
  7924. &lt;hr /&gt;
  7925.  
  7926. &lt;p&gt;&lt;strong&gt;Do you enjoy designing and deploying machine learning models at scale? &lt;a href=&quot;https://www.adroll.com/about/careers/&quot;&gt;Roll with Us&lt;/a&gt;!&lt;/strong&gt;&lt;/p&gt;
  7927.  
  7928. </description>
  7929.    </item>
  7930.    
  7931.    
  7932.    
  7933.    <item>
  7934.      <title>
  7935. CodeBeam SF 2018: Erlang and Elixir in production
  7936. </title>
  7937.      <link>https://tech.nextroll.com/blog/dev/2018/04/12/codebeam-sf-2018.html</link>
  7938.      <pubDate>Thu, 12 Apr 2018 00:00:00 -0700</pubDate>
  7939.      <author></author>
  7940.      <guid isPermaLink="false">https://tech.nextroll.com/blog/dev/2018/04/12/codebeam-sf-2018</guid>
  7941.      <description>&lt;p&gt;Highlights from the CodeBeam SF 2018 conference, March 15-16 2018&lt;/p&gt;
  7942.  
  7943. &lt;hr /&gt;
  7944.  
  7945. &lt;h2 id=&quot;once-was-erlang-factory&quot;&gt;Once was Erlang Factory&lt;/h2&gt;
  7946. &lt;p&gt;The name has changed - yearly in fact, for the past few years - but this conference is always a destination for the &lt;a href=&quot;http://www.erlang.org&quot;&gt;Erlang&lt;/a&gt; faithful. The premise has been broadened since the &lt;a href=&quot;http://www.erlang-factory.com/past_conferences&quot;&gt;Erlang Factory&lt;/a&gt; days of yore to welcome other &lt;a href=&quot;http://www.erlang-factory.com/upload/presentations/708/HitchhikersTouroftheBEAM.pdf&quot;&gt;BEAM&lt;/a&gt; based languages, the most prominent of those being &lt;a href=&quot;https://elixir-lang.org&quot;&gt;Elixir&lt;/a&gt; of course, but worthy mentions here are &lt;a href=&quot;http://lfe.io&quot;&gt;LFE (Lisp Flavored Erlang)&lt;/a&gt; as well as the more experimental efforts happening: &lt;a href=&quot;https://github.com/rvirding/erlog&quot;&gt;Erlog&lt;/a&gt;, &lt;a href=&quot;http://clojerl.org&quot;&gt;Clojerl&lt;/a&gt;, &lt;a href=&quot;https://github.com/rvirding/luerl&quot;&gt;Luerl&lt;/a&gt;, &lt;a href=&quot;http://efene.org&quot;&gt;Efene&lt;/a&gt;, and &lt;a href=&quot;https://github.com/alpaca-lang/alpaca&quot;&gt;Alpaca&lt;/a&gt;. I find the big tent to be a really great thing for the Erlang ecosystem in general. It’s good to be reminded that one of the lasting beauties of Erlang is the reusability and relevance of &lt;a href=&quot;http://www.erlang-factory.com/upload/presentations/708/HitchhikersTouroftheBEAM.pdf&quot;&gt;BEAM&lt;/a&gt; itself to the networked world we work in.&lt;/p&gt;
  7947.  
  7948. &lt;p&gt;The subplot this year is a good one: 2018 marks the 20th year since Ericsson first released Erlang as open source! This has had much positive influence on the lifetime of this language and BEAM.&lt;/p&gt;
  7949.  
  7950. &lt;h2 id=&quot;highlights&quot;&gt;Highlights&lt;/h2&gt;
  7951. &lt;p&gt;The two-day conference featured 50 speakers in all, with two keynotes each day. The organizers, &lt;a href=&quot;https://www.codesync.global&quot;&gt;Code Sync&lt;/a&gt;, have merged the long-running &lt;a href=&quot;http://www.erlang-factory.com/past_conferences&quot;&gt;Erlang Factory&lt;/a&gt; event series under their umbrella in the past year, alongside the &lt;a href=&quot;https://www.codesync.global/conferences/#CodeElixir&quot;&gt;Code Elixir&lt;/a&gt; and &lt;a href=&quot;https://www.codesync.global/conferences/#CodeMesh&quot;&gt;Code Mesh&lt;/a&gt; series. This seems to have worked out well. The Code Beam conference certainly went very smoothly - the scheduling made sense, presentation technology worked properly, and the volunteers were knowledgeable and helpful. I should note the integration of the &lt;a href=&quot;https://whova.com&quot;&gt;Whova&lt;/a&gt; app for navigating between talks; the darned thing actually worked really well.&lt;/p&gt;
  7952.  
  7953. &lt;p&gt;The conference was organized into six &lt;a href=&quot;https://www.codesync.global/conferences/code-beam-sf-2018/#Themes&quot;&gt;themes&lt;/a&gt;: introductory tutorials and overviews of the Erlang ecosystem; the state of the BEAM toolset for production environments, covering deployment, maintenance, monitoring, and testing; case studies in high-reliability system building; in-depth BEAM topics, including compiler implementation issues, language extensions, and active areas of research; frameworks including Phoenix, RabbitMQ, Nerves, MongooseIM, and others; and, last but far from least, distribution, concurrency, multicore, and functional programming in the Erlang ecosystem.&lt;/p&gt;
  7954.  
  7955. &lt;p&gt;The following list of highlights is fairly randomly chosen across those tracks, as there were quite a few contenders. This is just a sampling of talks that struck me as particularly memorable - a number of solid tutorials and more general reflections are skipped here. I highly recommend paging through the full set of recorded talks &lt;a href=&quot;https://www.youtube.com/results?search_query=Code+BEAM+SF+2018&quot;&gt;on YouTube&lt;/a&gt;.&lt;/p&gt;
  7956.  
  7957. &lt;h2 id=&quot;openerlang-party&quot;&gt;#OpenErlang Party&lt;/h2&gt;
  7958. &lt;p&gt;The organizers celebrated 20 years of open source Erlang by throwing &lt;a href=&quot;https://www.meetup.com/CodeBEAMSF/events/248435574&quot;&gt;a very nice bash&lt;/a&gt; at &lt;a href=&quot;https://www.galvanize.com&quot;&gt;Galvanize&lt;/a&gt; the first evening. &lt;a href=&quot;https://www.codesync.global/speaker/miriam-pena&quot;&gt;Miriam&lt;/a&gt; worked the room:&lt;/p&gt;
  7959.  
  7960. &lt;h5 id=&quot;miriam-and-robert-virding&quot;&gt;Miriam and Robert Virding&lt;/h5&gt;
  7961. &lt;p&gt;&lt;img src=&quot;/images/post_images/miriamandrobert.jpeg&quot; alt=&quot;miriam-and-robert&quot; /&gt;&lt;/p&gt;
  7962. &lt;h5 id=&quot;miriam-and-irina-guberman&quot;&gt;Miriam and Irina Guberman&lt;/h5&gt;
  7963. &lt;p&gt;&lt;img src=&quot;/images/post_images/miriamandirina.jpeg&quot; alt=&quot;miriam-and-irina&quot; /&gt;&lt;/p&gt;
  7964. &lt;h5 id=&quot;miriam-and-andrew-thompson&quot;&gt;Miriam and Andrew Thompson&lt;/h5&gt;
  7965. &lt;p&gt;&lt;img src=&quot;/images/post_images/miriamandandrew.jpeg&quot; alt=&quot;miriam-and-andrew&quot; /&gt;&lt;/p&gt;
  7966. &lt;h5 id=&quot;miriam-and-lennart-öhman&quot;&gt;Miriam and Lennart Öhman&lt;/h5&gt;
  7967. &lt;p&gt;&lt;img src=&quot;/images/post_images/miriamandLennart.jpeg&quot; alt=&quot;miriam-and-Lennart&quot; /&gt;&lt;/p&gt;
  7968.  
  7969. &lt;h2 id=&quot;adroll-content&quot;&gt;AdRoll Content&lt;/h2&gt;
  7970. &lt;p&gt;I was most excited about the representation by AdRoll among the speakers, unsurprisingly! Three members of the RTB team spoke, with an eye-opening historical keynote about the significant but infrequently-mentioned contributions of women to the development of &lt;a href=&quot;http://www.erlang.org&quot;&gt;Erlang&lt;/a&gt; and &lt;a href=&quot;http://learnyousomeerlang.com/what-is-otp&quot;&gt;OTP (Open Telecom Platform)&lt;/a&gt; from our own &lt;a href=&quot;https://www.codesync.global/speaker/miriam-pena&quot;&gt;Miriam Pena&lt;/a&gt;:&lt;/p&gt;
  7971. &lt;figure&gt;&lt;div class=&quot;youtube-wrapper&quot;&gt;&lt;iframe class=&quot;youtube-embed&quot; width=&quot;574&quot; height=&quot;323&quot; src=&quot;http://www.youtube.com/embed/j6wbuV8pMx8&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;&lt;/div&gt;&lt;/figure&gt;
  7972.  
  7973. &lt;p&gt;Also on the first day, our own &lt;a href=&quot;https://www.codesync.global/speaker/brujo-benavides&quot;&gt;Brujo Benavides&lt;/a&gt; delivered a solid rationale for the practice of using opaque (module-scoped) data types in &lt;a href=&quot;http://www.erlang.org&quot;&gt;Erlang&lt;/a&gt; code, for maintainability and readability reasons:&lt;/p&gt;
  7974. &lt;figure&gt;&lt;div class=&quot;youtube-wrapper&quot;&gt;&lt;iframe class=&quot;youtube-embed&quot; width=&quot;574&quot; height=&quot;323&quot; src=&quot;http://www.youtube.com/embed/hxLh0qFX9oI&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;&lt;/div&gt;&lt;/figure&gt;
  7975.  
  7976. &lt;p&gt;And of course &lt;a href=&quot;https://www.codesync.global/speaker/mike-watters&quot;&gt;Mike Watters&lt;/a&gt; described his recent rewrite of the RTB profile cache system using &lt;a href=&quot;https://elixir-lang.org&quot;&gt;Elixir&lt;/a&gt; and the &lt;a href=&quot;https://hexdocs.pm/flow/Flow.html&quot;&gt;Flow framework&lt;/a&gt;, which increased peak throughput by 2X on that critical system (described in his blog post &lt;a href=&quot;/blog/dev/2018/01/08/quaff-that-potion-saving-millions-with-elixir-and-erlang.html&quot;&gt;here&lt;/a&gt; as well):&lt;/p&gt;
  7977. &lt;figure&gt;&lt;div class=&quot;youtube-wrapper&quot;&gt;&lt;iframe class=&quot;youtube-embed&quot; width=&quot;574&quot; height=&quot;323&quot; src=&quot;http://www.youtube.com/embed/CvjVpGktLC0&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;&lt;/div&gt;&lt;/figure&gt;
  7978.  
  7979. &lt;h2 id=&quot;the-usual-suspects&quot;&gt;The usual suspects&lt;/h2&gt;
  7980. &lt;p&gt;An Erlang convocation is not really complete without a keynote from &lt;a href=&quot;https://www.codesync.global/speaker/joe-armstrong&quot;&gt;Joe Armstrong&lt;/a&gt;, needless to say. We were treated to a customary eclectic dilation on topics arcane, perhaps best described as a paleobiological view across the brief decades since computer science became a thing… not unlike a peek at the &lt;a href=&quot;https://en.wikipedia.org/wiki/Burgess_Shale&quot;&gt;Burgess Shale&lt;/a&gt;, but for computers:&lt;/p&gt;
  7981. &lt;figure&gt;&lt;div class=&quot;youtube-wrapper&quot;&gt;&lt;iframe class=&quot;youtube-embed&quot; width=&quot;574&quot; height=&quot;323&quot; src=&quot;http://www.youtube.com/embed/-I_jE0l7sYQ&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;&lt;/div&gt;&lt;/figure&gt;
  7982.  
  7983. &lt;p&gt;&lt;a href=&quot;https://www.codesync.global/speaker/robert-virding&quot;&gt;Robert Virding&lt;/a&gt; always has something fascinating to say and he joined with &lt;a href=&quot;https://www.codesync.global/speaker/mariano-guerra&quot;&gt;Mariano Guerra&lt;/a&gt; to present a survey of some language implementation projects based on &lt;a href=&quot;http://www.erlang-factory.com/upload/presentations/708/HitchhikersTouroftheBEAM.pdf&quot;&gt;BEAM&lt;/a&gt;, along the way describing the contortions big or small that these involved:&lt;/p&gt;
  7984. &lt;figure&gt;&lt;div class=&quot;youtube-wrapper&quot;&gt;&lt;iframe class=&quot;youtube-embed&quot; width=&quot;574&quot; height=&quot;323&quot; src=&quot;http://www.youtube.com/embed/lkAbwmn5Rv8&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;&lt;/div&gt;&lt;/figure&gt;
  7985.  
  7986. &lt;p&gt;Naturally I have to mention fabulous &lt;a href=&quot;https://www.codesync.global/speaker/fred-hebert&quot;&gt;Fred Hebert&lt;/a&gt;, scribe and scholar responsible for must-have references &lt;a href=&quot;http://learnyousomeerlang.com&quot;&gt;Learn You Some Erlang For Great Good&lt;/a&gt; and &lt;a href=&quot;http://www.erlang-in-anger.com&quot;&gt;Erlang in Anger&lt;/a&gt;, who talked about the importance of using supervisor trees to prepare for and gracefully handle unexpected complex failures in distributed systems:&lt;/p&gt;
  7987. &lt;figure&gt;&lt;div class=&quot;youtube-wrapper&quot;&gt;&lt;iframe class=&quot;youtube-embed&quot; width=&quot;574&quot; height=&quot;323&quot; src=&quot;http://www.youtube.com/embed/W0BR_tWZChQ&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;&lt;/div&gt;&lt;/figure&gt;
  7988.  
  7989. &lt;p&gt;My introduction to property-based testing came via &lt;a href=&quot;https://www.codesync.global/speaker/kostis-sagonas1&quot;&gt;Kostis Sagonas&lt;/a&gt; in his talk about &lt;a href=&quot;http://proper.softlab.ntua.gr&quot;&gt;PropEr&lt;/a&gt;, &lt;a href=&quot;http://concuerror.com&quot;&gt;Concuerror&lt;/a&gt; and using optimal partial order reduction to find outlier bugs in your code without waiting 48 days for naive exhaustive tests to complete:&lt;/p&gt;
  7990. &lt;figure&gt;&lt;div class=&quot;youtube-wrapper&quot;&gt;&lt;iframe class=&quot;youtube-embed&quot; width=&quot;574&quot; height=&quot;323&quot; src=&quot;http://www.youtube.com/embed/4-vJeCmkCZE&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;&lt;/div&gt;&lt;/figure&gt;
  7991.  
  7992. &lt;p&gt;&lt;a href=&quot;https://www.codesync.global/speaker/scott-lystig-fritchie56&quot;&gt;Scott Lystig Fritchie&lt;/a&gt; talked about why he &lt;em&gt;isn’t&lt;/em&gt; using Erlang at &lt;a href=&quot;https://www.wallaroolabs.com&quot;&gt;Wallaroo Labs&lt;/a&gt; - the answer is &lt;a href=&quot;https://www.ponylang.org&quot;&gt;Pony&lt;/a&gt; and it’s complicated:&lt;/p&gt;
  7993. &lt;figure&gt;&lt;div class=&quot;youtube-wrapper&quot;&gt;&lt;iframe class=&quot;youtube-embed&quot; width=&quot;574&quot; height=&quot;323&quot; src=&quot;http://www.youtube.com/embed/uv-3ptTD8hg&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;&lt;/div&gt;&lt;/figure&gt;
  7994.  
  7995. &lt;p&gt;&lt;a href=&quot;https://www.codesync.global/speaker/raimo-niskanen&quot;&gt;Raimo Niskanen&lt;/a&gt; walked us through the &lt;em&gt;gen_statem&lt;/em&gt; OTP behaviour class, introduced in OTP 19.0, which simplifies the API surface of the original &lt;em&gt;gen_fsm&lt;/em&gt; behaviour while at the same time adding a significant amount of extra functionality and ease of use:&lt;/p&gt;
  7996. &lt;figure&gt;&lt;div class=&quot;youtube-wrapper&quot;&gt;&lt;iframe class=&quot;youtube-embed&quot; width=&quot;574&quot; height=&quot;323&quot; src=&quot;http://www.youtube.com/embed/f_jl6MR3kXQ&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;&lt;/div&gt;&lt;/figure&gt;
  7997.  
  7998. &lt;h2 id=&quot;file-under&quot;&gt;File under…&lt;/h2&gt;
  7999. &lt;p&gt;&lt;a href=&quot;https://www.codesync.global/speaker/andrew-thompson54&quot;&gt;Andrew Thompson&lt;/a&gt; described an IoT-to-cloud communications scheme implemented largely in Erlang (in software and firmware) on &lt;a href=&quot;https://www.helium.com&quot;&gt;Helium’s&lt;/a&gt; low-power bidirectional 802.15.4 based radio module, the &lt;a href=&quot;https://www.helium.com/products/atom-prototyping-module&quot;&gt;Atom&lt;/a&gt; - which operates using a variation of blockchain architecture based on ‘proof of coverage’ rather than the unwieldy ‘proof of work’ most of us think of when talking about blockchains:&lt;/p&gt;
  8000. &lt;figure&gt;&lt;div class=&quot;youtube-wrapper&quot;&gt;&lt;iframe class=&quot;youtube-embed&quot; width=&quot;574&quot; height=&quot;323&quot; src=&quot;http://www.youtube.com/embed/QnHYWNfvV2c&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;&lt;/div&gt;&lt;/figure&gt;
  8001.  
  8002. &lt;p&gt;&lt;a href=&quot;https://www.codesync.global/speaker/irina-guberman&quot;&gt;Irina Guberman&lt;/a&gt; gave an in-depth look at the decisions her team made while implementing distributed mutable counters for metrics:&lt;/p&gt;
  8003. &lt;figure&gt;&lt;div class=&quot;youtube-wrapper&quot;&gt;&lt;iframe class=&quot;youtube-embed&quot; width=&quot;574&quot; height=&quot;323&quot; src=&quot;http://www.youtube.com/embed/peKLuq-tdQo&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;&lt;/div&gt;&lt;/figure&gt;
  8004.  
  8005. &lt;p&gt;&lt;a href=&quot;https://www.codesync.global/speaker/emma-cunningham53&quot;&gt;Emma Cunningham&lt;/a&gt;, who was a formal semanticist before joining the engineering world, began a review of the mathematical and philosophical origins of type theory with the memorable sentence, “I’ve been a functional programming advocate since before I became a software engineer” - she concluded with a compelling argument to &lt;em&gt;just use Dialyzer for all the things&lt;/em&gt;:&lt;/p&gt;
  8006. &lt;figure&gt;&lt;div class=&quot;youtube-wrapper&quot;&gt;&lt;iframe class=&quot;youtube-embed&quot; width=&quot;574&quot; height=&quot;323&quot; src=&quot;http://www.youtube.com/embed/yZO6FkkdJEg&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;&lt;/div&gt;&lt;/figure&gt;
  8007.  
  8008. &lt;p&gt;An overview of &lt;a href=&quot;https://www.amqp.org&quot;&gt;AMQP&lt;/a&gt; and &lt;a href=&quot;https://www.rabbitmq.com&quot;&gt;RabbitMQ&lt;/a&gt;, the poster child for Erlang implementations in the wild, was given by &lt;a href=&quot;https://www.codesync.global/speaker/brett-cameron&quot;&gt;Brett Cameron&lt;/a&gt;:&lt;/p&gt;
  8009. &lt;figure&gt;&lt;div class=&quot;youtube-wrapper&quot;&gt;&lt;iframe class=&quot;youtube-embed&quot; width=&quot;574&quot; height=&quot;323&quot; src=&quot;http://www.youtube.com/embed/wN0jqAHmqXc&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;&lt;/div&gt;&lt;/figure&gt;
  8010.  
  8011. &lt;p&gt;Lastly I include a fun talk given by &lt;a href=&quot;https://www.codesync.global/speaker/simon-thompson&quot;&gt;Simon Thompson&lt;/a&gt; that describes a &lt;em&gt;very functional approach&lt;/em&gt; to implementing lazy evaluation - it’s a great idea for an interview problem too:&lt;/p&gt;
  8012. &lt;figure&gt;&lt;div class=&quot;youtube-wrapper&quot;&gt;&lt;iframe class=&quot;youtube-embed&quot; width=&quot;574&quot; height=&quot;323&quot; src=&quot;http://www.youtube.com/embed/uh7qYf8rC00&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;&lt;/div&gt;&lt;/figure&gt;
  8013.  
  8014. &lt;h2 id=&quot;lightning-talks&quot;&gt;Lightning talks&lt;/h2&gt;
  8015. &lt;p&gt;&lt;a href=&quot;https://www.codesync.global/speaker/brujo-benavides&quot;&gt;Brujo Benavides&lt;/a&gt; talked about &lt;a href=&quot;https://spawnfest.github.io&quot;&gt;SpawnFest&lt;/a&gt; (mostly… okay he also proselytized for ROK style leading commas in your source, but we love him anyway):&lt;/p&gt;
  8016. &lt;iframe src=&quot;//slides.com/elbrujohalcon/deck-3/embed&quot; scrolling=&quot;no&quot; frameborder=&quot;0&quot; webkitallowfullscreen=&quot;&quot; mozallowfullscreen=&quot;&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;
  8017.  
  8018. &lt;p&gt;&lt;a href=&quot;http://joedevivo.com&quot;&gt;Joe DeVivo&lt;/a&gt; talked about a composable modeling scheme for 3D printing with Elixir.&lt;/p&gt;
  8019.  
  8020. &lt;p&gt;&lt;a href=&quot;https://www.codesync.global/speaker/mariano-guerra&quot;&gt;Mariano Guerra&lt;/a&gt; talked about many things, including &lt;a href=&quot;http://efene.org&quot;&gt;Efene&lt;/a&gt; and a call for polyglot ADTs over BEAM.&lt;/p&gt;
  8021.  
  8022. &lt;p&gt;&lt;a href=&quot;https://www.codesync.global/speaker/erik-stenman&quot;&gt;Erik Stenman&lt;/a&gt; talked about using &lt;a href=&quot;http://erlang.org/doc/man/HiPE_app.html&quot;&gt;HiPE&lt;/a&gt; inspection tools to peer at Erlang’s innards.&lt;/p&gt;
  8023.  
  8024. &lt;p&gt;&lt;a href=&quot;https://www.codesync.global/speaker/fred-hebert&quot;&gt;Fred Hebert&lt;/a&gt; talked about what we like and what we hate about Erlang… plus he did it upside down on his laptop:
  8025. &lt;img src=&quot;/images/post_images/fredupsidedown.jpeg&quot; alt=&quot;fred-upside-down&quot; /&gt;&lt;/p&gt;
  8026.  
  8027. &lt;h2 id=&quot;updates-from-the-mothership&quot;&gt;Updates from the mothership&lt;/h2&gt;
  8028. &lt;p&gt;Naturally we were treated to the latest state of the union announcements from the Erlang and Elixir support teams:&lt;/p&gt;
  8029.  
  8030. &lt;h2 id=&quot;update-otp-210&quot;&gt;Update: OTP 21.0&lt;/h2&gt;
  8031. &lt;p&gt;&lt;a href=&quot;https://www.codesync.global/speaker/raimo-niskanen&quot;&gt;Raimo Niskanen&lt;/a&gt; talked about the &lt;a href=&quot;https://youtu.be/hHhm0bfdj-4&quot;&gt;changes coming up in OTP&lt;/a&gt;:&lt;/p&gt;
  8032.  
  8033. &lt;p&gt;OTP 21.0 is due out June 20 2018 - we’re looking forward to full stack traces from exceptions, optimized small maps, the introduction of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;logger&lt;/code&gt; module, and I personally am delighted to see &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lists:search(Pred, List)&lt;/code&gt; added to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lists&lt;/code&gt; module! Don’t judge me.&lt;/p&gt;
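
&lt;p&gt;For anyone who has not tried it yet, here is a minimal sketch of the new function called from Elixir. The list and the predicates are made up for illustration and do not come from Raimo’s talk. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lists:search/2&lt;/code&gt; takes the predicate first and the list second, and returns the first element that satisfies the predicate wrapped in a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;value&lt;/code&gt; tuple, or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;false&lt;/code&gt; if none does:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-elixir&quot; data-lang=&quot;elixir&quot;&gt;# Needs OTP 21 or later, where lists:search/2 first appears.
# The first element greater than 2 is 3.
{:value, 3} = :lists.search(fn x -&gt; x &gt; 2 end, [1, 2, 3, 4])

# Nothing is greater than 10, so the atom false comes back instead.
false = :lists.search(fn x -&gt; x &gt; 10 end, [1, 2, 3, 4])&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;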
  8034.  
  8035. &lt;p&gt;Interesting mentions in the “Longer term plans” segment of Raimo’s summary: more compiler optimizations using Single Static Assignment representation, profiling tool improvements, Erlang Language Server Protocol to get support for Erlang in common editors, oh and the unicorn was spotted - sorry, the JIT compiler is approaching HiPE performance now…&lt;/p&gt;
  8036.  
  8037. &lt;h2 id=&quot;update-elixir-16&quot;&gt;Update: Elixir 1.6&lt;/h2&gt;
  8038. &lt;p&gt;&lt;a href=&quot;https://www.codesync.global/speaker/james-fish&quot;&gt;James Fish&lt;/a&gt; walked us through the &lt;a href=&quot;https://youtu.be/l4ISqxmZQtE&quot;&gt;significant developments in Elixir 1.6&lt;/a&gt;:&lt;/p&gt;
  8039.  
  8040. &lt;p&gt;The Elixir team is focusing on productivity, maintainability, reliability… Version 1.6 features a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;go fmt&lt;/code&gt; style code formatting module, which is good for all the reasons that a stable and universal source code format is good. It is callable from code and from the command line. The AST has been surfaced with an API and some utilities that will allow the term structure of code to be inspected easily. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;require/1,2&lt;/code&gt; has been added to notify one’s compiler and oneself about [in particular] macro dependency relations. Two new attributes, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;@deprecated&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;@since&lt;/code&gt;, mark whether a function or macro is deprecated and when that happened. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;defguard/defguardp&lt;/code&gt; generate macros suitable for use in guard expressions, enhancing readability. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DynamicSupervisor&lt;/code&gt; is a new behaviour, or more precisely a reformed behaviour, and I quote James here: “A replacement for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;simple_one_for_one&lt;/code&gt;, but actually simple”.&lt;/p&gt;
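
&lt;p&gt;To make two of these additions concrete, here is a small sketch, assuming Elixir 1.6 or later. The module and function names are hypothetical and chosen only for illustration; they are not from James’s talk:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-elixir&quot; data-lang=&quot;elixir&quot;&gt;defmodule Sketch.Temperature do
  # Hypothetical module, for illustration only.

  # defguard defines a macro that is allowed inside guard clauses.
  defguard is_valid_kelvin(value) when is_number(value) and value &gt;= 0

  # The custom guard reads like a built-in one.
  def to_celsius(kelvin) when is_valid_kelvin(kelvin), do: kelvin - 273.15
  def to_celsius(_other), do: {:error, :invalid_temperature}

  # @deprecated marks the function that follows it, so that tooling
  # can warn callers and point them at the replacement.
  @deprecated &quot;Use to_celsius/1 instead&quot;
  def kelvin_to_celsius(kelvin), do: to_celsius(kelvin)
end

# A negative value fails the guard and falls through to the second clause.
{:error, :invalid_temperature} = Sketch.Temperature.to_celsius(-1)&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;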
  8041.  
  8042. &lt;p&gt;Plans for the subsequent major release at the end of 2018 (1.7) include a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mix&lt;/code&gt; which will generate releases, a new PBT tool called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;StreamData&lt;/code&gt; which does data generation and simple PBT, better Dialyzer support, and greatly improved documentation.&lt;/p&gt;
  8043.  
  8044. &lt;h2 id=&quot;last-thoughts&quot;&gt;Last thoughts&lt;/h2&gt;
  8045. &lt;p&gt;As usual I’m invigorated by attending this conference. The breadth and depth of work being done using BEAM is remarkable - IoT/Edge networks/embedded applications, language-agnostic testing frameworks, high performance low-latency distributed computation… Erlang and its progeny have passed through a number of major paradigm shifts in this industry and seem to always stay relevant, somehow offering something new even as the tech stack grows broader and deeper.&lt;/p&gt;
  8046.  
  8047. &lt;p&gt;It is always a good feeling to be present at a tech conference when you have a horse in the race - and in this case AdRoll very much does, being prominent Erlang proponents and users. It seems self-evident that showing up and participating is a vital part of selling our technology and engineering culture to our peers, and that can only be a good thing, if for no other reason than helping attract motivated and informed people to our team!&lt;/p&gt;
  8048.  
  8049. &lt;p&gt;&lt;strong&gt;Are you interested in working with Erlang and Elixir in a high-performance globally distributed environment? &lt;a href=&quot;https://www.adroll.com/about/careers/&quot;&gt;Roll with Us&lt;/a&gt;!&lt;/strong&gt;&lt;/p&gt;
  8050.  
  8051. </description>
  8052.    </item>
  8053.    
  8054.    
  8055.    
  8056.    <item>
  8057.      <title>
  8058. Writing Elixir stubs for better testing
  8059. </title>
  8060.      <link>https://tech.nextroll.com/blog/dev/2018/03/28/elixir-stubs-for-tests.html</link>
  8061.      <pubDate>Wed, 28 Mar 2018 00:00:00 -0700</pubDate>
  8062.      <author></author>
  8063.      <guid isPermaLink="false">https://tech.nextroll.com/blog/dev/2018/03/28/elixir-stubs-for-tests</guid>
  8064.      <description>&lt;p&gt;In this post we’re going to see a handy technique to create and use stubs
  8065. for your Elixir projects, leading to better tests, more maintainable code, and
  8066. a lot of fun while using ETS, match specs, and macros.&lt;/p&gt;
  8067.  
  8068. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;15-20 minute read&lt;/code&gt;&lt;/p&gt;
  8069.  
  8070. &lt;hr /&gt;
  8071.  
  8072. &lt;h1 id=&quot;lets-jump-right-in-untestable-code&quot;&gt;Let’s jump right in: Untestable code&lt;/h1&gt;
  8073.  
  8074. &lt;p&gt;As an example, let’s say you have a module that uses some dependency to query
  8075. DynamoDB (although note that this could be &lt;em&gt;any other type of
  8076. external service&lt;/em&gt;), and your code looks like this:&lt;/p&gt;
  8077.  
  8078. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-elixir&quot; data-lang=&quot;elixir&quot;&gt;&lt;span class=&quot;k&quot;&gt;defmodule&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;MyApp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;UsesAws&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8079.  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;do_something_with_dynamodb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;query&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8080.    &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;DynamoDependency&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;query&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  8081.    &lt;span class=&quot;n&quot;&gt;other_stuff&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  8082.  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  8083. &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  8084.  
  8085. &lt;p&gt;As you can see we don’t stand a chance of testing this code without making a real
  8086. request to DynamoDB, because our code is tightly coupled to a specific
  8087. implementation.&lt;/p&gt;
  8088.  
  8089. &lt;p&gt;Experienced Erlang devs would say “mock it”! Let’s dig a bit deeper into that option.&lt;/p&gt;
  8090.  
  8091. &lt;h2 id=&quot;stubs-or-mocks&quot;&gt;Stubs or mocks?&lt;/h2&gt;
  8092. &lt;p&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Method_stub&quot;&gt;A stub&lt;/a&gt; is a piece of code that implements a
  8093. contract, pre-programmed to return specific responses so you can test how your code
  8094. behaves when handling those responses.&lt;/p&gt;
  8095.  
  8096. &lt;p&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Mock_object&quot;&gt;A mock&lt;/a&gt; is similar, but also includes &lt;em&gt;expectations&lt;/em&gt; (like “make sure
  8097. this function is called N times with these args”, etc). You will usually verify
  8098. all these expectations at the end of your test case.&lt;/p&gt;
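
&lt;p&gt;The difference is perhaps easier to see in code. The following is a minimal sketch with hypothetical module names (a made-up weather service, unrelated to the DynamoDB example). The mock is hand-rolled with an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Agent&lt;/code&gt; purely for illustration, not built with any mocking library:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-elixir&quot; data-lang=&quot;elixir&quot;&gt;# Hypothetical modules, for illustration only.

# A stub: canned answers and nothing else.
defmodule Sketch.WeatherStub do
  def forecast(&quot;lisbon&quot;), do: {:ok, :sunny}
  def forecast(_city), do: {:error, :unknown_city}
end

# A hand-rolled mock: a canned answer plus a record of every call,
# so the test can verify its expectations afterwards.
defmodule Sketch.WeatherMock do
  use Agent

  def start_link(_opts \\ []) do
    Agent.start_link(fn -&gt; [] end, name: __MODULE__)
  end

  def forecast(city) do
    Agent.update(__MODULE__, fn calls -&gt; [city | calls] end)
    {:ok, :sunny}
  end

  # Expectation check: was forecast/1 called exactly `times` times with `city`?
  def called?(city, times) do
    Agent.get(__MODULE__, fn calls -&gt;
      Enum.count(calls, &amp;(&amp;1 == city)) == times
    end)
  end
end

# With the stub we only look at the response...
{:ok, :sunny} = Sketch.WeatherStub.forecast(&quot;lisbon&quot;)

# ...with the mock we also check how it was called.
{:ok, _pid} = Sketch.WeatherMock.start_link()
{:ok, :sunny} = Sketch.WeatherMock.forecast(&quot;lisbon&quot;)
true = Sketch.WeatherMock.called?(&quot;lisbon&quot;, 1)&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;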
  8099.  
  8100. &lt;h2 id=&quot;fine-use-meck&quot;&gt;Fine, use meck…&lt;/h2&gt;
  8101. &lt;p&gt;… or rather don’t. &lt;a href=&quot;https://github.com/eproxus/meck&quot;&gt;Meck&lt;/a&gt; is a mocking
  8102. library for Erlang. It is &lt;em&gt;sometimes&lt;/em&gt; used for stubbing rather than mocking.&lt;/p&gt;
  8103.  
  8104. &lt;p&gt;It has a few downsides, as &lt;a href=&quot;https://twitter.com/elbrujohalcon&quot;&gt;Brujo&lt;/a&gt; mentions in
  8105. &lt;a href=&quot;/blog/dev/2018/02/27/erlang-speed-tests.html&quot;&gt;a previous post&lt;/a&gt;, including:&lt;/p&gt;
  8106.  
  8107. &lt;ul&gt;
  8108.  &lt;li&gt;Overhead to set up, tear down, and reset the mocked modules.&lt;/li&gt;
  8109.  &lt;li&gt;Difficulty running tests concurrently.&lt;/li&gt;
  8110.  &lt;li&gt;Complex dependencies which increase as we mock more tests.&lt;/li&gt;
  8111. &lt;/ul&gt;
  8112.  
  8113. &lt;p&gt;With these issues in mind, we should &lt;strong&gt;consider injecting our dependencies, and
  8114. using stubs rather than mocking everything we need&lt;/strong&gt;.&lt;/p&gt;
  8115.  
  8116. &lt;h2 id=&quot;making-the-code-more-testable&quot;&gt;Making the code more testable&lt;/h2&gt;
  8117. &lt;p&gt;With only a small change, and by using &lt;a href=&quot;https://elixir-lang.org/getting-started/module-attributes.html&quot;&gt;module attributes&lt;/a&gt;
  8118. we can really improve the initial situation and end up with this code:&lt;/p&gt;
  8119.  
  8120. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-elixir&quot; data-lang=&quot;elixir&quot;&gt;&lt;span class=&quot;k&quot;&gt;defmodule&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;MyApp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;UsesAws&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8121.  &lt;span class=&quot;nv&quot;&gt;@dynamodb_backend&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Application&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_env&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  8122.    &lt;span class=&quot;ss&quot;&gt;:my_app&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:dynamodb_backend&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;DynamoDefaultImplementation&lt;/span&gt;
  8123.  &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  8124.  
  8125.  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;do_something&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;query&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8126.    &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;@dynamodb_backend&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;query&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  8127.    &lt;span class=&quot;ss&quot;&gt;:expected_value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;
  8128.  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  8129. &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  8130.  
  8131. &lt;p&gt;And that’s how we’ve decoupled our code from a specific implementation. Now we
  8132. can “inject” a stub and avoid using meck altogether.&lt;/p&gt;
  8133.  
  8134. &lt;p&gt;This was actually discussed by José Valim in &lt;a href=&quot;http://blog.plataformatec.com.br/2015/10/mocks-and-explicit-contracts/&quot;&gt;a 2015 post&lt;/a&gt;.&lt;/p&gt;
  8135.  
  8136. &lt;h2 id=&quot;injecting-the-stub&quot;&gt;Injecting the stub&lt;/h2&gt;
  8137. &lt;p&gt;In this case, one way to do it is to define a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;config.exs&lt;/code&gt; file like this:&lt;/p&gt;
  8138.  
  8139. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-elixir&quot; data-lang=&quot;elixir&quot;&gt;&lt;span class=&quot;kn&quot;&gt;use&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Mix&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Config&lt;/span&gt;
  8140.  
  8141. &lt;span class=&quot;c1&quot;&gt;# Our config goes here...&lt;/span&gt;
  8142.  
  8143. &lt;span class=&quot;n&quot;&gt;import_config&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;#{&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Mix&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;.exs&quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  8144.  
  8145. &lt;p&gt;And then a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;test.exs&lt;/code&gt; file like this:&lt;/p&gt;
  8146.  
  8147. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-elixir&quot; data-lang=&quot;elixir&quot;&gt;&lt;span class=&quot;kn&quot;&gt;use&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Mix&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Config&lt;/span&gt;
  8148.  
  8149. &lt;span class=&quot;n&quot;&gt;config&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:my_app&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  8150.  &lt;span class=&quot;ss&quot;&gt;dynamodb_backend:&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;MyApp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Test&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Stub&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Dynamo&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  8151.  
  8152. &lt;p&gt;Then, when running &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mix test&lt;/code&gt;, the right module name (our stub) will be compiled into
  8153. our code and used there.&lt;/p&gt;
  8154.  
  8155. &lt;h2 id=&quot;sample-stub-and-test&quot;&gt;Sample stub and test&lt;/h2&gt;
  8156. &lt;p&gt;Let’s create the stub for this dependency. We’re going to put it in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;test/lib/stub/dynamo.ex&lt;/code&gt;.&lt;/p&gt;
  8157.  
  8158. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-elixir&quot; data-lang=&quot;elixir&quot;&gt;&lt;span class=&quot;k&quot;&gt;defmodule&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;MyApp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Test&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Stub&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Dynamo&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8159.  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;get_item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;this_table&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;this_query&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8160.    &lt;span class=&quot;ss&quot;&gt;:expected_value&lt;/span&gt;
  8161.  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  8162.  
  8163.  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;get_item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8164.    &lt;span class=&quot;ss&quot;&gt;:default_value&lt;/span&gt;
  8165.  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  8166. &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  8167.  
  8168. &lt;p&gt;The test is pretty straightforward:&lt;/p&gt;
  8169.  
  8170. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-elixir&quot; data-lang=&quot;elixir&quot;&gt;&lt;span class=&quot;n&quot;&gt;test&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;the code should do this&quot;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8171.  &lt;span class=&quot;n&quot;&gt;assert&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;MyApp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;UsesAws&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;do_something&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;this_table&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;this_query&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  8172. &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  8173.  
  8174. &lt;p&gt;Our code is completely decoupled from the dependency and we can go ahead and
  8175. test every branch of it without issuing any requests to external actors.&lt;/p&gt;
  8176.  
  8177. &lt;h3 id=&quot;add-our-stub-to-the-compile-path&quot;&gt;Add our stub to the compile path&lt;/h3&gt;
  8178. &lt;p&gt;In our &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mix.exs&lt;/code&gt; file, we should add a &lt;a href=&quot;https://hexdocs.pm/mix/Mix.Tasks.Compile.Elixir.html#module-configuration&quot;&gt;configuration for the compiler&lt;/a&gt;
  8179. adding the path to where our stub modules are located (for example, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;test/lib&lt;/code&gt;):&lt;/p&gt;
  8180.  
  8181. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-elixir&quot; data-lang=&quot;elixir&quot;&gt;&lt;span class=&quot;k&quot;&gt;defmodule&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;MyApp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Mixfile&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8182.  &lt;span class=&quot;kn&quot;&gt;use&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Mix&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Project&lt;/span&gt;
  8183.  &lt;span class=&quot;kn&quot;&gt;require&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Logger&lt;/span&gt;
  8184.  
  8185.  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;project&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8186.    &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  8187.      &lt;span class=&quot;ss&quot;&gt;app:&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:my_app&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  8188.      &lt;span class=&quot;c1&quot;&gt;# A lot of stuff here...&lt;/span&gt;
  8189.      &lt;span class=&quot;ss&quot;&gt;elixirc_paths:&lt;/span&gt;
  8190.        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Mix&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:test&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8191.          &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;lib&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;test/lib&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  8192.        &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;
  8193.          &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;lib&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  8194.        &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  8195.    &lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  8196.  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  8197.  
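&lt;p&gt;The same configuration is often written with a small pattern-matched helper instead of an inline &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;if&lt;/code&gt;. This is only a sketch of that alternative: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;elixirc_paths/1&lt;/code&gt; helper name and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;version&lt;/code&gt; value are assumptions made for illustration, not part of the setup above.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-elixir&quot; data-lang=&quot;elixir&quot;&gt;defmodule MyApp.Mixfile do
  use Mix.Project

  def project do
    [
      app: :my_app,
      # Placeholder version, assumed for this sketch.
      version: &quot;0.1.0&quot;,
      elixirc_paths: elixirc_paths(Mix.env())
    ]
  end

  # Compile the stub modules under test/lib only in the :test environment.
  defp elixirc_paths(:test), do: [&quot;lib&quot;, &quot;test/lib&quot;]
  defp elixirc_paths(_env), do: [&quot;lib&quot;]
end&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Both forms give the compiler the same paths. The helper mainly keeps the keyword list in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;project/0&lt;/code&gt; short.&lt;/p&gt;
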
  8198. &lt;p&gt;So are we done yet? Nah, not even close!&lt;/p&gt;
  8199.  
  8200. &lt;h3 id=&quot;a-good-first-step-still-a-long-way-to-go&quot;&gt;A good first step, still a long way to go&lt;/h3&gt;
  8201. &lt;p&gt;The proposed solution still has a few downsides:&lt;/p&gt;
  8202.  
  8203. &lt;p&gt;1.- We have to add a function clause to our stub for every combination (more
  8204. or less) of arguments to return the needed value.&lt;/p&gt;
  8205.  
  8206. &lt;p&gt;2.- How do we match &lt;em&gt;anything&lt;/em&gt; for some arguments but specific values for others?&lt;/p&gt;
  8207.  
  8208. &lt;p&gt;Let’s see how ETS and match specifications can help us with these issues.&lt;/p&gt;
  8209.  
  8210. &lt;h1 id=&quot;first-improvement-dynamic-matching-of-arguments&quot;&gt;First improvement: Dynamic matching of arguments&lt;/h1&gt;
  8211. &lt;p&gt;Let’s change our stub to support “dynamic configuration”. For every stubbed
  8212. function, we’re going to “record” the needed information in an &lt;a href=&quot;http://erlang.org/doc/man/ets.html&quot;&gt;ETS table&lt;/a&gt;
  8213. and then for every call we’re going to try to find the matching function name
  8214. and arguments from there.&lt;/p&gt;
  8215.  
  8216. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-elixir&quot; data-lang=&quot;elixir&quot;&gt;&lt;span class=&quot;k&quot;&gt;defmodule&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;MyApp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Test&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Stub&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Dynamo&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8217.  &lt;span class=&quot;nv&quot;&gt;@ets&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:dynamo_stub&lt;/span&gt;
  8218.  
  8219.  &lt;span class=&quot;c1&quot;&gt;# We call this one from our tests to set up a specific return&lt;/span&gt;
  8220.  &lt;span class=&quot;c1&quot;&gt;# value for a given function/args. &quot;value&quot; can be a function&lt;/span&gt;
  8221.  &lt;span class=&quot;c1&quot;&gt;# or any other term. If it&apos;s a function, it will be executed&lt;/span&gt;
  8222.  &lt;span class=&quot;c1&quot;&gt;# with the given args and its return value will be returned&lt;/span&gt;
  8223.  &lt;span class=&quot;c1&quot;&gt;# to the caller.&lt;/span&gt;
  8224.  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;respond_to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fun_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;list_of_args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8225.    &lt;span class=&quot;n&quot;&gt;table&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;@ets&lt;/span&gt;
  8226.    &lt;span class=&quot;no&quot;&gt;true&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:ets&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;insert&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fun_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;list_of_args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  8227.  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  8228.  
  8229.  &lt;span class=&quot;c1&quot;&gt;# A stubbed function.&lt;/span&gt;
  8230.  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;get_item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;query&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8231.    &lt;span class=&quot;n&quot;&gt;respond!&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:get_item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;query&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  8232.  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  8233.  
  8234.  &lt;span class=&quot;c1&quot;&gt;# The &quot;magic&quot; happens here. The right combination of function&lt;/span&gt;
  8235.  &lt;span class=&quot;c1&quot;&gt;# name/arguments will be looked up in the ETS. An error is&lt;/span&gt;
  8236.  &lt;span class=&quot;c1&quot;&gt;# raised if none is found. If a function is found as the&lt;/span&gt;
  8237.  &lt;span class=&quot;c1&quot;&gt;# value, it will be executed and its return value will be&lt;/span&gt;
  8238.  &lt;span class=&quot;c1&quot;&gt;# returned to the caller. Otherwise the plain value found&lt;/span&gt;
  8239.  &lt;span class=&quot;c1&quot;&gt;# will be returned.&lt;/span&gt;
  8240.  &lt;span class=&quot;k&quot;&gt;defp&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;respond!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fun_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8241.    &lt;span class=&quot;n&quot;&gt;table&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;@ets&lt;/span&gt;
  8242.    &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:ets&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lookup&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fun_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8243.      &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;RuntimeError&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  8244.        &lt;span class=&quot;s2&quot;&gt;&quot;Didn&apos;t find a response for &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;#{&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;__MODULE__&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;#{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fun_name&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt; &quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;&lt;/span&gt;
  8245.        &lt;span class=&quot;s2&quot;&gt;&quot;with &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;#{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;inspect&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
  8246.      &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_fun_and_args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;when&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  8247.        &lt;span class=&quot;ss&quot;&gt;:erlang&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;apply&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  8248.      &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_fun_and_args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  8249.        &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;
  8250.    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  8251.  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  8252. &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  8253.  
  8254. &lt;p&gt;Then the test code becomes:&lt;/p&gt;
  8255.  
  8256. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-elixir&quot; data-lang=&quot;elixir&quot;&gt;&lt;span class=&quot;n&quot;&gt;test&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;the code should do this&quot;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8257.  &lt;span class=&quot;no&quot;&gt;MyApp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Test&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Stub&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Dynamo&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;respond_to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  8258.    &lt;span class=&quot;ss&quot;&gt;:do_something&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;this_table&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;this_query&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;true&lt;/span&gt;
  8259.  &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  8260.  &lt;span class=&quot;n&quot;&gt;assert&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;MyApp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;UsesAws&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;do_something&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;this_table&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;this_query&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  8261. &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  8262.  
  8263. &lt;p&gt;And let’s not forget that we need to create the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:dynamo_stub&lt;/code&gt; ETS table, which
  8264. can be done in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;test/test_helper.exs&lt;/code&gt;:&lt;/p&gt;
  8265.  
  8266. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-elixir&quot; data-lang=&quot;elixir&quot;&gt;&lt;span class=&quot;ss&quot;&gt;:dynamo_stub&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:ets&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:dynamo_stub&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:named_table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:public&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
  8267.  
  8268. &lt;span class=&quot;no&quot;&gt;ExUnit&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  8269.  
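&lt;p&gt;The lookup-and-dispatch idea behind &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;respond!&lt;/code&gt; can also be tried on its own in a script or in the console. The following is a minimal sketch, not part of the stub itself: the table name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:example_stub&lt;/code&gt;, the sample arguments, and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:no_stub&lt;/code&gt; return value (used here instead of raising) are all made up for illustration.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-elixir&quot; data-lang=&quot;elixir&quot;&gt;# Hypothetical throwaway table; :named_table lets us refer to it by name.
table = :ets.new(:example_stub, [:named_table, :public])

# One plain value and one function value, keyed by {function name, args}.
true = :ets.insert(table, { {:get_item, [&quot;users&quot;, &quot;q1&quot;]}, {:ok, %{id: 1}}})

true =
  :ets.insert(
    table,
    { {:get_item, [&quot;users&quot;, &quot;q2&quot;]}, fn table_name, query -&gt; {:ok, {table_name, query}} end}
  )

# Same dispatch as in the stub: call functions, return anything else as is.
respond = fn fun_name, args -&gt;
  case :ets.lookup(table, {fun_name, args}) do
    [] -&gt; :no_stub
    [{_key, value}] when is_function(value) -&gt; apply(value, args)
    [{_key, value}] -&gt; value
  end
end

plain = respond.(:get_item, [&quot;users&quot;, &quot;q1&quot;])
# plain is {:ok, %{id: 1}}

computed = respond.(:get_item, [&quot;users&quot;, &quot;q2&quot;])
# computed is {:ok, {&quot;users&quot;, &quot;q2&quot;}}

missing = respond.(:get_item, [&quot;orders&quot;, &quot;q1&quot;])
# missing is :no_stub, because the key must match exactly&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The last call shows the limitation of a plain lookup: the key is compared as a whole, so a response is found only when every argument is identical.&lt;/p&gt;
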
  8270. &lt;p&gt;Now there’s no need to create different function clauses per stubbed function. But
  8271. wait, there’s more.&lt;/p&gt;
  8272.  
  8273. &lt;h1 id=&quot;second-improvement-matching-for-any-argument&quot;&gt;Second improvement: Matching for &lt;em&gt;any&lt;/em&gt; argument&lt;/h1&gt;
  8274. &lt;p&gt;One thing (at least) remains: how can we match &lt;em&gt;any&lt;/em&gt; argument? For example,
  8275. let’s say that we’d like our test to look like this:&lt;/p&gt;
  8276.  
  8277. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-elixir&quot; data-lang=&quot;elixir&quot;&gt;&lt;span class=&quot;n&quot;&gt;test&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;the code should do this&quot;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8278.  
  8279.  &lt;span class=&quot;c1&quot;&gt;# Reply with the same value no matter what the table name is&lt;/span&gt;
  8280.  &lt;span class=&quot;no&quot;&gt;MyApp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Test&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Stub&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Dynamo&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;respond_to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  8281.    &lt;span class=&quot;ss&quot;&gt;:do_something&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;this_query&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;true&lt;/span&gt;
  8282.  &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  8283.  &lt;span class=&quot;n&quot;&gt;assert&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;MyApp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;UsesAws&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;do_something&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;any other table&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;this_query&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  8284. &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  8285.  
  8286. &lt;p&gt;Since we’re already using ETS, we can look at a useful Erlang feature called
  8287. “match specifications”.&lt;/p&gt;
  8288.  
  8289. &lt;h2 id=&quot;what-are-match-specifications&quot;&gt;What are match specifications?&lt;/h2&gt;
  8290. &lt;p&gt;To quote &lt;a href=&quot;http://erlang.org/doc/apps/erts/match_spec.html&quot;&gt;Match Specifications in erlang.org&lt;/a&gt;:&lt;/p&gt;
  8291.  
  8292. &lt;blockquote&gt;
  8293.  &lt;p&gt;A “match specification” (match_spec) is an Erlang term describing
  8294. a small “program” that tries to match something. It can be used
  8295. to either control tracing with erlang:trace_pattern/3 or to search
  8296. for objects in an ETS table with for example ets:select/2.&lt;/p&gt;
  8297. &lt;/blockquote&gt;
  8298.  
  8299. &lt;p&gt;The main idea is to use &lt;a href=&quot;http://erlang.org/doc/apps/erts/match_spec.html&quot;&gt;Match Specifications&lt;/a&gt; to
  8300. match the called function arguments and return the right value.&lt;/p&gt;
  8301.  
  8302. &lt;p&gt;Match specifications are &lt;em&gt;really&lt;/em&gt; complex, and we’re only going to use them to match the special
  8303. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:_&lt;/code&gt; symbol as a wildcard meaning &lt;em&gt;any&lt;/em&gt; argument.&lt;/p&gt;
  8304.  
  8305. &lt;p&gt;For this we’re going to use &lt;a href=&quot;http://erlang.org/doc/man/ets.html#test_ms-2&quot;&gt;ets:test_ms/2&lt;/a&gt;,
  8306. which takes a tuple and a match specification and tests the specification against that tuple. A match specification has this shape:&lt;/p&gt;
  8307.  
  8308. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;match_pattern&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;term&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()],&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;term&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()]}]&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  8309.  
  8310. &lt;p&gt;Described in &lt;a href=&quot;http://erlang.org/doc/man/ets.html#select-2&quot;&gt;ets:select/2&lt;/a&gt;&lt;/p&gt;
  8311.  
  8312. &lt;blockquote&gt;
  8313.  &lt;p&gt;This means that the match specification is always a list of one
  8314. or more tuples (of arity 3).&lt;/p&gt;
  8315.  
  8316.  &lt;p&gt;The first element of the tuple is to be a pattern as described
  8317. in match/2.&lt;/p&gt;
  8318.  
  8319.  &lt;p&gt;The second element of the tuple is to be a list of 0 or more
  8320. guard tests.&lt;/p&gt;
  8321.  
  8322.  &lt;p&gt;The third element of the tuple is to be a list containing a
  8323. description of the value to return. In almost all normal cases,
  8324. the list contains exactly one term that fully describes the value
  8325. to return for each object.&lt;/p&gt;
  8326.  
  8327.  &lt;p&gt;The return value is constructed using the “match variables” bound
  8328. in MatchHead or using the special match variables ‘$_’ (the whole
  8329. matching object).&lt;/p&gt;
  8330. &lt;/blockquote&gt;
  8331.  
  8332. &lt;h2 id=&quot;testing-match-specifications&quot;&gt;Testing match specifications&lt;/h2&gt;
  8333. &lt;p&gt;Let’s give it a try in the Elixir console. If we want to stub a function with
  8334. 3 arguments, we could use match specs like this:&lt;/p&gt;
  8335.  
  8336. &lt;h3 id=&quot;matching-by-exact-argument-values&quot;&gt;Matching by exact argument values&lt;/h3&gt;
  8337.  
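  8338. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-elixir&quot; data-lang=&quot;elixir&quot;&gt;&lt;span class=&quot;n&quot;&gt;iex&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:ets&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;test_ms&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;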
  8339.  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:arg1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:arg2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:arg3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  8340.  &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:arg1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:arg2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:arg3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[],&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:&quot;$_&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]}]&lt;/span&gt;
  8341. &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  8342. &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:arg1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:arg2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:arg3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  8343.  
  8344. &lt;p&gt;This means we’d like to match a 3-element tuple with &lt;em&gt;exactly&lt;/em&gt; these arguments. No
  8345. guards are applied, and in case of a match the result is the complete tuple.&lt;/p&gt;
  8346.  
  8347. &lt;h3 id=&quot;matching-for-any-argument&quot;&gt;Matching for &lt;em&gt;any&lt;/em&gt; argument&lt;/h3&gt;
  8348. &lt;p&gt;If we only want to match arguments 1 and 3 and don’t care about argument 2,
  8349. we can put the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:_&lt;/code&gt; wildcard in its place:&lt;/p&gt;
  8350.  
  8351. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-elixir&quot; data-lang=&quot;elixir&quot;&gt;&lt;span class=&quot;n&quot;&gt;iex&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:ets&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;test_ms&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  8352.  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:arg1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:arg2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:arg3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  8353.  &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:arg1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:arg3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[],&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:&quot;$_&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]}]&lt;/span&gt;
  8354. &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  8355. &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:arg1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:arg2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:arg3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  8356.  
  8357. &lt;h3 id=&quot;no-arguments-match&quot;&gt;No arguments match&lt;/h3&gt;
  8358. &lt;p&gt;Cool. Now let’s try the case where we don’t match one or more args:&lt;/p&gt;
  8359.  
  8360. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-elixir&quot; data-lang=&quot;elixir&quot;&gt;&lt;span class=&quot;n&quot;&gt;iex&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:ets&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;test_ms&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  8361.  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:arg1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:arg2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:arg3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  8362.  &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:arg1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:arg4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:arg3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[],&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:&quot;$_&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]}]&lt;/span&gt;
  8363. &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  8364. &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  8365.  
  8366. &lt;p&gt;No results, so this works rather well for our use case, actually.&lt;/p&gt;
  8367.  
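&lt;p&gt;The quoted description also mentions guards and match variables other than &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:&quot;$_&quot;&lt;/code&gt;, which the console examples above do not exercise. The stub does not need them, but a short sketch may help when reading the documentation. The tuples and the threshold below are made up for illustration.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-elixir&quot; data-lang=&quot;elixir&quot;&gt;# Bind the second element to the match variable :&quot;$1&quot; and return only it.
bound =
  :ets.test_ms(
    {:arg1, :arg2, :arg3},
    [{ {:arg1, :&quot;$1&quot;, :arg3}, [], [:&quot;$1&quot;]}]
  )
# bound is {:ok, :arg2}

# A guard: the bound value must be greater than 5 for the spec to match.
spec = [{ {:get_item, :&quot;$1&quot;}, [{:&gt;, :&quot;$1&quot;, 5}], [:&quot;$_&quot;]}]

passes = :ets.test_ms({:get_item, 10}, spec)
# passes is {:ok, {:get_item, 10}}

fails = :ets.test_ms({:get_item, 3}, spec)
# fails is {:ok, false}&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
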
  8368. &lt;h3 id=&quot;updating-our-stub-to-use-match-specs&quot;&gt;Updating our stub to use match specs&lt;/h3&gt;
  8369. &lt;p&gt;Now that we’ve learned how to use match specs for this, let’s change our stub
  8370. module.&lt;/p&gt;
  8371.  
  8372. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-elixir&quot; data-lang=&quot;elixir&quot;&gt;&lt;span class=&quot;k&quot;&gt;defmodule&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;MyApp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Test&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Stub&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Dynamo&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8373.  &lt;span class=&quot;nv&quot;&gt;@ets&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:dynamo_stub&lt;/span&gt;
  8374.  
  8375.  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;respond_to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fun_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;list_of_args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8376.    &lt;span class=&quot;n&quot;&gt;table&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;@ets&lt;/span&gt;
  8377.    &lt;span class=&quot;n&quot;&gt;key&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fun_name&lt;/span&gt;
  8378.    &lt;span class=&quot;c1&quot;&gt;# We&apos;re converting the list of arguments into a tuple because&lt;/span&gt;
  8379.    &lt;span class=&quot;c1&quot;&gt;# match specs need to match on tuple elements. We also add the&lt;/span&gt;
  8380.    &lt;span class=&quot;c1&quot;&gt;# returned value as the first element so we can easily strip&lt;/span&gt;
  8381.    &lt;span class=&quot;c1&quot;&gt;# it later before matching.&lt;/span&gt;
  8382.    &lt;span class=&quot;n&quot;&gt;element&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;to_tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;list_of_args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
  8383.    &lt;span class=&quot;no&quot;&gt;true&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:ets&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;insert&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;element&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  8384.    &lt;span class=&quot;ss&quot;&gt;:ok&lt;/span&gt;
  8385.  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  8386.  
  8387.  &lt;span class=&quot;c1&quot;&gt;# A stubbed function.&lt;/span&gt;
  8388.  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;get_item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;query&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8389.    &lt;span class=&quot;n&quot;&gt;respond!&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:get_item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;query&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  8390.  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  8391.  
  8392.  &lt;span class=&quot;k&quot;&gt;defp&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;respond!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fun_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;list_of_args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8393.    &lt;span class=&quot;n&quot;&gt;table&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;@ets&lt;/span&gt;
  8394.    &lt;span class=&quot;n&quot;&gt;candidate_matches&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:ets&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tab2list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  8395.    &lt;span class=&quot;n&quot;&gt;tuple_to_test&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;to_tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;list_of_args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  8396.  
  8397.    &lt;span class=&quot;c1&quot;&gt;# Iterate through our ets and :ets.test_ms/2 each candidate&lt;/span&gt;
  8398.    &lt;span class=&quot;c1&quot;&gt;# against this call. Note: stubs with exactly the same&lt;/span&gt;
  8399.    &lt;span class=&quot;c1&quot;&gt;# arguments CAN AND WILL step on each other.&lt;/span&gt;
  8400.    &lt;span class=&quot;n&quot;&gt;matches&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:ets&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;foldl&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;^&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fun_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;query&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;acc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  8401.      &lt;span class=&quot;c1&quot;&gt;# Convert to a list so we can extract the value to return&lt;/span&gt;
  8402.      &lt;span class=&quot;c1&quot;&gt;# and match only on arguments.&lt;/span&gt;
  8403.      &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Tuple&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;to_list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;query&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8404.        &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;match_args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  8405.          &lt;span class=&quot;c1&quot;&gt;# Back to tuple and trying to match&lt;/span&gt;
  8406.          &lt;span class=&quot;n&quot;&gt;query_tuple&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;to_tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;match_args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  8407.          &lt;span class=&quot;n&quot;&gt;match_spec&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;query_tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[],&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:&quot;$_&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]}&lt;/span&gt;
  8408.  
  8409.          &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:ets&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;test_ms&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tuple_to_test&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;match_spec&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8410.            &lt;span class=&quot;c1&quot;&gt;# No results, continue.&lt;/span&gt;
  8411.            &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;acc&lt;/span&gt;
  8412.            &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;acc&lt;/span&gt;
  8413.  
  8414.            &lt;span class=&quot;c1&quot;&gt;# Match found, save it and continue.&lt;/span&gt;
  8415.            &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_element&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;match_args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;acc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  8416.          &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  8417.        &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;acc&lt;/span&gt;
  8418.      &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  8419.  
  8420.      &lt;span class=&quot;c1&quot;&gt;# Skip if this stub is intended for other function names&lt;/span&gt;
  8421.      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;acc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;acc&lt;/span&gt;
  8422.    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  8423.    &lt;span class=&quot;p&quot;&gt;[],&lt;/span&gt;
  8424.    &lt;span class=&quot;n&quot;&gt;table&lt;/span&gt;
  8425.    &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  8426.    &lt;span class=&quot;n&quot;&gt;return!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fun_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;list_of_args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;matches&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  8427.  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  8428.  
  8429.  &lt;span class=&quot;k&quot;&gt;defp&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;return!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fun_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;matches&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8430.    &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;matches&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8431.      &lt;span class=&quot;c1&quot;&gt;# Return the first match only&lt;/span&gt;
  8432.      &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8433.        &lt;span class=&quot;ss&quot;&gt;:erlang&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;apply&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  8434.      &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;
  8435.        &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;
  8436.      &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  8437.  
  8438.      &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  8439.        &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;RuntimeError&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  8440.          &lt;span class=&quot;s2&quot;&gt;&quot;Didn&apos;t find a response for &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;#{&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;__MODULE__&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;#{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fun_name&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt; &quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;&lt;/span&gt;
  8441.          &lt;span class=&quot;s2&quot;&gt;&quot;with &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;#{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;inspect&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
  8442.    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  8443.  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  8444. &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  8445.  
  8446. &lt;p&gt;This is so much better. We don’t need to add function clauses with fixed
  8447. arguments and return values in our stub module. Also, we can match on some
  8448. but &lt;em&gt;not all&lt;/em&gt; arguments as needed. But there’s still something else we can do.&lt;/p&gt;
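
&lt;p&gt;To see the partial matching in isolation, here is a minimal sketch that calls &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:ets.test_ms/2&lt;/code&gt; directly, with no table involved. The atoms &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:user_1&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:user_2&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:profile&lt;/code&gt; are made up for illustration:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-elixir&quot; data-lang=&quot;elixir&quot;&gt;# Stub pattern: the first argument is fixed, the second is a wildcard.
match_spec = [{{:user_1, :_}, [], [:&quot;$_&quot;]}]

# Same first argument, any second argument: the tested tuple comes back.
{:ok, {:user_1, :profile}} = :ets.test_ms({:user_1, :profile}, match_spec)

# Different first argument: the spec does not match.
{:ok, false} = :ets.test_ms({:user_2, :profile}, match_spec)&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;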
  8449.  
  8450. &lt;h1 id=&quot;third-improvement-return-the-best-match-less-generic-match&quot;&gt;Third improvement: Return the &lt;em&gt;best&lt;/em&gt; match (&lt;em&gt;less generic&lt;/em&gt; match)&lt;/h1&gt;
  8451. &lt;p&gt;What if one test sets up our stub to return a value when called with the args
  8452. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:arg1, :arg2, :arg3&lt;/code&gt;, while at the same time another one sets it to return a
  8453. different value for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:arg1, :arg2, :_&lt;/code&gt;?&lt;/p&gt;
  8454.  
  8455. &lt;p&gt;We have a conflict there: we can’t know in advance which value is the right
  8456. one to return to the caller, since both configurations will match.&lt;/p&gt;
  8457.  
  8458. &lt;p&gt;In cases like this, we’d like to consistently return the result from the
  8459. stub that best matches (i.e., a specific argument value should be preferred over
  8460. a generic match like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:_&lt;/code&gt;).&lt;/p&gt;
  8461.  
  8462. &lt;p&gt;An easy solution is to sort all the matches from the match specs by the number of
  8463. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:_&lt;/code&gt; wildcards in their arguments and use the one with the fewest.&lt;/p&gt;
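
&lt;p&gt;As a standalone illustration of that ordering, the sketch below ranks two hand-written matches by wildcard count. It uses &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Enum.sort_by/2&lt;/code&gt; for brevity, which is not necessarily what the stub module itself uses, and the values &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:generic_value&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:specific_value&lt;/code&gt; are placeholders:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-elixir&quot; data-lang=&quot;elixir&quot;&gt;# Each match is {args_pattern, value}, the same shape the stub collects.
matches = [
  {[:arg1, :arg2, :_], :generic_value},
  {[:arg1, :arg2, :arg3], :specific_value}
]

# Fewest wildcards first, so the head of the list is the most specific stub.
[{_args, best} | _rest] =
  Enum.sort_by(matches, fn {args, _value} -&amp;gt;
    Enum.count(args, fn arg -&amp;gt; arg === :_ end)
  end)

best
# =&amp;gt; :specific_value&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;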
  8464.  
  8465. &lt;p&gt;So one more time we go to our stub module:&lt;/p&gt;
  8466.  
  8467. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-elixir&quot; data-lang=&quot;elixir&quot;&gt;&lt;span class=&quot;k&quot;&gt;defmodule&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;MyApp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Test&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Stub&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Dynamo&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8468.  &lt;span class=&quot;c1&quot;&gt;# ...&lt;/span&gt;
  8469.  
  8470.  &lt;span class=&quot;k&quot;&gt;defp&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;respond!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fun_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;list_of_args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8471.    &lt;span class=&quot;n&quot;&gt;matches&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# .. everything is still the same, BUT now we do:&lt;/span&gt;
  8472.  
  8473.    &lt;span class=&quot;n&quot;&gt;sorted_matches&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sort_by_less_generic&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;matches&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  8474.    &lt;span class=&quot;n&quot;&gt;return!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fun_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;list_of_args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sorted_matches&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  8475.  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  8476.  
  8477.  &lt;span class=&quot;k&quot;&gt;defp&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sort_by_less_generic&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;matches&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8478.    &lt;span class=&quot;c1&quot;&gt;# Pick the one with the fewest :&quot;_&quot; in the args (should be&lt;/span&gt;
  8479.    &lt;span class=&quot;c1&quot;&gt;# the most specific match)&lt;/span&gt;
  8480.    &lt;span class=&quot;n&quot;&gt;sorted_matches&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
  8481.      &lt;span class=&quot;no&quot;&gt;Enum&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sort&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  8482.        &lt;span class=&quot;n&quot;&gt;matches&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  8483.        &lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a_args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_a_value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b_args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_b_value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  8484.          &lt;span class=&quot;n&quot;&gt;al&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a_args&lt;/span&gt;
  8485.          &lt;span class=&quot;n&quot;&gt;bl&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b_args&lt;/span&gt;
  8486.          &lt;span class=&quot;n&quot;&gt;count_a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Enum&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;al&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;e&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;e&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;===&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:_&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  8487.          &lt;span class=&quot;n&quot;&gt;count_b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Enum&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bl&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;e&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;e&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;===&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:_&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  8488.          &lt;span class=&quot;n&quot;&gt;count_a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;count_b&lt;/span&gt;
  8489.        &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  8490.  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  8491. &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  8492.  
  8493. &lt;h1 id=&quot;fourth-improvement-generic-base-stub-module-as-a-macro&quot;&gt;Fourth improvement: Generic base stub module as a macro&lt;/h1&gt;
  8494. &lt;p&gt;As a “last” improvement (at least in terms of this post) we can wrap up our
  8495. work by creating a generic stub module that can be reused in different mocks
  8496. for our tests. We can do that by embedding our code inside a &lt;a href=&quot;https://elixir-lang.org/getting-started/meta/macros.html&quot;&gt;macro&lt;/a&gt;
  8497. that can then be &lt;a href=&quot;https://elixir-lang.org/getting-started/alias-require-and-import.html#use&quot;&gt;used&lt;/a&gt; in the different stubs.&lt;/p&gt;
  8498.  
  8499. &lt;h2 id=&quot;create-an-ets-table-per-stub&quot;&gt;Create an ETS table per stub&lt;/h2&gt;
  8500. &lt;p&gt;Since we’re going to have multiple stub modules, we need an ETS table
  8501. for each one of them. Let’s set them up in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;test/test_helper.exs&lt;/code&gt;:&lt;/p&gt;
  8502.  
  8503. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-elixir&quot; data-lang=&quot;elixir&quot;&gt;&lt;span class=&quot;n&quot;&gt;stubs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  8504.  &lt;span class=&quot;ss&quot;&gt;dynamodb_backend:&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;MyApp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Test&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Stub&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Dynamo&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  8505.  &lt;span class=&quot;ss&quot;&gt;another_one:&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;MyApp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Test&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Stub&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Another&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  8506.  &lt;span class=&quot;c1&quot;&gt;# ...&lt;/span&gt;
  8507. &lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  8508.  
  8509. &lt;span class=&quot;c1&quot;&gt;# Create the tables, saving the table names in our&lt;/span&gt;
  8510. &lt;span class=&quot;c1&quot;&gt;# application environment so the stubs can get them&lt;/span&gt;
  8511. &lt;span class=&quot;c1&quot;&gt;# later, when needed.&lt;/span&gt;
  8512. &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
  8513.  &lt;span class=&quot;n&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mod&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stubs&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8514.    &lt;span class=&quot;n&quot;&gt;ets_table&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mod&lt;/span&gt;
  8515.  
  8516.    &lt;span class=&quot;no&quot;&gt;Application&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;put_env&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  8517.      &lt;span class=&quot;ss&quot;&gt;:my_app&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  8518.      &lt;span class=&quot;n&quot;&gt;ets_table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  8519.      &lt;span class=&quot;ss&quot;&gt;:ets&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ets_table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  8520.        &lt;span class=&quot;ss&quot;&gt;:public&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  8521.        &lt;span class=&quot;c1&quot;&gt;# We need a duplicate bag here because we&apos;re&lt;/span&gt;
  8522.        &lt;span class=&quot;c1&quot;&gt;# going to use the function name as a key&lt;/span&gt;
  8523.        &lt;span class=&quot;c1&quot;&gt;# (instead of fun_name + args).&lt;/span&gt;
  8524.        &lt;span class=&quot;ss&quot;&gt;:duplicate_bag&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  8525.        &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:read_concurrency&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  8526.        &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:write_concurrency&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  8527.      &lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
  8528.    &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  8529.  
  8530.    &lt;span class=&quot;c1&quot;&gt;# Save the ets table name in the application&lt;/span&gt;
  8531.    &lt;span class=&quot;c1&quot;&gt;# environment so the stub module used can pick&lt;/span&gt;
  8532.    &lt;span class=&quot;c1&quot;&gt;# it up from there.&lt;/span&gt;
  8533.    &lt;span class=&quot;no&quot;&gt;Application&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;put_env&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:my_app&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mod&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  8534.  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  8535.  
  8536. &lt;h2 id=&quot;move-the-code-into-a-generic-stub-module&quot;&gt;Move the code into a generic stub module&lt;/h2&gt;
  8537. &lt;p&gt;The final code is pretty much the same, except that now we get the ETS table
  8538. from the application environment:&lt;/p&gt;
  8539.  
  8540. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-elixir&quot; data-lang=&quot;elixir&quot;&gt;&lt;span class=&quot;k&quot;&gt;defmodule&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;MyApp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Test&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Stub&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Base&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8541.  &lt;span class=&quot;k&quot;&gt;defmacro&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;__using__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_opts&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8542.    &lt;span class=&quot;kn&quot;&gt;quote&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;location:&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:keep&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8543.  
  8544.      &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;respond_to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fun_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;list_of_args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8545.        &lt;span class=&quot;n&quot;&gt;table&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;get_table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  8546.        &lt;span class=&quot;c1&quot;&gt;# ..same as before...&lt;/span&gt;
  8547.      &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  8548.  
  8549.      &lt;span class=&quot;k&quot;&gt;defp&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;respond!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fun_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8550.        &lt;span class=&quot;n&quot;&gt;table&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;get_table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  8551.        &lt;span class=&quot;c1&quot;&gt;# ..same as before...&lt;/span&gt;
  8552.      &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  8553.  
  8554.      &lt;span class=&quot;k&quot;&gt;defp&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sort_by_less_generic&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;matches&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8555.        &lt;span class=&quot;c1&quot;&gt;# ..same as before...&lt;/span&gt;
  8556.      &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  8557.  
  8558.      &lt;span class=&quot;k&quot;&gt;defp&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;return!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fun_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;matches&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8559.        &lt;span class=&quot;c1&quot;&gt;# ..same as before...&lt;/span&gt;
  8560.      &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  8561.  
  8562.      &lt;span class=&quot;k&quot;&gt;defp&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;get_table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8563.        &lt;span class=&quot;no&quot;&gt;Application&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_env&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:my_app&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;__MODULE__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  8564.      &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  8565.    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  8566.  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  8567. &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  8568.  
  8569. &lt;p&gt;Now our original stub will look like this:&lt;/p&gt;
  8570.  
  8571. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-elixir&quot; data-lang=&quot;elixir&quot;&gt;&lt;span class=&quot;k&quot;&gt;defmodule&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;MyApp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Test&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Stub&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Dynamo&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8572.  &lt;span class=&quot;nv&quot;&gt;@moduledoc&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;false&lt;/span&gt;
  8573.  &lt;span class=&quot;kn&quot;&gt;use&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;MyApp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Test&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Stub&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Base&lt;/span&gt;
  8574.  
  8575.  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;get_item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;query&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  8576.    &lt;span class=&quot;n&quot;&gt;respond!&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:get_item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;query&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  8577.  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  8578. &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
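
&lt;p&gt;From a test, using the stub could look roughly like the sketch below. It assumes &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;respond_to/3&lt;/code&gt; stores the stub the way the earlier sections describe (its body is elided above), and the test module name and the atoms used as key and query are hypothetical:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-elixir&quot; data-lang=&quot;elixir&quot;&gt;defmodule MyApp.DynamoStubTest do
  use ExUnit.Case

  alias MyApp.Test.Stub.Dynamo

  test &quot;returns the stubbed value&quot; do
    # Stub get_item/2 for one key and any query.
    Dynamo.respond_to(:get_item, [:some_key, :_], {:ok, :stubbed})

    assert Dynamo.get_item(:some_key, :any_query) == {:ok, :stubbed}
  end
end&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;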
  8579.  
  8580. &lt;p&gt;And that’s pretty much it. Hopefully this will help you write more and better
  8581. tests for your Elixir/Erlang code. Until next time :)&lt;/p&gt;
  8582.  
  8583. &lt;h1 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h1&gt;
  8584. &lt;p&gt;As a final word, it’s worth noticing &lt;em&gt;how much we gained&lt;/em&gt; by investing just
  8585. 20 more minutes in refactoring our code to make it more testable and to allow
  8586. us to write better tests. By writing a few stubs and helpers, we
  8587. avoided mocking global modules, and in the end this is well worth our while:
  8588. our tests run faster, and they are smaller and less complex.&lt;/p&gt;
  8589.  
  8590. &lt;hr /&gt;
  8591.  
  8592. &lt;p&gt;&lt;strong&gt;Do you enjoy building high-quality large-scale systems? &lt;a href=&quot;https://www.adroll.com/about/careers/&quot;&gt;Roll with Us&lt;/a&gt;!&lt;/strong&gt;&lt;/p&gt;
  8593.  
  8594. </description>
  8595.    </item>
  8596.    
  8597.    
  8598.    
  8599.    <item>
  8600.      <title>Series: Tech Women of AdRoll Group</title>
  8601.      <link>https://tech.nextroll.com/blog/culture/2018/03/22/tech-women-of-adroll-group-part-3.html</link>
  8602.      <pubDate>Thu, 22 Mar 2018 00:00:00 -0700</pubDate>
  8603.      <author></author>
  8604.      <guid isPermaLink="false">https://tech.nextroll.com/blog/culture/2018/03/22/tech-women-of-adroll-group-part-3</guid>
  8605.      <description>&lt;p&gt;In the spirit of &lt;a href=&quot;https://womenshistorymonth.gov/&quot;&gt;National Women’s History Month&lt;/a&gt;, we are publishing our third volume of our &lt;a href=&quot;https://blog.adroll.com/tag/tech-women-of-adroll/&quot;&gt;Tech Women of AdRoll Group series&lt;/a&gt; to celebrate and honor women across AdRoll Group (BI, Engineering, Product Management), and to acknowledge their diversity of backgrounds, viewpoints, and experiences. We had Nitasha Syed, creator and writer of the blog series Women of Stem, sit down with the women of AdRoll Group to better understand their backgrounds and what led them to careers in STEM. These women are engineers, accountants, humanitarians, rock climbers and even tattoo artists, who all found their way into STEM careers. Read below to discover their unique and inspirational stories.&lt;/p&gt;
  8606.  
  8607. &lt;p&gt;Also, please &lt;a href=&quot;https://www.eventbrite.com/e/how-to-vocalize-your-achievements-a-free-workshop-for-underrepresented-women-in-tech-tickets-44244068199&quot;&gt;join us&lt;/a&gt; next Tuesday, March 27th @5:30pm for a free workshop we are hosting in partnership with &lt;a href=&quot;https://techqueria.org/&quot;&gt;Techqueria&lt;/a&gt; on “How to Vocalize Your Achievements: A Free Workshop for Underrepresented Women in Tech” led by the amazing Career Success Coach Maria Eleanora.&lt;/p&gt;
  8608.  
  8609. &lt;p&gt;&lt;img alt=&quot;Joyce&quot; src=&quot;/images/post_images/tech-women-joyce.jpg&quot; style=&quot;max-width: 375px&quot; /&gt;&lt;/p&gt;
  8610.  
  8611. &lt;h2 id=&quot;joyce-huynh&quot;&gt;Joyce Huynh&lt;/h2&gt;
  8612.  
  8613. &lt;p&gt;&lt;strong&gt;Sr. Product Manager, Reporting&lt;/strong&gt;&lt;/p&gt;
  8614.  
  8615. &lt;p&gt;“I wasn’t sure what I wanted to be when I grew up, but my family wanted me to be a doctor. When I was in high school I took chemistry and mathematics and found both subjects very interesting. I decided to study chemical engineering, and got my bachelor’s from Berkeley and my Ph.D. from Caltech. During my Ph.D. studies, I discovered working in a lab was too isolating and not something I wanted to pursue as a career.&lt;/p&gt;
  8616.  
  8617. &lt;p&gt;Switching out of science made finding my first job challenging. Despite having to switch fields, my STEM education background enabled me to get my first job as a Data Analyst for a consulting company. After being in consulting for a little over a year, I discovered I wanted more ownership and direct impact from my work. During this time, I learned about and became interested in the Product Management role and decided to make a switch. It’s because of my STEM background that I was able to explore different roles. A STEM education trains you to think critically and focuses on solving problems, which are valuable skills to have regardless of the role.&lt;/p&gt;
  8618.  
  8619. &lt;p&gt;My mom has always been a source of inspiration for me, and pushed me to do my best in everything. I would like to advise parents to make sure they encourage their kids to be whatever they want to be. If your kids have an interest in STEM, help them learn how to think through problems from a young age. When I was younger, we would go to the grocery store and I would calculate the $/ounce and compare it to other options in the store to determine whether we were getting the best deal. Problem solving opportunities are all around us and you just need to push your children to explore them.”&lt;/p&gt;
  8620.  
  8621. &lt;p&gt;&lt;img alt=&quot;Prathibha&quot; src=&quot;/images/post_images/tech-women-prathibha.jpg&quot; style=&quot;max-width: 375px&quot; /&gt;&lt;/p&gt;
  8622.  
  8623. &lt;h2 id=&quot;prathibha-deshikachar&quot;&gt;Prathibha Deshikachar&lt;/h2&gt;
  8624.  
  8625. &lt;p&gt;&lt;strong&gt;Director, Software Engineering&lt;/strong&gt;&lt;/p&gt;
  8626.  
  8627. &lt;p&gt;“A career in science and math was a natural progression from high school, as I grew up surrounded by engineers and scientists. My dad was a math professor at an engineering college, so math made its way to our dining room conversations. I never thought of it as something special or something I was not allowed to seek. It just seemed very natural at that time. When I entered engineering school, I realized that there were fewer girls. The gender disparity only seemed much larger after I started working. I loved the challenges of solving a problem and the sense of accomplishment thereafter. The foundation to this was laid when I was very young. I was always taught to believe in my abilities and constantly challenge myself. This is important for parents today. There are many ways to solve a problem and kids should be encouraged to explore it on their own. It is ok to be frustrated and failure should be considered a stepping stone to success. A career in STEM takes a lot of perseverance, so it is necessary to lay that foundation when kids are young.”&lt;/p&gt;
  8628.  
  8629. &lt;hr /&gt;
  8630.  
  8631. &lt;p&gt;Thanks for reading! If you would like to learn more about our tech culture and tech team, check us out at &lt;a href=&quot;http://tech.adroll.com&quot;&gt;tech.adroll.com&lt;/a&gt;. If you would like to learn about all current open roles by location, check out our &lt;a href=&quot;http://www.adroll.com/about/careers/open-positions&quot;&gt;job board&lt;/a&gt;.&lt;/p&gt;
  8632. </description>
  8633.    </item>
  8634.    
  8635.    
  8636.    
  8637.    <item>
  8638.      <title>
  8639. Testing at warp speed: Why you should care about your test speed
  8640. </title>
  8641.      <link>https://tech.nextroll.com/blog/dev/2018/02/27/erlang-speed-tests.html</link>
  8642.      <pubDate>Tue, 27 Feb 2018 00:00:00 -0800</pubDate>
  8643.      <author></author>
  8644.      <guid isPermaLink="false">https://tech.nextroll.com/blog/dev/2018/02/27/erlang-speed-tests</guid>
  8645.      <description>&lt;p&gt;We reduced our test run times by more than 50%, considerably speeding up development and deployment, by applying a few generic tips that you can use in your own system as well.&lt;/p&gt;
  8646.  
  8647. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;15-20 minute read&lt;/code&gt;&lt;/p&gt;
  8648.  
  8649. &lt;hr /&gt;
  8650.  
  8651. &lt;p&gt;As you can read in &lt;a href=&quot;/blog/dev/2018/01/08/quaff-that-potion-saving-millions-with-elixir-and-erlang.html&quot;&gt;our previous blog post&lt;/a&gt;, at AdRoll we use &lt;a href=&quot;http://www.erlang.org/&quot;&gt;Erlang/OTP&lt;/a&gt; extensively. In particular, we use it to build our &lt;a href=&quot;/blog/web/2014/04/29/valentino-presents-adrolls-rtb-infrastructure.html&quot;&gt;real-time bidding platform&lt;/a&gt;.&lt;/p&gt;
  8652.  
  8653. &lt;p&gt;These systems are central to our business and that’s why we’re very rigorous about writing tests and making sure they all run smoothly in every pull request and before every deploy.&lt;/p&gt;
  8654.  
  8655. &lt;p&gt;We have a fairly complex test structure which has proven useful in keeping our system maintainable. But for these tests to be truly helpful they need to run quickly because we run them all the time. Balancing test completeness and test speed is not an easy task.&lt;/p&gt;
  8656.  
  8657. &lt;p&gt;In our effort to keep that balance, we have discovered quite a few things. This article describes the lessons we have learned ensuring our tests run fast enough to allow developers to work at top speed without compromising the quality of our software.&lt;/p&gt;
  8658.  
  8659. &lt;h1 id=&quot;why-do-you-need-to-keep-your-tests-running-fast&quot;&gt;Why do you need to keep your tests running fast?&lt;/h1&gt;
  8660. &lt;p&gt;It’s not uncommon to find projects where a single test run could last well over 10 minutes. Since those test runs tend to grow steadily over time and the systems they’re testing are also huge, even when everybody knows their tests are &lt;em&gt;slow&lt;/em&gt;, nobody notices &lt;em&gt;how slow&lt;/em&gt; they are until it becomes critical or until a new developer comes along and has to wait forever for tests to complete for the first time. But test runs lasting more than a few minutes have serious consequences…&lt;/p&gt;
  8661.  
  8662. &lt;h2 id=&quot;on-development&quot;&gt;On Development&lt;/h2&gt;
  8663. &lt;p&gt;How many test rounds do you need to develop a new feature or fix a bug? If you practice &lt;a href=&quot;https://en.wikipedia.org/wiki/Test-driven_development&quot;&gt;Test Driven Development&lt;/a&gt; you need at least three (the initial red one, the first green one after implementing your code and the second green one after refactoring). But even if you don’t practice TDD, you will probably still want to test the code you write before submitting a pull request.&lt;/p&gt;
  8664.  
  8665. &lt;p&gt;Three test runs, several minutes each, means a lot of time spent just waiting for tests to run. And more often than not while you wait for tests to run you can’t do much else. Even if you start working on something else in parallel while the tests run, &lt;a href=&quot;https://www.linkedin.com/pulse/context-switching-developers-paul-graham/&quot;&gt;context switching&lt;/a&gt; is generally not good for developers. Besides, your new task will most likely also require running tests.&lt;/p&gt;
  8666.  
  8667. &lt;p&gt;On top of that, if you have configured your &lt;a href=&quot;https://www.thoughtworks.com/continuous-integration&quot;&gt;Continuous Integration&lt;/a&gt; tool as we have (we run a full round of tests for each pull request and for each branch merged into master), then merging even the smallest PR would take at least two full test runs – sometimes many more depending on how many PRs are open at the same time and how fast or parallelizable your CI is.&lt;/p&gt;
  8668.  
  8669. &lt;p&gt;These scenarios lead to two opposite approaches to dealing with long-running tests:&lt;/p&gt;
  8670.  
  8671. &lt;ol&gt;
  8672.  &lt;li&gt;&lt;strong&gt;Being overly careful with your code modifications.&lt;/strong&gt; Not touching anything that is not 100% related to what you’re implementing for fear of making a passing test fail. Behaving like this slows down your refactoring process considerably, affecting the overall code quality of the system.&lt;/li&gt;
  8673.  &lt;li&gt;&lt;strong&gt;Including many unrelated changes in a single PR.&lt;/strong&gt; The goal being to run CI tests &lt;em&gt;just once&lt;/em&gt; and avoid waiting for many PRs to have green lights. This affects your ability to properly review, understand, and revert your changes if needed.&lt;/li&gt;
  8674. &lt;/ol&gt;
  8675.  
  8676. &lt;p&gt;Code reviews are also affected since reviewers will be more wary of requesting changes if those are not explicit merge-blockers. That’s because a change request might require the author to spend another hour running tests on their machine and then CI spending the same amount of time running the tests again. Reducing code reviews to just the most blatant problems affects the culture of your team and also the overall coherence of your code, drastically reducing maintainability.&lt;/p&gt;
  8677.  
  8678. &lt;h2 id=&quot;on-deployment&quot;&gt;On Deployment&lt;/h2&gt;
  8679. &lt;p&gt;We have a general policy of deploying often so that in case a deployment impacts our system performance we can detect the source of the issue quickly. The idea is that since we don’t spend too much time between deploys, we are not introducing too many changes and therefore detecting the one that’s causing problems is easier. You might find yourself in a similar situation.&lt;/p&gt;
  8680.  
  8681. &lt;p&gt;But before each deploy, you want to make sure what you’re deploying actually works as expected. To achieve that, you have to include at least one full test run in your deployment process. In other words, to deploy something you would at least need time for a full test run.&lt;/p&gt;
  8682.  
  8683. &lt;p&gt;That process being slow can also make people overly cautious with deploys, and reduce the available timeframe for them since nobody would want to deploy anything during the last three or four hours of their day. What if the deploy goes south and you have to redeploy what you had before (i.e. rollback)? What if you have to deploy a patch? That may very well require multiple hours of work because you will need time to run tests on your computer, then CI on the corresponding PR, then your build system for the deploy. And that’s assuming there are no comments or failing tests in between and that you can find and fix the bug almost immediately.&lt;/p&gt;
  8684.  
  8685. &lt;p&gt;In a nutshell, it’s not good to have long test runs. Keeping your test run times in check is very important for your system and your team. With that in mind, let me show you what we’ve learned so far…&lt;/p&gt;
  8686.  
  8687. &lt;h1 id=&quot;testing-adrolls-rtb-servers&quot;&gt;Testing AdRoll’s RTB servers&lt;/h1&gt;
  8688.  
  8689. &lt;p&gt;In our system, we have multiple levels of testing:&lt;/p&gt;
  8690.  
  8691. &lt;ul&gt;
  8692.  &lt;li&gt;we write &lt;strong&gt;unit tests&lt;/strong&gt; for our modules with &lt;a href=&quot;http://erlang.org/doc/apps/eunit/&quot;&gt;EUnit&lt;/a&gt;.&lt;/li&gt;
  8693.  &lt;li&gt;we write &lt;strong&gt;integration tests&lt;/strong&gt; with &lt;a href=&quot;http://erlang.org/doc/apps/common_test/&quot;&gt;Common Test&lt;/a&gt;.&lt;/li&gt;
  8694.  &lt;li&gt;we have our own tool to run &lt;strong&gt;black-box tests&lt;/strong&gt;: it runs client simulations against one of our servers, and those simulations are also included in a &lt;a href=&quot;http://erlang.org/doc/apps/common_test/&quot;&gt;Common Test&lt;/a&gt; suite.&lt;/li&gt;
  8695. &lt;/ul&gt;
  8696.  
  8697. &lt;p&gt;For most of our &lt;strong&gt;integration tests&lt;/strong&gt; we need to mock one or more underlying pieces (particularly those that manage connections with databases, download files from the internet or connect with external services) and for that, we use &lt;a href=&quot;https://hex.pm/packages/meck&quot;&gt;meck&lt;/a&gt;.&lt;/p&gt;
  8698.  
  8699. &lt;p&gt;A full test run would run all of the aforementioned tests, starting from the unit tests, all the way through the integration ones and finally the black box suites. For this article, we’ll forget about &lt;strong&gt;unit tests&lt;/strong&gt; (which are, predictably, the fastest ones anyway) and we’ll focus on those we run through &lt;strong&gt;Common Test&lt;/strong&gt;.&lt;/p&gt;
  8700.  
  8701. &lt;h1 id=&quot;benchmarking&quot;&gt;Benchmarking&lt;/h1&gt;
  8702.  
  8703. &lt;p&gt;When trying to improve test speed, the first step is always benchmarking your test runs to find which tests are taking longest. Luckily, &lt;strong&gt;Common Test&lt;/strong&gt; is a very complete and flexible framework that allows us to define hooks. Using &lt;a href=&quot;http://erlang.org/doc/man/ct_hooks.html&quot;&gt;ct_hooks&lt;/a&gt;, we implemented our own hook module. Actually, we implemented two of them: One for regular test runs and one for benchmarking. The code within the benchmarking module looks like…&lt;/p&gt;
  8704.  
  8705. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;c&quot;&gt;%%% @doc Common Test Hook module for benchmarking.
  8706. &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;timer_cth&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  8707.  
  8708. &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;export&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;init&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]).&lt;/span&gt;
  8709. &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;export&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pre_init_per_suite&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]).&lt;/span&gt;
  8710. &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;export&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;post_init_per_suite&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]).&lt;/span&gt;
  8711. &lt;span class=&quot;c&quot;&gt;%% ...and all the other required callbacks
  8712. &lt;/span&gt;
  8713. &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  8714.  &lt;span class=&quot;n&quot;&gt;time_init_suite&lt;/span&gt;     &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;timestamp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;erlang&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;timestamp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt;
  8715.  &lt;span class=&quot;n&quot;&gt;time_end_suite&lt;/span&gt;      &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;timestamp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;erlang&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;timestamp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt;
  8716.  &lt;span class=&quot;n&quot;&gt;time_init_testcase&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;timestamp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;erlang&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;timestamp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt;
  8717.  &lt;span class=&quot;n&quot;&gt;time_end_testcase&lt;/span&gt;   &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;timestamp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;erlang&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;timestamp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt;
  8718.  &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
  8719. &lt;span class=&quot;p&quot;&gt;}).&lt;/span&gt;
  8720.  
  8721. &lt;span class=&quot;nf&quot;&gt;init&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(_&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Opts&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nl&quot;&gt;#state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{}}.&lt;/span&gt;
  8722.  
  8723. &lt;span class=&quot;c&quot;&gt;%% @doc Called before init_per_suite is called.
  8724. &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;pre_init_per_suite&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Suite&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;State&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  8725.  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  8726.    &lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  8727.    &lt;span class=&quot;nv&quot;&gt;State&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;#state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  8728.      &lt;span class=&quot;n&quot;&gt;suite&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Suite&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  8729.      &lt;span class=&quot;n&quot;&gt;time_init_suite&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;timestamp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  8730.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  8731.  &lt;span class=&quot;p&quot;&gt;}.&lt;/span&gt;
  8732.  
  8733. &lt;span class=&quot;c&quot;&gt;%% @doc Called after init_per_suite.
  8734. &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;post_init_per_suite&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Suite&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Return&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;State&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  8735.  &lt;span class=&quot;nn&quot;&gt;io&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  8736.    &lt;span class=&quot;s&quot;&gt;&quot;CT-TIME | &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~8.3f&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; | &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~p&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;:init_per_suite&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  8737.    &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;time_diff&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;State&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;#state.time_init_suite&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Suite&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  8738.  &lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  8739.  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Return&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;State&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}.&lt;/span&gt;
  8740.  
  8741. &lt;span class=&quot;nf&quot;&gt;time_diff&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Init&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  8742.    &lt;span class=&quot;nn&quot;&gt;timer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;now_diff&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;timestamp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Init&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1000000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  8743.  
  8744. &lt;p&gt;With functions as simple as the ones shown above, you can measure (and print) how much time is spent in the different stages of each test suite and test case. And if you use easily greppable strings (in this case &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CT-TIME&lt;/code&gt;) you can pipe your test run into grep and extract just the info you need for future analysis.&lt;/p&gt;
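&lt;p&gt;Once the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CT-TIME&lt;/code&gt; lines have been captured, ranking them is a small step. The following is a minimal sketch, not our actual tooling: the module name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ct_time_report&lt;/code&gt; and the function &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;slowest/2&lt;/code&gt; are hypothetical, and it assumes OTP 20 or later (for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;string:split/3&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;string:trim/1&lt;/code&gt;) and lines shaped exactly like the ones printed by the hook above.&lt;/p&gt;

```erlang
%% Hypothetical helper module (not part of the original hook code):
%% ranks the "CT-TIME | Seconds | Label" lines printed by the benchmarking hook.
-module(ct_time_report).

-export([slowest/2]).

%% Returns the N slowest entries as {Seconds, Label} tuples, slowest first.
%% Lines that do not have the CT-TIME shape are ignored.
slowest(Lines, N) ->
    Parsed = lists:filtermap(fun parse_line/1, Lines),
    Sorted = lists:reverse(lists:sort(Parsed)),
    lists:sublist(Sorted, N).

%% Splits one line on " | " and keeps it only if it has exactly three fields
%% and starts with the CT-TIME marker.
parse_line(Line) ->
    case string:split(Line, " | ", all) of
        ["CT-TIME", Secs, Label] ->
            {true, {list_to_float(string:trim(Secs)), string:trim(Label)}};
        _ ->
            false
    end.
```

&lt;p&gt;For example, calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ct_time_report:slowest(Lines, 10)&lt;/code&gt; on the lines of a captured test log should give you the ten slowest stages, which is usually a good place to start looking.&lt;/p&gt;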
  8745.  
  8746. &lt;blockquote&gt;
  8747.  &lt;h5 id=&quot;lesson-learned&quot;&gt;&lt;strong&gt;Lesson Learned&lt;/strong&gt;&lt;/h5&gt;
  8748.  &lt;p&gt;Use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ct_hooks&lt;/code&gt; to benchmark your tests and remember you can have multiple &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ct_hooks&lt;/code&gt; modules and use the benchmarking one only when you’re interested in the performance of your tests.&lt;/p&gt;
  8749. &lt;/blockquote&gt;
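&lt;p&gt;One way to keep the benchmarking hook opt-in is to choose the hook list when starting the run. The sketch below is only an illustration under stated assumptions, not how our build is wired: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ct_runner&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run/1&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;hooks/1&lt;/code&gt; are hypothetical names, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&quot;test&quot;&lt;/code&gt; is an assumed suite directory, and only &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;timer_cth&lt;/code&gt; comes from the code above. It relies on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ct:run_test/1&lt;/code&gt; accepting the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dir&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ct_hooks&lt;/code&gt; options.&lt;/p&gt;

```erlang
%% Hypothetical runner module: picks the Common Test hook list per run.
-module(ct_runner).

-export([run/1, hooks/1]).

%% run(bench) enables the timing hook; run(regular) runs without it.
%% "test" is an assumed directory containing the *_SUITE modules.
run(Mode) ->
    ct:run_test([{dir, "test"}, {ct_hooks, hooks(Mode)}]).

%% timer_cth is the benchmarking hook module shown earlier.
hooks(bench) -> [timer_cth];
hooks(regular) -> [].
```

&lt;p&gt;With something like this in place, everyday runs pay no cost for the timing output, and a benchmarking run is just a matter of passing a different mode.&lt;/p&gt;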
  8750.  
  8751. &lt;h1 id=&quot;mocking&quot;&gt;Mocking&lt;/h1&gt;
  8752.  
  8753. &lt;p&gt;When you write integration tests, it’s not uncommon to require a fair share of mocking. In Erlang, that’s usually accomplished using &lt;a href=&quot;https://hex.pm/packages/meck&quot;&gt;meck&lt;/a&gt;. &lt;strong&gt;Meck&lt;/strong&gt; is a very flexible mocking library that allows you to temporarily replace existing modules with a new implementation for some or all of their functions. The usual lifecycle of a mock built with &lt;strong&gt;Meck&lt;/strong&gt; looks something like this…&lt;/p&gt;
  8754.  
  8755. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
  8756. &lt;span class=&quot;c&quot;&gt;% initialize your mock
  8757. &lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;meck&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;your_module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;passthrough&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]),&lt;/span&gt;
  8758. &lt;span class=&quot;nn&quot;&gt;meck&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;expect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  8759.  &lt;span class=&quot;n&quot;&gt;your_module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  8760.  &lt;span class=&quot;n&quot;&gt;a_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  8761.  &lt;span class=&quot;k&quot;&gt;fun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;return_something_trivial&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  8762. &lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  8763.  
  8764. &lt;span class=&quot;c&quot;&gt;% run your test using the mock...
  8765. &lt;/span&gt;
  8766. &lt;span class=&quot;c&quot;&gt;% maybe check stuff with meck, for instance...
  8767. &lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;meck&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;num_calls&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;your_module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;some_expected_arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]),&lt;/span&gt;
  8768.  
  8769. &lt;span class=&quot;c&quot;&gt;% destroy the mock
  8770. &lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;meck&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;unload&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;your_module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  8771. &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  8772.  
  8773. &lt;p&gt;But if you write exactly that code in your test cases, you may run into trouble. If your test fails before calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;meck:unload/1&lt;/code&gt;, that module will remain mocked and that may affect other test cases. Depending on how you have configured your test suites, &lt;strong&gt;Common Test&lt;/strong&gt; sometimes boots up different processes for different tests and sometimes it reuses the same process. By default, &lt;strong&gt;Meck&lt;/strong&gt; links mocks to their creating processes, so if your process crashes or stops, the mock is removed. But if &lt;strong&gt;Common Test&lt;/strong&gt; uses the same process to run the next test case, the module will remain mocked, and later test cases will unknowingly run against the leftover mock.&lt;/p&gt;
  8774.  
  8775. &lt;p&gt;That’s why you usually want to put mocking initialization and unloading outside your main test case function and you can do that with &lt;strong&gt;Common Test&lt;/strong&gt;:&lt;/p&gt;
  8776.  
  8777. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;nf&quot;&gt;init_per_testcase&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;the_testcase&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  8778.  &lt;span class=&quot;c&quot;&gt;% initialize your mock
  8779. &lt;/span&gt;  &lt;span class=&quot;nn&quot;&gt;meck&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;your_module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;passthrough&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]),&lt;/span&gt;
  8780.  &lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;
  8781.  
  8782. &lt;span class=&quot;nf&quot;&gt;the_testcase&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  8783.  &lt;span class=&quot;nn&quot;&gt;meck&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;expect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  8784.    &lt;span class=&quot;n&quot;&gt;your_module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  8785.    &lt;span class=&quot;n&quot;&gt;a_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  8786.    &lt;span class=&quot;k&quot;&gt;fun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;return_something_trivial&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  8787.  &lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  8788.  
  8789.  &lt;span class=&quot;c&quot;&gt;% run your test using the mock...
  8790. &lt;/span&gt;
  8791.  &lt;span class=&quot;c&quot;&gt;% maybe check stuff with meck, for instance...
  8792. &lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;meck&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;num_calls&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;your_module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;some_expected_arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]),&lt;/span&gt;
  8793.  &lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;
  8794.  
  8795. &lt;span class=&quot;nf&quot;&gt;end_per_testcase&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;the_testcase&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  8796.  &lt;span class=&quot;c&quot;&gt;% destroy the mock
  8797. &lt;/span&gt;  &lt;span class=&quot;nn&quot;&gt;meck&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;unload&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;your_module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  8798.  &lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  8799.  
  8800. &lt;p&gt;This is also very handy if you want to mock the same module in all your tests. You just need to replace &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;init_per_testcase&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;end_per_testcase&lt;/code&gt; function clause heads with something like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;init_per_testcase(_TestCase, Config)&lt;/code&gt; et voilà!&lt;/p&gt;
  8801.  
  8802. &lt;p&gt;But… What about performance? Now that you are mocking the same modules over and over again, it is worth asking yourself if unloading and recreating them between each pair of test cases is affecting the overall time of your test runs or not.&lt;/p&gt;
  8803.  
  8804. &lt;p&gt;Turns out, it is. &lt;strong&gt;Meck&lt;/strong&gt; dynamically compiles a new module each time you create a mock and then boots up a new &lt;a href=&quot;http://erlang.org/doc/design_principles/gen_server_concepts.html&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gen_server&lt;/code&gt;&lt;/a&gt; for it (to track the history of calls to its functions, among other things). Compiling Erlang code takes several milliseconds per module (maybe even seconds, if your modules are long enough).
  8805. Well then, we can just create the mocks at the beginning of our suite and unload them at the end. But if we do so, how do we avoid the issue mentioned above (namely, that modules/functions remain mocked between test cases)? This is the solution we are currently using:&lt;/p&gt;
  8806.  
  8807. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;c&quot;&gt;%% We initialize our mocks for the whole suite here.
  8808. %% Notice the no_link: we want mocks to stay with us until the end of the
  8809. %% suite, regardless of how many processes common_test wants to use.
  8810. &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;init_per_suite&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  8811.  &lt;span class=&quot;nn&quot;&gt;meck&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;module1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;module2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;passthrough&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;no_link&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]),&lt;/span&gt;
         &lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;
  8812.  
  8813. &lt;span class=&quot;c&quot;&gt;%% We don&apos;t do much here except for those functions
  8814. %% that we want to always mock in the same way (e.g. DB connections).
  8815. &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;init_per_testcase&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(_&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;TestCase&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  8816.  &lt;span class=&quot;nn&quot;&gt;meck&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;expect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;module1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;function1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  8817.  &lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;
  8818.  
  8819. &lt;span class=&quot;c&quot;&gt;%% Our test cases remain the same
  8820. &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;the_testcase&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  8821.  &lt;span class=&quot;nn&quot;&gt;meck&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;expect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  8822.    &lt;span class=&quot;n&quot;&gt;your_module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  8823.    &lt;span class=&quot;n&quot;&gt;a_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  8824.    &lt;span class=&quot;k&quot;&gt;fun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;return_something_trivial&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  8825.  &lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  8826.  
  8827.  &lt;span class=&quot;c&quot;&gt;% run your test using the mock...
  8828. &lt;/span&gt;
  8829.  &lt;span class=&quot;c&quot;&gt;% maybe check stuff with meck, for instance...
  8830. &lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;meck&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;num_calls&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;your_module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;some_expected_arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]),&lt;/span&gt;
  8831.  &lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;
  8832.  
  8833. &lt;span class=&quot;c&quot;&gt;%% We remove all expects here so that modules, even when mocked,
  8834. %% are basically _clean_ (i.e. they work as if they weren&apos;t mocked at all)
  8835. %% That&apos;s thanks to the passthrough option we provided on init_per_suite
  8836. &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;end_per_testcase&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(_&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;TestCase&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  8837.  &lt;span class=&quot;nv&quot;&gt;Modules&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;module1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;module2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;
  8838.  &lt;span class=&quot;nn&quot;&gt;meck&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;reset&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Modules&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  8839.  &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;meck&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;delete&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Fun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Arity&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  8840.  &lt;span class=&quot;p&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Fun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Arity&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;meck&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;expects&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Modules&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)],&lt;/span&gt;
  8841.  &lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;
  8842.  
  8843. &lt;span class=&quot;c&quot;&gt;%% we do destroy all mocks here
  8844. &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;end_per_suite&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  8845.  &lt;span class=&quot;nn&quot;&gt;meck&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;unload&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;module1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;module2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]).&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  8846.  
  8847. &lt;p&gt;A few things to notice:&lt;/p&gt;
  8848.  
  8849. &lt;ul&gt;
  8850.  &lt;li&gt;&lt;strong&gt;Meck&lt;/strong&gt; doesn’t have a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;delete_all&lt;/code&gt; function. That’s why we have a list comprehension in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;end_per_testcase&lt;/code&gt;.&lt;/li&gt;
  8851.  &lt;li&gt;We need to clean up module statistics as well as the expects; that’s why we’re calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;meck:reset/1&lt;/code&gt;.&lt;/li&gt;
  8852.  &lt;li&gt;Each call to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;meck:delete/4&lt;/code&gt; is not cheap, but it’s way cheaper than &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;meck:new/2&lt;/code&gt;. We only want to call it for the functions that we are &lt;em&gt;actually&lt;/em&gt; mocking, that’s why we use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;meck:expects(Modules, true)&lt;/code&gt; instead of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;meck:expects(Modules)&lt;/code&gt; (which would include the passthrough functions) or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Module:module_info(exports)&lt;/code&gt; (which would include all the functions in the module).&lt;/li&gt;
  8853. &lt;/ul&gt;
  8854.  
  8855. &lt;blockquote&gt;
  8856.  &lt;h5 id=&quot;lesson-learned-1&quot;&gt;&lt;strong&gt;Lesson Learned&lt;/strong&gt;&lt;/h5&gt;
  8857.  &lt;p&gt;If you’re going to reuse mocks in all your tests, create them in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;init_per_suite&lt;/code&gt;, delete the expectations in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;end_per_testcase&lt;/code&gt; and destroy them in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;end_per_suite&lt;/code&gt;.&lt;/p&gt;
  8858. &lt;/blockquote&gt;
  8859.  
  8860. &lt;h1 id=&quot;ct-comments&quot;&gt;CT Comments&lt;/h1&gt;
  8861.  
  8862. &lt;p&gt;&lt;strong&gt;Common Test&lt;/strong&gt; comes with more than a few little-known secrets. One of my favorites is &lt;a href=&quot;http://erlang.org/doc/man/ct.html#comment-1&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ct:comment/1,2&lt;/code&gt;&lt;/a&gt; together with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;{comment, &quot;&quot;}&lt;/code&gt;. They let you add context to your tests that ends up in the generated HTML summary of your test runs, so you can see at a glance where a test failed.
  8863. For instance, consider this test:&lt;/p&gt;
  8864.  
  8865. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;nf&quot;&gt;failing_test&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(_&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  8866.  &lt;span class=&quot;nn&quot;&gt;ct&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;comment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Dividing by zero using div should not fail&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  8867.  &lt;span class=&quot;n&quot;&gt;infinity&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;div&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  8868.  
  8869.  &lt;span class=&quot;nn&quot;&gt;ct&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;comment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Dividing by zero should not fail&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  8870.  &lt;span class=&quot;n&quot;&gt;infinity&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  8871.  
  8872.  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;comment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  8873.  
  8874. &lt;p&gt;That test will obviously fail and thanks to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ct:comment/1&lt;/code&gt; you will get something like this in your test report:&lt;/p&gt;
  8875.  
  8876. &lt;p&gt;&lt;img src=&quot;/images/post_images/erlang_tests/ct_comment.png&quot; alt=&quot;CT Comment in Test Report&quot; /&gt;&lt;/p&gt;
  8877.  
  8878. &lt;p&gt;It’s a really nice feature but, as you might already be guessing at this point, it has some performance issues. Calls to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ct:comment/1,2&lt;/code&gt; are not really that bad in and of themselves; they take 100ms or so. But if you write tests like the following one…&lt;/p&gt;
  8879.  
  8880. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;nf&quot;&gt;failing_test&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(_&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  8881.  &lt;span class=&quot;nn&quot;&gt;lists&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;foreach&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  8882.    &lt;span class=&quot;k&quot;&gt;fun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;I&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  8883.      &lt;span class=&quot;nn&quot;&gt;ct&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;comment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Dividing by &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~p&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; using div should not fail&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;I&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]),&lt;/span&gt;
  8884.      &lt;span class=&quot;n&quot;&gt;infinity&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;div&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;I&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  8885.  
  8886.      &lt;span class=&quot;nn&quot;&gt;ct&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;comment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Dividing by &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~p&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; should not fail&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;I&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]),&lt;/span&gt;
  8887.      &lt;span class=&quot;n&quot;&gt;infinity&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;I&lt;/span&gt;
  8888.    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;lists&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;seq&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)),&lt;/span&gt;
  8889.  
  8890.  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;comment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  8891.  
  8892. &lt;p&gt;In cases like the one above, each run that gets through the whole list calls &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ct:comment&lt;/code&gt; about 200 times (twice per iteration), and that certainly adds up.&lt;/p&gt;
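&lt;p&gt;One possible way to keep some context without paying for a comment on every iteration is to comment once before the loop and attach the failing input to the error itself. The following is only a sketch: the module &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ct_comment_sketch&lt;/code&gt; and the helper &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;check_all/2&lt;/code&gt; are hypothetical names, not part of Common Test, and the try/catch form assumes OTP 21 or later.&lt;/p&gt;

```erlang
-module(ct_comment_sketch).
-export([check_all/2]).

%% Hypothetical helper: sets a single comment for the whole loop, then
%% runs CheckFun on every input without calling ct:comment again.
check_all(Inputs, CheckFun) ->
  ct:comment("Checking ~p inputs", [length(Inputs)]),
  lists:foreach(
    fun(Input) ->
      try
        CheckFun(Input)
      catch
        Class:Reason:Stacktrace ->
          %% Attach the failing input to the error instead of
          %% commenting on every iteration.
          erlang:raise(Class, {failed_on, Input, Reason}, Stacktrace)
      end
    end,
    Inputs).
```

&lt;p&gt;A test case could then call something like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;check_all(lists:seq(1, 100), fun(I) -&gt; 1 = I div I end)&lt;/code&gt;; if a check fails, the failure reason should carry the offending input, so the report still points at the iteration that broke.&lt;/p&gt;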
  8893.  
  8894. &lt;blockquote&gt;
  8895.  &lt;h5 id=&quot;lesson-learned-2&quot;&gt;&lt;strong&gt;Lesson Learned&lt;/strong&gt;&lt;/h5&gt;
  8896.  &lt;p&gt;If you’re going to use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ct:comment/1,2&lt;/code&gt; make sure you’re not using it within a recursive function/loop.&lt;/p&gt;
  8897. &lt;/blockquote&gt;
  8898.  
  8899. &lt;h1 id=&quot;parallelization&quot;&gt;Parallelization&lt;/h1&gt;
  8900.  
  8901. &lt;p&gt;&lt;strong&gt;Common Test&lt;/strong&gt; lets you choose how to run your tests with &lt;a href=&quot;http://erlang.org/doc/man/common_test.html#Module:groups-0&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;groups/0&lt;/code&gt;&lt;/a&gt;. You can run tests multiple times, in sequential order, in parallel, etc. But sometimes, for instance with our &lt;strong&gt;black-box tests&lt;/strong&gt;, you can’t just &lt;em&gt;run all the tests in parallel&lt;/em&gt; to speed up the process as a whole. In our case, each of our tests is booting up a different instance of our server (with different configuration parameters) and hitting it with our client simulator to verify that it produces the responses we expect. In theory, we could run these tests in parallel, booting up as many servers as we need, etc. But in practice, that requires a lot of isolation-related effort that is just not worth it.&lt;/p&gt;
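&lt;p&gt;For reference, a suite whose cases are independent enough to run in parallel could be declared roughly as follows. This is a minimal sketch: the suite name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;parallel_groups_SUITE&lt;/code&gt;, the group name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;fast_tests&lt;/code&gt; and the empty test bodies are placeholders chosen for illustration.&lt;/p&gt;

```erlang
-module(parallel_groups_SUITE).
-export([all/0, groups/0, foo/1, bar/1]).

%% Run every test case that belongs to the (hypothetical) fast_tests group.
all() ->
  [{group, fast_tests}].

%% The parallel property asks Common Test to run the cases
%% of this group concurrently instead of one after another.
groups() ->
  [{fast_tests, [parallel], [foo, bar]}].

%% Placeholder test cases; real ones would exercise the system under test.
foo(_Config) ->
  ok.

bar(_Config) ->
  ok.
```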
  8902.  
  8903. &lt;p&gt;Nevertheless, it’s worth considering that the parallelism that &lt;strong&gt;Common Test&lt;/strong&gt; gives you is certainly not the only one you can take advantage of. Check how our tests looked when we realized this:&lt;/p&gt;
  8904.  
  8905. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;nf&quot;&gt;init_per_testcase&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Exchange&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  8906.  &lt;span class=&quot;nv&quot;&gt;Folder&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;local_folder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Exchange&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  8907.  &lt;span class=&quot;nf&quot;&gt;download_samples_from_s3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Exchange&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Folder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  8908.  &lt;span class=&quot;nf&quot;&gt;boot_up_server&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Exchange&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  8909.  &lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;
  8910.  
  8911. &lt;span class=&quot;nf&quot;&gt;foo&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(_&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  8912.  &lt;span class=&quot;c&quot;&gt;% run client tests against the server
  8913. &lt;/span&gt;  &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
  8914.  &lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;
  8915.  
  8916. &lt;span class=&quot;nf&quot;&gt;bar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(_&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  8917.  &lt;span class=&quot;c&quot;&gt;% run client tests against the server
  8918. &lt;/span&gt;  &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
  8919.  &lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;
  8920.  
  8921. &lt;span class=&quot;nf&quot;&gt;end_per_testcase&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Exchange&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  8922.  &lt;span class=&quot;nv&quot;&gt;Folder&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;local_folder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Exchange&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  8923.  &lt;span class=&quot;nf&quot;&gt;delete_folder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Folder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  8924.  &lt;span class=&quot;nf&quot;&gt;tear_down_server&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Exchange&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  8925.  &lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  8926.  
  8927. &lt;p&gt;As you can see, we were performing all these operations (i.e. download from S3, server start, test run, folder removal, server stop) sequentially for each test case. Also, the file download from S3 was one of the most expensive operations in that suite. So, we refactored that code into something like this…&lt;/p&gt;
  8928.  
  8929. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;nf&quot;&gt;init_per_suite&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  8930.  &lt;span class=&quot;nf&quot;&gt;download_all_samples&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt;
  8931.  &lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;
  8932.  
  8933. &lt;span class=&quot;nf&quot;&gt;init_per_testcase&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Exchange&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  8934.  &lt;span class=&quot;nf&quot;&gt;boot_up_server&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Exchange&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  8935.  &lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;
  8936.  
  8937. &lt;span class=&quot;nf&quot;&gt;foo&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(_&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  8938.  &lt;span class=&quot;c&quot;&gt;% run client tests against the server
  8939. &lt;/span&gt;  &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
  8940.  &lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;
  8941.  
  8942. &lt;span class=&quot;nf&quot;&gt;bar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(_&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  8943.  &lt;span class=&quot;c&quot;&gt;% run client tests against the server
  8944. &lt;/span&gt;  &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
  8945.  &lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;
  8946.  
  8947. &lt;span class=&quot;nf&quot;&gt;end_per_testcase&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Exchange&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  8948.  &lt;span class=&quot;nf&quot;&gt;tear_down_server&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Exchange&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  8949.  &lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;
  8950.  
  8951. &lt;span class=&quot;nf&quot;&gt;end_per_suite&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  8952.  &lt;span class=&quot;nf&quot;&gt;delete_files&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  8953.  
  8954. &lt;p&gt;And we made sure that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;download_all_samples/0&lt;/code&gt; downloaded as many files from S3 as it could in parallel, spawning several Erlang processes and waiting for a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;done&lt;/code&gt; signal from each of them.&lt;/p&gt;
  8955.  
  8956. &lt;blockquote&gt;
  8957.  &lt;h5 id=&quot;lesson-learned-3&quot;&gt;&lt;strong&gt;Lesson Learned&lt;/strong&gt;&lt;/h5&gt;
  8958.  &lt;p&gt;You can use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;groups/0&lt;/code&gt; in your suites to parallelize tests, but even if you can’t parallelize whole test cases, you can still use pure Erlang to simultaneously run some parts of them.&lt;/p&gt;
  8959. &lt;/blockquote&gt;
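&lt;p&gt;A minimal sketch of that idea, for illustration only and not the suite’s actual code: the helper below runs a list of zero-arity funs in separate Erlang processes and waits for a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;done&lt;/code&gt; message from each one. The module name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;parallel_sketch&lt;/code&gt; and the function &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run_all/1&lt;/code&gt; are hypothetical.&lt;/p&gt;

```erlang
-module(parallel_sketch).
-export([run_all/1]).

%% Hypothetical helper: run every zero-arity fun in Jobs in its own
%% process and return the results in the same order as Jobs.
run_all(Jobs) ->
    Parent = self(),
    %% One unique reference per job ties each `done` message to its worker.
    Refs = lists:map(fun(Job) ->
                         Ref = make_ref(),
                         spawn_link(fun() -> Parent ! {done, Ref, Job()} end),
                         Ref
                     end, Jobs),
    %% Ref is already bound inside the fun, so each receive waits for
    %% exactly one worker; results therefore come back in job order.
    lists:map(fun(Ref) ->
                  receive {done, Ref, Result} -> Result end
              end, Refs).
```

&lt;p&gt;Calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run_all/1&lt;/code&gt; with two funs that compute 1 + 1 and 2 * 3 returns &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[2, 6]&lt;/code&gt;, whichever process finishes first. Because the workers are linked, a crash in any of them also takes down the caller, which is usually what a test wants.&lt;/p&gt;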
  8960.  
  8961. &lt;h1 id=&quot;what-about-you&quot;&gt;What About You?&lt;/h1&gt;
  8962. &lt;p&gt;We found all these things in our constant effort to have quick tests and an agile development methodology. We’re sure we’re not alone here, so why don’t you let us know how &lt;strong&gt;you&lt;/strong&gt; speed up your tests in the comments below?&lt;/p&gt;
  8963.  
  8964. &lt;hr /&gt;
  8965.  
  8966. &lt;p&gt;&lt;strong&gt;Do you enjoy building high-quality large-scale systems? &lt;a href=&quot;https://www.adroll.com/about/careers/&quot;&gt;Roll with Us&lt;/a&gt;!&lt;/strong&gt;&lt;/p&gt;
  8967.  
  8968. </description>
  8969.    </item>
  8970.    
  8971.    
  8972.    
  8973.    <item>
  8974.      <title>
  8975. Quaff that potion: saving $millions with Elixir and Erlang
  8976. </title>
  8977.      <link>https://tech.nextroll.com/blog/dev/2018/01/08/quaff-that-potion-saving-millions-with-elixir-and-erlang.html</link>
  8978.      <pubDate>Mon, 08 Jan 2018 00:00:00 -0800</pubDate>
  8979.      <author></author>
  8980.      <guid isPermaLink="false">https://tech.nextroll.com/blog/dev/2018/01/08/quaff-that-potion-saving-millions-with-elixir-and-erlang</guid>
  8981.      <description>&lt;style type=&quot;text/css&quot;&gt;/*&lt;![CDATA[*/
  8982.  
  8983.  .highlight code.erlang { background-color: #315050; }
  8984.  .highlight code.erlang   .c { color: #FED633; }
  8985.  .highlight code.erlang   .n { color: #CBBA77; }
  8986.  .highlight code.erlang .err { color: #CBBA77; background-color: #315050; }
  8987.  .highlight code.erlang  .nl { color: yellow; }
  8988.  .highlight code.erlang  .nv { color: #10D010; }
  8989.  .highlight code.erlang   .o { color: #CBBA77; }
  8990.  .highlight code.erlang   .p { color: #CBBA77; }
  8991.  .highlight code.erlang   .s { color: #FD9226; }
  8992.  .highlight code.erlang  .ni { color: #18E7AE; }
  8993.  .highlight code.erlang  .nn { color: #CBBA77; }
  8994.  .highlight code.erlang  .kn { color: #4DFFFF; }
  8995.  .highlight code.erlang   .k { color: #4DFFFF; }
  8996.  .highlight code.erlang  .ow { color: #2DFFFE; }
  8997.  .highlight code.erlang  .nb { color: #29F89D; }
  8998.  .highlight code.erlang  .nf { color: #29F89D; }
  8999.  .highlight code.erlang  .bp { color: #FEA07E; }
  9000.  .highlight code.erlang  .mi { color: #CBBA77; }
  9001.  
  9002. /*]]&gt;*/&lt;/style&gt;
  9003.  
  9004. &lt;p&gt;We slashed our DynamoDB costs by over 75% using Kinesis, DynamoDB streams, and
  9005.  Erlang/OTP (and now Elixir) to implement a global cache warming system.  We present that
  9006.  system and two new open-source libraries for processing Kinesis and DynamoDB streams in
  9007.  a similar way using Elixir and Erlang.&lt;/p&gt;
  9008.  
  9009. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;15-20 minute read&lt;/code&gt;&lt;/p&gt;
  9010.  
  9011. &lt;hr /&gt;
  9012.  
  9013. &lt;p&gt;AdRoll uses &lt;a href=&quot;http://www.erlang.org/&quot;&gt;Erlang/OTP&lt;/a&gt; as the basis for several internal products, including a
  9014.  &lt;a href=&quot;/blog/web/2014/04/29/valentino-presents-adrolls-rtb-infrastructure.html&quot;&gt;real-time bidding platform&lt;/a&gt; running on &lt;a href=&quot;https://aws.amazon.com/ec2/&quot;&gt;Amazon EC2&lt;/a&gt;.
  9015.  Erlang/OTP is the king of robust highly-concurrent soft real-time systems such as these.&lt;/p&gt;
  9016.  
  9017. &lt;p&gt;This article describes how we substantially reduced the cost of an element of our
  9018.  real-time bidding platform (its DynamoDB usage) by implementing a global cache warmer
  9019.  using &lt;a href=&quot;https://aws.amazon.com/kinesis/&quot;&gt;Kinesis&lt;/a&gt; and &lt;a href=&quot;http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.html&quot;&gt;DynamoDB streams&lt;/a&gt;, written in
  9020.  Erlang.  We later doubled the performance of that component by adapting it to use the
  9021.  &lt;a href=&quot;https://hexdocs.pm/flow/Flow.html&quot;&gt;Flow&lt;/a&gt; framework from the authors of &lt;a href=&quot;https://elixir-lang.org/&quot;&gt;Elixir&lt;/a&gt;.&lt;/p&gt;
  9022.  
  9023. &lt;p&gt;We also present two new open-source libraries for doing this kind of processing using
  9024.  Elixir and/or Erlang.&lt;/p&gt;
  9025.  
  9026. &lt;h1 id=&quot;background&quot;&gt;Background&lt;/h1&gt;
  9027.  
  9028. &lt;h2 id=&quot;real-time-bidding&quot;&gt;Real-time bidding&lt;/h2&gt;
  9029.  
  9030. &lt;p&gt;When a publisher (such as a website or mobile app author) wants to monetize its
  9031.  inventory, it can sell ad space to buyers.  This can be done using a variety of means;
  9032.  one of these is a programmatic auction conducted on behalf of the publisher by an
  9033.  advertising exchange (clearinghouse) with which the publisher is integrated, such as
  9034.  Google’s &lt;a href=&quot;https://www.doubleclickbygoogle.com/&quot;&gt;DoubleClick Ad Exchange&lt;/a&gt; or &lt;a href=&quot;https://www.taboola.com/&quot;&gt;Taboola&lt;/a&gt;.&lt;/p&gt;
  9035.  
  9036. &lt;p&gt;For most inventory, these programmatic auctions have multiple interested buyers.  Much
  9037.  of the time, each buyer is notified by each exchange with which it is integrated
  9038.  whenever an opportunity to buy ad space occurs.  AdRoll currently participates in well
  9039.  over one million programmatic auctions per second for a substantial part of each day.&lt;/p&gt;
  9040.  
  9041. &lt;p&gt;In addition to supporting this high request volume, buyers must also respond quickly
  9042.  (typically in less than 100ms, hence “real-time”): these auctions may be occurring while
  9043.  a resource (web page, video, etc.) is in the process of loading.  A response may include
  9044.  a bid price, which is the maximum price that buyer is willing to pay for that particular
  9045.  ad space shown to a specific user.  Barring any special circumstances, the highest bidder
  9046.  wins the auction and may display an ad in the space which was won, paying an amount
  9047.  determined by the dynamics of the auction (often &lt;a href=&quot;https://en.wikipedia.org/wiki/Vickrey_auction&quot;&gt;second-price&lt;/a&gt;).&lt;/p&gt;
  9048.  
  9049. &lt;h2 id=&quot;ad-targeting&quot;&gt;Ad targeting&lt;/h2&gt;
  9050.  
  9051. &lt;p&gt;In order to bid in an auction for ad space, a buyer needs to be able to determine how
  9052.  much it values a particular impression.  AdRoll is able to value impressions for
  9053.  specific users by combining &lt;a href=&quot;/blog/data-science/2015/12/08/data-science_event_processing.html&quot;&gt;large-scale machine
  9054.  learning&lt;/a&gt; with the keeping of profiles for each of
  9055.  the billions of users (generally: cookies) we know about.&lt;/p&gt;
  9056.  
  9057. &lt;p&gt;For web and mobile app inventory, we maintain for a period of time a set of targetable
  9058.  “segments” for each user who has visited an AdRoll customer’s site and who has not opted
  9059.  out of tracking.  A segment could be something like “has visited more than N pages of
  9060.  the site”, “has placed an item in the shopping cart”, “has looked at brown backpacks”,
  9061.  “has purchased an item”, and so on.  A user’s profile is the set of these segments along
  9062.  with timestamps and certain other information &lt;a href=&quot;/blog/ops/2013/10/02/dynamodb-replication.html&quot;&gt;stored in
  9063.  DynamoDB&lt;/a&gt;.&lt;/p&gt;
  9064.  
  9065. &lt;p&gt;The &lt;a href=&quot;https://help.adroll.com/hc/en-us/articles/212939358&quot;&gt;retargeting&lt;/a&gt; product allows an AdRoll customer to show
  9066.  ads to users who have previously visited their site, along with other restrictions.  For
  9067.  example, a customer could show a “10% off brown backpacks” promotion on Thursdays and
  9068.  Fridays between 3 and 6 PM Pacific time to users in San Francisco who previously added
  9069.  them to a shopping cart but did not purchase, plus a general “5% off” promotion to
  9070.  similar users anywhere in California but excluding San Francisco.&lt;/p&gt;
  9071.  
  9072. &lt;p&gt;AdRoll’s &lt;a href=&quot;https://help.adroll.com/hc/en-us/articles/213429627&quot;&gt;upper-funnel&lt;/a&gt; product allows customers to show ads
  9073.  to new “act-alike” users–users who are likely to act in ways similar to existing
  9074.  desirable (converting) users, but who would not otherwise be reachable via
  9075.  retargeting–similarly across the web and on social media such as Facebook.&lt;/p&gt;
  9076.  
  9077. &lt;p&gt;When an opportunity to buy ad space presents itself, associated with that opportunity is
  9078.  an identifier for the user who would see our customer’s ad if we won the auction.  Using
  9079.  that identifier, we can look up the user’s profile data and find the ads which are
  9080.  relevant, and then with advanced machine learning models price in real time a set of
  9081.  possible bids using those ads.&lt;/p&gt;
  9082.  
  9083. &lt;h2 id=&quot;profile-data-system-overview&quot;&gt;Profile data system overview&lt;/h2&gt;
  9084.  
  9085. &lt;p&gt;As described above, DynamoDB is the effective source of truth for our advertising
  9086.  profile data; this works very well.&lt;/p&gt;
  9087.  
  9088. &lt;p&gt;Unfortunately, it’s also very expensive: in order to meet our low-latency requirement,
  9089.  we need to replicate our profile data globally in each AWS region where we operate (and
  9090.  due to some design constraints we could not limit replication to a subset of profile
  9091.  data).  Under this scheme our DynamoDB costs, dominated by write capacity costs, are
  9092.  multiplied by the number of regions we use.&lt;/p&gt;
  9093.  
  9094. &lt;h3 id=&quot;original-design&quot;&gt;Original design&lt;/h3&gt;
  9095.  
  9096. &lt;p&gt;To access this profile data from our bidding system we initially implemented a
  9097.  straightforward setup like the following, replicated globally.  From our application’s
  9098.  perspective, this works like a read-through cache:&lt;/p&gt;
  9099.  
  9100. &lt;p&gt;&lt;img src=&quot;/images/post_images/quaff_elixir/before.png&quot; alt=&quot;historical system overview image; image made with draw.io&quot; /&gt;&lt;/p&gt;
  9101.  
  9102. &lt;p&gt;In this diagram, boxes on the left represent user profile updates which occur constantly
  9103.  throughout the day.  The updates are written to various DynamoDB tables in every region
  9104.  by separate data processing pipelines.  When a bidder instance seeks to obtain a user’s
  9105.  profile, it checks its local caches before reading from DynamoDB.  If recent cached data
  9106.  is found, it is used; otherwise, the instance reads from DynamoDB and then caches the
  9107.  data for next time.&lt;/p&gt;
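&lt;p&gt;The lookup path just described can be sketched as follows. This is an illustration under stated assumptions rather than the bidder’s real code: an ETS table stands in for the local cache, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Loader&lt;/code&gt; fun stands in for the DynamoDB read, and the module name, function names, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;hit&lt;/code&gt;/&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;miss&lt;/code&gt; tags are all hypothetical.&lt;/p&gt;

```erlang
-module(read_through_sketch).
-export([new/0, fetch/4]).

%% Create the cache table; entries are {Key, Value, ExpiresAtMs}.
new() ->
    ets:new(profile_cache, [set, public]).

%% Return the cached value while it is still fresh. Otherwise call
%% Loader(Key) (a stand-in for the DynamoDB read), cache the result
%% for TtlMs milliseconds, and return it.
fetch(Cache, Key, TtlMs, Loader) ->
    Now = erlang:monotonic_time(millisecond),
    case ets:lookup(Cache, Key) of
        [{Key, Value, ExpiresAt}] when ExpiresAt > Now ->
            {hit, Value};
        _ ->
            %% Missing or expired: read through to the source of truth.
            Value = Loader(Key),
            ets:insert(Cache, {Key, Value, Now + TtlMs}),
            {miss, Value}
    end.
```

&lt;p&gt;The TTL is the only freshness control in this scheme, so raising it trades DynamoDB reads for staleness.&lt;/p&gt;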
  9108.  
  9109. &lt;p&gt;As described below, we eventually created a similar system allowing the removal of most
  9110.  usage of DynamoDB from all but one region, cutting our DynamoDB costs by over
  9111.  75%.&lt;/p&gt;
  9112.  
  9113. &lt;h3 id=&quot;new-design&quot;&gt;New design&lt;/h3&gt;
  9114.  
  9115. &lt;p&gt;While DynamoDB is our bidding system’s source of truth for profile data, we found that
  9116.  we don’t actually need to read it most of the time: most data for users of interest can
  9117.  be made to reside in cache by increasing the cached item TTL.  Of course, doing that
  9118.  alone would lead to stale data and inaccurate user targeting and bid pricing.  If we
  9119.  could have long-lived cache data for most users which is also kept up to date, we
  9120.  wouldn’t need to consult DynamoDB most of the time.&lt;/p&gt;
  9121.  
  9122. &lt;p&gt;We devised a solution to support just that: a set of large(r) caches exists in each
  9123.  region, and each region constantly receives updates–up to 500,000/sec depending on
  9124.  time-of-day–made on a set of master tables in a single region.  Kinesis and DynamoDB
  9125.  streams were the natural choices for this workload:&lt;/p&gt;
  9126.  
  9127. &lt;h4 id=&quot;dynamodb-stream-aggregator&quot;&gt;DynamoDB stream aggregator&lt;/h4&gt;
  9128.  
  9129. &lt;p&gt;&lt;img src=&quot;/images/post_images/quaff_elixir/aggregator.png&quot; alt=&quot;DynamoDB stream aggregator image&quot; /&gt;&lt;/p&gt;
  9130.  
  9131. &lt;p&gt;Here, a data-agnostic aggregator component observes updates as they occur on the set of
  9132.  relevant DynamoDB streams.  It batches these updates together and replicates them to
  9133.  Kinesis streams in all target regions which formerly had the same DynamoDB tables as the
  9134.  master region.&lt;/p&gt;
  9135.  
  9136. &lt;h4 id=&quot;kinesis-stream-processor&quot;&gt;Kinesis stream processor&lt;/h4&gt;
  9137.  
  9138. &lt;p&gt;&lt;img src=&quot;/images/post_images/quaff_elixir/writer.png&quot; alt=&quot;Cache writer image&quot; /&gt;&lt;/p&gt;
  9139.  
  9140. &lt;p&gt;In each target region, a Kinesis stream reader (and cache writer) component
  9141.  updates long-lived items in a set of caches.  Both it and our real-time bidding
  9142.  system fall back to reading from tables in the master region (a relatively slow but rare
  9143.  operation) when data is missing from the cache or is incomplete, writing it back to the
  9144.  local cache using a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CAS&lt;/code&gt; operation to ensure consistency.&lt;/p&gt;
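&lt;p&gt;The cache technology and its CAS primitive are not named here, so the following is only a sketch of the compare-and-swap idea under loud assumptions: ETS stands in for the cache, each entry carries a hypothetical version counter, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ets:select_replace/2&lt;/code&gt; supplies the atomic swap. A writer that read stale data fails instead of overwriting a newer stream update.&lt;/p&gt;

```erlang
-module(cas_sketch).
-export([new/0, read/2, write_if_unchanged/3]).

%% ETS stands in for the real cache; entries are {Key, Version, Value}.
new() ->
    ets:new(cas_cache, [set, public]).

%% Return the stored object, or `missing` if there is none.
read(Cache, Key) ->
    case ets:lookup(Cache, Key) of
        [Object] -> Object;
        [] -> missing
    end.

%% Store Value only if the entry is still what the caller last saw.
%% Returns true on success and false if someone else got there first.
write_if_unchanged(Cache, missing, {Key, Value}) ->
    %% insert_new/2 succeeds only when no entry exists for Key.
    ets:insert_new(Cache, {Key, 1, Value});
write_if_unchanged(Cache, {Key, Version, _} = Seen, {Key, Value}) ->
    New = {Key, Version + 1, Value},
    %% Atomically replace the object if it still matches Seen.
    %% Caveat: Seen is used as a match pattern, so values containing
    %% atoms such as '_' or '$1' would need escaping first.
    1 =:= ets:select_replace(Cache, [{Seen, [], [{const, New}]}]).
```

&lt;p&gt;A caller that gets &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;false&lt;/code&gt; back can simply re-read the entry, since the cache now holds something at least as new as what it tried to write.&lt;/p&gt;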
  9145.  
  9146. &lt;p&gt;This allowed us to keep the same conceptual design for the multiple involved systems
  9147.  while still realizing substantial savings: we can tolerate a slower source of truth if
  9148.  most data is accurately cached and long-lived, which is now the case.&lt;/p&gt;
  9149.  
  9150. &lt;h1 id=&quot;implementation&quot;&gt;Implementation&lt;/h1&gt;
  9151.  
  9152. &lt;p&gt;Chief among our concerns in designing this system were correctness (if we have bad data,
  9153.  we can’t make good bidding decisions), robustness (if the system is down, we’ll have bad
  9154.  or missing data), and scalability (we need to handle an arbitrary volume of data).
  9155.  Secondarily, speed was also a consideration.  After a successful (but not very scalable)
  9156.  proof-of-concept which used Python, we elected to implement the complete solution using
  9157.  Erlang, which let us build on our expertise in this area and is a good fit in general
  9158.  for this type of problem.&lt;/p&gt;
  9159.  
  9160. &lt;h2 id=&quot;on-scalability&quot;&gt;On scalability&lt;/h2&gt;
  9161.  
  9162. &lt;p&gt;The essence of scalability is being able to efficiently apply a repeatable formula to an
  9163.  arbitrary load.  In the realm of software systems, Erlang/OTP makes this easy: with a
  9164.  set of robust building blocks (OTP), systems composed of shared-almost-nothing
  9165.  sequential blocking processes can easily be created–and more importantly their behavior
  9166.  can be easily understood.  Process independence enables arbitrary horizontal scaling.
  9167.  Contrast this, for example, with the deferred/promise or asynchronous-callback models more
  9168.  common in environments with more cumbersome concurrency, or with shared-state threads,
  9169.  which bring their own issues.&lt;/p&gt;
  9170.  
  9171. &lt;h2 id=&quot;on-robustness&quot;&gt;On robustness&lt;/h2&gt;
  9172.  
  9173. &lt;p&gt;Erlang/OTP promotes robust system designs.  “Let it crash” and supervision trees make
  9174.  for simple-to-understand control flow and process behavior.  A process is always part of
  9175.  a greater whole, and will “do stuff until it can’t”, then crash &amp;amp; burn knowing it’s part
  9176.  of a system which was designed from the ground up with appropriate failure modes.  If
  9177.  appropriate, such failing processes may be restarted by a supervisor somewhere in the
  9178.  system to try again.&lt;/p&gt;
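&lt;p&gt;As a small illustration of “let it crash” (a sketch, not part of the system described here), the hypothetical supervisor below owns one worker. The worker is a bare process that runs whatever fun it is sent; when that fun crashes, the process dies and the supervisor starts a replacement, up to 5 restarts in any 10-second window. The module name and the worker protocol are assumptions made for the example.&lt;/p&gt;

```erlang
-module(let_it_crash_sup).
-behaviour(supervisor).
-export([start_link/0, init/1, start_worker/0]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    %% Restart only the child that failed; give up after 5 restarts in 10 s.
    SupFlags = #{strategy => one_for_one, intensity => 5, period => 10},
    Child = #{id => worker,
              start => {?MODULE, start_worker, []},
              restart => permanent},
    {ok, {SupFlags, [Child]}}.

%% Hypothetical worker: no defensive code, it just does what it is told.
%% It is linked to the supervisor, which calls this function.
start_worker() ->
    {ok, spawn_link(fun worker_loop/0)}.

worker_loop() ->
    receive
        {work, Fun} ->
            Fun(),          % any crash here simply kills the process
            worker_loop()
    end.
```

&lt;p&gt;Note that the worker contains no error handling at all; the recovery policy lives entirely in the supervisor’s flags.&lt;/p&gt;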
  9179.  
  9180. &lt;h2 id=&quot;on-correctness&quot;&gt;On correctness&lt;/h2&gt;
  9181.  
  9182. &lt;p&gt;Erlang’s syntax and design promote the succinct and functional expression of programs
  9183.  and avoid unnecessary defensive programming.  Less code written means less code to be
  9184.  read and understood later, and fewer places for bugs to hide.&lt;/p&gt;
  9185.  
  9186. &lt;h2 id=&quot;but-enough-about-erlang-enterjava&quot;&gt;But enough about Erlang; enter…Java?&lt;/h2&gt;
  9187.  
  9188. &lt;p&gt;To ensure the correct ordered processing of data and abstract away the handling of
  9189.  details like work distribution and instance failure, we elected to use the existing
  9190.  Java-based &lt;a href=&quot;https://github.com/awslabs/amazon-kinesis-client&quot;&gt;Kinesis Client Library&lt;/a&gt; provided by Amazon.  The KCL provides a
  9191.  JSON-based &lt;a href=&quot;https://github.com/awslabs/amazon-kinesis-client/blob/73ac2c0e25a25776cbc88f2c685223fb049e6757/src/main/java/com/amazonaws/services/kinesis/multilang/package-info.java&quot;&gt;language-agnostic interface&lt;/a&gt; (MultiLangDaemon) which allows
  9192.  Kinesis processing (and DynamoDB streams processing using an adapter) to be done in any
  9193.  language and environment without writing any Java code.&lt;/p&gt;
  9194.  
  9195. &lt;p&gt;Unfortunately, the KCL/MultiLangDaemon expects to be the driver of any resulting
  9196.  processing system: it launches one processor executable per owned stream shard and
  9197.  emits events to which each such processor reacts.  While this makes
  9198.  for very simple processing code, we need to efficiently process up to thousands of
  9199.  shards across multiple streams, and launching thousands of BEAM nodes to do that would
  9200.  be wasteful (and partly defeat the purpose of using Erlang).  We also wanted to manage
  9201.  the KCL processes themselves.&lt;/p&gt;
  9202.  
  9203. &lt;h3 id=&quot;erlang-multilangdaemon-interface&quot;&gt;Erlang MultiLangDaemon interface&lt;/h3&gt;
  9204.  
  9205. &lt;p&gt;Erlang systems are pretty good at being network servers, so we turned this problem
  9206.  around by creating an adapter which launches a MultiLangDaemon whose worker subprocesses
  9207.  become lightweight network clients of our Erlang node:&lt;/p&gt;
  9208.  
  9209. &lt;p&gt;&lt;img src=&quot;/images/post_images/quaff_elixir/beam-mld.png&quot; alt=&quot;BEAM-MLD concept image&quot; /&gt;&lt;/p&gt;
  9210.  
  9211. &lt;p&gt;For each stream being processed, our Erlang node launches a MultiLangDaemon subprocess
  9212.  as a port program.  Each such subprocess is configured to launch a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;netcat&lt;/code&gt;-like program
  9213.  (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;socat&lt;/code&gt;) as a shard processing program, which simply maps &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;stdin&lt;/code&gt;/&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;stdout&lt;/code&gt; back to the
  9214.  port on which the node is listening for that MLD’s stream.&lt;/p&gt;
  9215.  
  9216. &lt;p&gt;Thus, each owned shard is mapped to an independent Erlang process (a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gen_statem&lt;/code&gt; state
  9217.  machine) in a single node, and each such process is only concerned with handling a
  9218.  single stream record at a time (and deciding when to periodically checkpoint).  The
  9219.  result looks like this:&lt;/p&gt;
  9220.  
  9221. &lt;p&gt;&lt;img src=&quot;/images/post_images/quaff_elixir/erlang-mld-workers.png&quot; alt=&quot;erlang-based processing diagram&quot; /&gt;&lt;/p&gt;
  9222.  
  9223. &lt;p&gt;This gives most of the benefits of using the KCL and results in a processing application
  9224.  which is similar to how Java-based KCL processing applications work (multiple threads of
  9225.  execution within a single VM), but without requiring the user to write any Java code.
  9226.  It allows for very simple shard processors like this minimal example:&lt;/p&gt;
  9227.  
  9228. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;noisy_worker&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  9229.  
  9230. &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;behavior&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;erlmld_worker&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  9231.  
  9232. &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;export&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;initialize&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9233.         &lt;span class=&quot;n&quot;&gt;ready&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9234.         &lt;span class=&quot;n&quot;&gt;process_record&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9235.         &lt;span class=&quot;n&quot;&gt;shutdown&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]).&lt;/span&gt;
  9236.  
  9237. &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;include_lib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;erlmld/include/erlmld.hrl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  9238.  
  9239. &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shard_id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;count&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}).&lt;/span&gt;
  9240.  
  9241. &lt;span class=&quot;nf&quot;&gt;initialize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(_&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Opaque&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ShardId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ISN&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  9242.    &lt;span class=&quot;nv&quot;&gt;State&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nl&quot;&gt;#state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shard_id&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ShardId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  9243.    &lt;span class=&quot;nn&quot;&gt;io&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~p&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; initialized for shard &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~p&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; at &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~p~n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9244.              &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ShardId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ISN&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]),&lt;/span&gt;
  9245.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;State&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}.&lt;/span&gt;
  9246.  
  9247. &lt;span class=&quot;nf&quot;&gt;ready&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;State&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  9248.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;State&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}.&lt;/span&gt;
  9249.  
  9250. &lt;span class=&quot;nf&quot;&gt;process_record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;#state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shard_id&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ShardId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;count&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Count&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;State&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9251.               &lt;span class=&quot;nl&quot;&gt;#stream_record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sequence_number&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;SN&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  9252.    &lt;span class=&quot;nn&quot;&gt;io&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~p&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; (&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~p&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;) got record &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~p~n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;ShardId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Count&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]),&lt;/span&gt;
  9253.    &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Count&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;of&lt;/span&gt;
  9254.        &lt;span class=&quot;n&quot;&gt;true&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  9255.            &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;State&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;#state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;count&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  9256.                 &lt;span class=&quot;nl&quot;&gt;#checkpoint&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sequence_number&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;SN&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}};&lt;/span&gt;
  9257.        &lt;span class=&quot;n&quot;&gt;false&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  9258.            &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;State&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;#state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;count&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Count&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}}&lt;/span&gt;
  9259.    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;
  9260.  
  9261. &lt;span class=&quot;nf&quot;&gt;shutdown&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;#state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shard_id&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ShardId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;count&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Count&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Reason&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  9262.    &lt;span class=&quot;nn&quot;&gt;io&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~p&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; (&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~p&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;) shutting down, reason: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~p~n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9263.              &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;ShardId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Count&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Reason&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]),&lt;/span&gt;
  9264.    &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Reason&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;of&lt;/span&gt;
  9265.        &lt;span class=&quot;n&quot;&gt;terminate&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  9266.            &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nl&quot;&gt;#checkpoint&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{}};&lt;/span&gt;
  9267.        &lt;span class=&quot;p&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  9268.            &lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt;
  9269.    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  9270.  
  9271. &lt;h3 id=&quot;new-open-source-library-erlmld&quot;&gt;New open-source library: erlmld&lt;/h3&gt;
  9272.  
  9273. &lt;p&gt;We created an &lt;a href=&quot;https://github.com/AdRoll/erlmld&quot;&gt;open-source Erlang library for Kinesis and DynamoDB streams processing
  9274.  (erlmld)&lt;/a&gt; using MultiLangDaemon.&lt;/p&gt;
  9275.  
  9276. &lt;p&gt;It also includes support for compressed KPL-style record aggregation and an additional
  9277.  behavior which turned out to be a common pattern in our own usage of the library:
  9278.  accumulating batches of records and flushing them periodically or when “full”.  This
  9279.  allows even simpler shard processor definitions:&lt;/p&gt;
  9280.  
  9281. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;noisy_flusher&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  9282.  
  9283. &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;behavior&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;erlmld_flusher&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  9284.  
  9285. &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;export&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;init&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9286.         &lt;span class=&quot;n&quot;&gt;add_record&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9287.         &lt;span class=&quot;n&quot;&gt;flush&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]).&lt;/span&gt;
  9288.  
  9289. &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;include_lib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;erlmld/include/erlmld.hrl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  9290.  
  9291. &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shard_id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;buf&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]}).&lt;/span&gt;
  9292.  
  9293. &lt;span class=&quot;nf&quot;&gt;init&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;ShardId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Opaque&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  9294.  &lt;span class=&quot;nl&quot;&gt;#state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shard_id&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ShardId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}.&lt;/span&gt;
  9295.  
  9296. &lt;span class=&quot;nf&quot;&gt;add_record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;#state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buf&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Buf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;_,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;_)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;when&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Buf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  9297.  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;error&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;full&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  9298. &lt;span class=&quot;nf&quot;&gt;add_record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;#state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buf&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Buf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;State&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Token&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  9299.  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;State&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;#state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buf&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Token&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Buf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]}}.&lt;/span&gt;
  9300.  
  9301. &lt;span class=&quot;nf&quot;&gt;flush&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;#state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shard_id&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ShardId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;buf&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Buf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;State&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Kind&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  9302.  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;ProcessedTokens&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Records&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;lists&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;unzip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Buf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  9303.  &lt;span class=&quot;nn&quot;&gt;io&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~p&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; processing batch: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~p~n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;ShardId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Records&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]),&lt;/span&gt;
  9304.  &lt;span class=&quot;nn&quot;&gt;timer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sleep&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  9305.  &lt;span class=&quot;nv&quot;&gt;NState&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;State&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;#state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buf&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]},&lt;/span&gt;
  9306.  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;NState&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ProcessedTokens&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  9307.  
  9308. &lt;p&gt;Here, a processor builds a batch of records to be processed.  A backpressure mechanism
  9309.  exists to ensure records are supplied only as fast as the processor can handle: if a
  9310.  batch is full, the processor is instructed to flush (i.e., perform a batch of work),
  9311.  which can take an arbitrary amount of time.&lt;/p&gt;
  9312.  
  9313. &lt;p&gt;After flushing, the processor returns the opaque tokens corresponding to each
  9314.  successfully-processed record, which allows the system to periodically checkpoint only
  9315.  up to the highest contiguous completed record on the stream.  If a worker dies or a new
  9316.  one comes online and steals a lease, some work might be repeated but none will be lost.&lt;/p&gt;
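
&lt;p&gt;The "highest contiguous completed record" rule can be sketched as follows.  This is a minimal illustration, not erlmld code: real tokens are opaque, so the sketch assumes they are consecutive integers handed out in stream order, and the module and function names are hypothetical.&lt;/p&gt;

```erlang
-module(checkpoint_sketch).

-export([highest_contiguous/2]).

%% Hypothetical helper, not part of erlmld.  Assumption: tokens are
%% consecutive integers assigned in stream order.
%%
%% LastCheckpoint is the last token already checkpointed.  Completed is
%% an unordered list of tokens whose records finished processing.
%% Returns the highest token T such that every token from
%% LastCheckpoint + 1 through T has completed.
highest_contiguous(LastCheckpoint, Completed) ->
    advance(LastCheckpoint, lists:usort(Completed)).

%% The next completed token directly follows the checkpoint: move forward.
advance(Last, [Next | Rest]) when Next =:= Last + 1 ->
    advance(Next, Rest);
%% Token already covered by an earlier checkpoint: skip it.
advance(Last, [Next | Rest]) when Last >= Next ->
    advance(Last, Rest);
%% Either a gap or no more tokens: stop here.
advance(Last, _GapOrEmpty) ->
    Last.
```

&lt;p&gt;With tokens 1, 2, 3 and 5 completed and nothing checkpointed yet, the sketch stops at 3: token 4 is still outstanding, so checkpointing past it could lose that record if the worker died.&lt;/p&gt;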
  9317.  
  9318. &lt;h2 id=&quot;beyond-good-and-erlang&quot;&gt;Beyond Good and Erlang&lt;/h2&gt;
  9319.  
  9320. &lt;p&gt;The arrangement described above is scalable and works well, but the
  9321.  process-per-owned-shard model means our concurrency is limited by the number of stream
  9322.  shards.  If we have a processing system with 16 cores and we’re processing a stream with
  9323.  8 shards, at least half of our computing resources are wasted unless we take special
  9324.  measures that complicate our otherwise simple processing code.&lt;/p&gt;
  9325.  
  9326. &lt;p&gt;For our use case, it turns out we can scale our processing independently of stream
  9327.  shards while preserving approximately in-order processing (ordered per cookie) and simple
  9328.  processing code, which is especially helpful for I/O-bound tasks like ours.&lt;/p&gt;
  9329.  
  9330. &lt;h3 id=&quot;new-open-source-library-exmld&quot;&gt;New open-source library: exmld&lt;/h3&gt;
  9331.  
  9332. &lt;p&gt;We created an &lt;a href=&quot;https://github.com/AdRoll/exmld&quot;&gt;open-source Elixir library for Kinesis and DynamoDB streams processing
  9333.  (exmld)&lt;/a&gt; which builds on our Erlang library &lt;a href=&quot;https://github.com/AdRoll/erlmld&quot;&gt;erlmld&lt;/a&gt;.  It
  9334.  can be used in both Elixir and Erlang-based processing systems.&lt;/p&gt;
  9335.  
  9336. &lt;p&gt;It makes use of the &lt;a href=&quot;https://hexdocs.pm/flow/Flow.html&quot;&gt;Flow&lt;/a&gt; framework to set up a MapReduce-style processing
  9337.  pipeline inside BEAM: instead of having a set of worker processes which each handle a
  9338.  single shard, data from all shards owned by the current node feeds into a pipeline which
  9339.  sends data downstream to an arbitrary number of reducers.  Each reducer
  9340.  consistently handles a partition of the key space (i.e., record keys are hashed and
  9341.  distributed among reducers, and the same reducer will always receive records having the
  9342.  same key in approximate order).&lt;/p&gt;
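
&lt;p&gt;The key-space partitioning described above can be sketched in a few lines of Erlang.  This is an illustration only and does not reproduce how Flow or exmld route items internally; the module and function names are hypothetical, and erlang:phash2/2 is used simply because it is a deterministic hash available in the standard runtime.&lt;/p&gt;

```erlang
-module(partition_sketch).

-export([reducer_for/2, group_by_reducer/2]).

%% Hypothetical helper: map a key to a reducer index in the range
%% 0..NumReducers-1.  erlang:phash2/2 is deterministic, so equal keys
%% always map to the same index.
reducer_for(Key, NumReducers) when is_integer(NumReducers), NumReducers > 0 ->
    erlang:phash2(Key, NumReducers).

%% Group {Key, Value} items by reducer index, keeping arrival order
%% within each group.  Returns a map of Index => [Item].
group_by_reducer(Items, NumReducers) ->
    Grouped = lists:foldl(
                fun({Key, _Value} = Item, Acc) ->
                        Index = reducer_for(Key, NumReducers),
                        maps:update_with(Index,
                                         fun(L) -> [Item | L] end,
                                         [Item],
                                         Acc)
                end,
                #{},
                Items),
    %% Items were prepended while folding, so restore arrival order.
    maps:map(fun(_Index, L) -> lists:reverse(L) end, Grouped).
```

&lt;p&gt;Because the index depends only on the key, all items for one key land in the same group in the order they arrived, while items for different keys may be handled concurrently by different reducers.&lt;/p&gt;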
  9343.  
  9344. &lt;p&gt;Here’s what we had before with the Erlang-based system:&lt;/p&gt;
  9345.  
  9346. &lt;p&gt;&lt;img src=&quot;/images/post_images/quaff_elixir/erlang-mld-workers.png&quot; alt=&quot;erlang-based processing diagram&quot; /&gt;&lt;/p&gt;
  9347.  
  9348. &lt;p&gt;And here’s what we now have with the Elixir Flow-based processing pipeline:&lt;/p&gt;
  9349.  
  9350. &lt;p&gt;&lt;img src=&quot;/images/post_images/quaff_elixir/elixir-mld-pipeline.png&quot; alt=&quot;elixir flow-based pipeline diagram&quot; /&gt;&lt;/p&gt;
  9351.  
  9352. &lt;p&gt;Here, using the Flow framework, our shard workers no longer directly process records.
  9353.  Instead, they feed records into a set of mappers which extract zero or more items to be
  9354.  processed from each.  Each of these items has an associated key which is used to
  9355.  distribute them among a set of reducers, which do the actual processing work.  As work
  9356.  is completed, the workers which originated each record are notified so they can
  9357.  checkpoint and make progress on the shard they own.&lt;/p&gt;
  9358.  
  9359. &lt;p&gt;This configuration allows re-use of any existing processing code with minor
  9360.  modifications, but with greatly increased concurrency which is unrelated to the number
  9361.  of stream shards owned by the current node (we can use an arbitrary number of mappers
  9362.  and reducers).  It helps us because we have an I/O-bound task where processing can occur
  9363.  independently for each cookie that is seen; ordering of events is generally preserved
  9364.  for single cookies within a single worker (but not between different cookies).&lt;/p&gt;
  9365.  
  9366. &lt;h4 id=&quot;elixirerlang-sample-processing-code&quot;&gt;Elixir+Erlang sample processing code&lt;/h4&gt;
  9367.  
  9368. &lt;p&gt;Here’s a complete example of using Erlang processing code with this new Elixir library.
  9369.  The “disposition” concept is used to keep track of the status of event processing to
  9370.  support stream checkpointing and provide backpressure; see &lt;a href=&quot;https://github.com/AdRoll/exmld&quot;&gt;exmld&lt;/a&gt; docs
  9371.  for details.  In a real application, an actual supervision tree would be used.  The same
  9372.  organization of code would also apply to an Elixir processing application using this
  9373.  library.&lt;/p&gt;
  9374.  
  9375. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;exmld_example&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  9376.  
  9377. &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;export&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]).&lt;/span&gt;
  9378.  
  9379. &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;count&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}).&lt;/span&gt;
  9380.  
  9381. &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;ni&quot;&gt;include_lib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;erlmld/include/erlmld.hrl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
  9382.  
  9383. &lt;span class=&quot;c&quot;&gt;%% this will result in calls to process_event/2 for items extracted from
  9384. %% records seen on `StreamName` in `StreamRegion`.  run
  9385. %% erlmld/priv/download.sh before running this to download needed JARs.
  9386. &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  9387.  &lt;span class=&quot;nv&quot;&gt;StreamRegion&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;us-east-1&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9388.  &lt;span class=&quot;c&quot;&gt;%% this affects kcl state table name:
  9389. &lt;/span&gt;  &lt;span class=&quot;nv&quot;&gt;AppName&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;your-erlang-kcl-application&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9390.  &lt;span class=&quot;nv&quot;&gt;StreamName&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;your-stream-name&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9391.  
  9392.  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Stage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;&apos;Elixir.Exmld.KinesisStage&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;start_link&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([]),&lt;/span&gt;
  9393.  
  9394.  &lt;span class=&quot;nv&quot;&gt;FlowSpec&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stages&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Stage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;
  9395.               &lt;span class=&quot;n&quot;&gt;extract_items_fn&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;extract_items&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9396.               &lt;span class=&quot;c&quot;&gt;%% use first extracted item element as a reducer
  9397. &lt;/span&gt;               &lt;span class=&quot;c&quot;&gt;%% partition key:
  9398. &lt;/span&gt;               &lt;span class=&quot;n&quot;&gt;partition_key&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;elem&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  9399.               &lt;span class=&quot;n&quot;&gt;state0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nl&quot;&gt;#state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9400.               &lt;span class=&quot;n&quot;&gt;process_fn&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;process_event&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9401.               &lt;span class=&quot;n&quot;&gt;flow_opts&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]},&lt;/span&gt;
  9402.  
  9403.  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;FlowWorker&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;&apos;Elixir.Exmld&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;start_link&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;FlowSpec&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  9404.  
  9405.  &lt;span class=&quot;nv&quot;&gt;ProducerConfig&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;record_processor&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;erlmld_batch_processor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9406.                     &lt;span class=&quot;n&quot;&gt;record_processor_data&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt;
  9407.                         &lt;span class=&quot;err&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;flusher_mod&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;&apos;Elixir.Exmld.KinesisWorker&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9408.                           &lt;span class=&quot;n&quot;&gt;flusher_mod_data&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stages&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Stage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]}],&lt;/span&gt;
  9409.                           &lt;span class=&quot;n&quot;&gt;flush_interval_ms&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9410.                           &lt;span class=&quot;n&quot;&gt;checkpoint_interval_ms&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;60000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9411.                           &lt;span class=&quot;n&quot;&gt;watchdog_timeout_ms&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;600000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9412.                           &lt;span class=&quot;n&quot;&gt;description&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;StreamName&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;StreamRegion&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  9413.                           &lt;span class=&quot;n&quot;&gt;on_checkpoint&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;on_checkpoint&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  9414.                     &lt;span class=&quot;n&quot;&gt;kcl_appname&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;AppName&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9415.                     &lt;span class=&quot;n&quot;&gt;stream_name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;StreamName&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9416.                     &lt;span class=&quot;n&quot;&gt;stream_region&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;StreamRegion&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9417.                     &lt;span class=&quot;n&quot;&gt;stream_type&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kinesis&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9418.                     &lt;span class=&quot;c&quot;&gt;%% these would normally be set via app env:
  9419. &lt;/span&gt;                     &lt;span class=&quot;n&quot;&gt;listen_ip&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;loopback&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9420.                     &lt;span class=&quot;n&quot;&gt;listen_port&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9421.                     &lt;span class=&quot;n&quot;&gt;app_suffix&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;undefined&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9422.                     &lt;span class=&quot;n&quot;&gt;initial_position&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;LATEST&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9423.                     &lt;span class=&quot;n&quot;&gt;idle_time&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9424.                     &lt;span class=&quot;n&quot;&gt;metrics_level&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;SUMMARY&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9425.                     &lt;span class=&quot;n&quot;&gt;metrics_dimensions&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Operation&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9426.                     &lt;span class=&quot;n&quot;&gt;failover_time&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9427.                     &lt;span class=&quot;n&quot;&gt;ignore_unexpected_child_shards&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9428.                     &lt;span class=&quot;n&quot;&gt;worker_id&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;undefined&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9429.                     &lt;span class=&quot;n&quot;&gt;max_records&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9430.                     &lt;span class=&quot;n&quot;&gt;max_lease_theft&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9431.                     &lt;span class=&quot;n&quot;&gt;shard_sync_time&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;60000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  9432.  
  9433.  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ProducerSup&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;erlmld_sup&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;start_link&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;ProducerConfig&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  9434.  
  9435.  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Stage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;FlowWorker&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ProducerSup&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]}.&lt;/span&gt;
  9436.  
  9437. &lt;span class=&quot;nf&quot;&gt;on_checkpoint&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Description&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ShardId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  9438.  &lt;span class=&quot;nn&quot;&gt;io&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~p&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; checkpointed on &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~p~n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Description&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ShardId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]).&lt;/span&gt;
  9439.  
  9440. &lt;span class=&quot;c&quot;&gt;%% this function should normally return a list of items to be processed
  9441. %% by process_event/2.  if records are using the KPL-style aggregation
  9442. %% supported by erlmld, this can just return single-element lists.  due
  9443. %% to the definition of &apos;partition_key&apos; above, this should return
  9444. %% 2-tuples where the first element is a partition key to use in
  9445. %% distributing work among reducers.  this example just uses the
  9446. %% record&apos;s partition key.
  9447. &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;extract_items&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;&apos;__struct__&apos;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;&apos;Elixir.Exmld.KinesisStage.Event&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9448.                &lt;span class=&quot;n&quot;&gt;stage&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Stage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9449.                &lt;span class=&quot;n&quot;&gt;worker&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Worker&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9450.                &lt;span class=&quot;n&quot;&gt;event&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
  9451.                  &lt;span class=&quot;err&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;&apos;__struct__&apos;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;&apos;Elixir.Exmld.KinesisWorker.Datum&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9452.                    &lt;span class=&quot;n&quot;&gt;stream_record&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;R&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}})&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  9453.  &lt;span class=&quot;nv&quot;&gt;Value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;R&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;of&lt;/span&gt;
  9454.            &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;heartbeat&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  9455.              &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;heartbeat&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  9456.            &lt;span class=&quot;nl&quot;&gt;#stream_record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;partition_key&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;K&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  9457.              &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;K&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Stage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Worker&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;R&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}}&lt;/span&gt;
  9458.          &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9459.  &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;
  9460.  
  9461. &lt;span class=&quot;c&quot;&gt;%% this function is used as a Flow reducer.  it accepts an extracted
  9462. %% item and a reducer state, and returns an updated state after doing
  9463. %% any needed processing.
  9464. &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;process_event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;nl&quot;&gt;#state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;count&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;State&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  9465.  &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Value&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;of&lt;/span&gt;
  9466.      &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Stage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Worker&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nl&quot;&gt;#stream_record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sequence_number&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;SN&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  9467.        &lt;span class=&quot;c&quot;&gt;%% notify upstream stage of disposition of processing so the
  9468. &lt;/span&gt;        &lt;span class=&quot;c&quot;&gt;%% originating worker can make progress and checkpoint.  in a
  9469. &lt;/span&gt;        &lt;span class=&quot;c&quot;&gt;%% real application this would be done in batches after work is
  9470. &lt;/span&gt;        &lt;span class=&quot;c&quot;&gt;%% completed, and the status would vary based on processing
  9471. &lt;/span&gt;        &lt;span class=&quot;c&quot;&gt;%% outcome.
  9472. &lt;/span&gt;        &lt;span class=&quot;nv&quot;&gt;WorkerMap&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
  9473.          &lt;span class=&quot;err&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Worker&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt;
  9474.            &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;&apos;__struct__&apos;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;&apos;Elixir.Exmld.KinesisWorker.Disposition&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9475.               &lt;span class=&quot;n&quot;&gt;sequence_number&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;SN&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  9476.               &lt;span class=&quot;n&quot;&gt;status&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]},&lt;/span&gt;
  9477.        &lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;&apos;Elixir.Exmld.KinesisStage&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;disposition&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Stage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;WorkerMap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  9478.        &lt;span class=&quot;c&quot;&gt;%% do some work and return updated state:
  9479. &lt;/span&gt;        &lt;span class=&quot;nn&quot;&gt;io&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~p&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; processing &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~p&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; (seen: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~p&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]),&lt;/span&gt;
  9480.        &lt;span class=&quot;nn&quot;&gt;timer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sleep&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  9481.        &lt;span class=&quot;nv&quot;&gt;State&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;#state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;count&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;C&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  9482.  
  9483.      &lt;span class=&quot;n&quot;&gt;heartbeat&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  9484.        &lt;span class=&quot;c&quot;&gt;%% heartbeats are used to prevent stalls when a worker is
  9485. &lt;/span&gt;        &lt;span class=&quot;c&quot;&gt;%% waiting for processing to happen and dispositions to be
  9486. &lt;/span&gt;        &lt;span class=&quot;c&quot;&gt;%% returned.
  9487. &lt;/span&gt;        &lt;span class=&quot;nn&quot;&gt;io&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~p&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; ignoring heartbeat event&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;~n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()]),&lt;/span&gt;
  9488.        &lt;span class=&quot;nv&quot;&gt;State&lt;/span&gt;
  9489.    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  9490.  
  9491. &lt;h3 id=&quot;impact&quot;&gt;Impact&lt;/h3&gt;
  9492.  
  9493. &lt;p&gt;Using this new arrangement with our existing Erlang code, we doubled the performance of
  9494.  our system.  Here’s a snapshot of what happened to our stream delay metric after we
  9495.  deployed a version of our application that uses Elixir’s Flow, following a service
  9496.  degradation:&lt;/p&gt;
  9497.  
  9498. &lt;p&gt;&lt;img src=&quot;/images/post_images/quaff_elixir/elixir-improvement.png&quot; alt=&quot;Erlang-&amp;gt;Elixir improvement image&quot; /&gt;&lt;/p&gt;
  9499.  
  9500. &lt;p&gt;Here, the Erlang-only system had fallen behind and entered an unhealthy “far behind tip
  9501.  of stream” state.  It had scaled up to the maximum size and was recovering, but not very
  9502.  quickly.  We deployed the Elixir-based version to gauge performance during recovery;
  9503.  stream processing delay fell off a cliff, and the Elixir-based version has been
  9504.  running ever since.&lt;/p&gt;
  9505.  
  9506. &lt;h1 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h1&gt;
  9507.  
  9508. &lt;p&gt;Using Erlang and now Elixir, we were able to implement a scalable system which cut our
  9509.  DynamoDB costs by over 75%.  Using some Elixir glue code[2] enabling use of the
  9510.  &lt;a href=&quot;https://hexdocs.pm/flow/Flow.html&quot;&gt;Flow&lt;/a&gt; framework, we doubled the performance of our previously Erlang-only
  9511.  system with only minor modifications.  We also open-sourced[1][2] the code we’re using
  9512.  to process high-volume Kinesis and DynamoDB streams data in Erlang and Elixir.&lt;/p&gt;
  9513.  
  9514. &lt;p&gt;I enjoyed working on this system, and Elixir is now firmly in our toolbox for use in the
  9515.  future!&lt;/p&gt;
  9516.  
  9517. &lt;p&gt;[1]: &lt;a href=&quot;https://github.com/AdRoll/erlmld&quot;&gt;https://github.com/AdRoll/erlmld&lt;/a&gt; &lt;br /&gt;
  9518. [2]: &lt;a href=&quot;https://github.com/AdRoll/exmld&quot;&gt;https://github.com/AdRoll/exmld&lt;/a&gt;&lt;/p&gt;
  9519.  
  9520. &lt;p&gt;&lt;strong&gt;Do you enjoy working with large-scale systems? &lt;a href=&quot;https://www.adroll.com/about/careers/&quot;&gt;Roll with Us&lt;/a&gt;!&lt;/strong&gt;&lt;/p&gt;
  9521.  
  9522. </description>
  9523.    </item>
  9524.    
  9525.    
  9526.    
  9527.    <item>
  9528.      <title>How to Run a Front-End Infrastructure Team</title>
  9529.      <link>https://tech.nextroll.com/blog/frontend/2017/08/29/how-to-run-a-front-end-infrastructure-team.html</link>
  9530.      <pubDate>Tue, 29 Aug 2017 00:00:00 -0700</pubDate>
  9531.      <author></author>
  9532.      <guid isPermaLink="false">https://tech.nextroll.com/blog/frontend/2017/08/29/how-to-run-a-front-end-infrastructure-team</guid>
  9533.      <description>&lt;p&gt;Over the past few years, AdRoll has grown from a humble startup built around a
  9534. single feature to a global marketing platform with a &lt;a href=&quot;https://www.adroll.com/product&quot;&gt;diverse suite of
  9535. products&lt;/a&gt;. Along with the growth of the company,
  9536. we have put a lot of work into building a solid infrastructure for user
  9537. interface development. In this post, we talk about the human aspects of
  9538. front-end projects that are shared between multiple engineering teams.&lt;/p&gt;
  9539.  
  9540. &lt;figure class=&quot;full&quot;&gt;
  9541.  &lt;img src=&quot;/images/post_images/frontend-infra-layers.jpg&quot; alt=&quot;A screenshot of an AdRoll dashboard with analytics, i18n, CSS and UI components visualized as layers&quot; /&gt;
  9542.  &lt;figcaption&gt;Every web application at AdRoll relies on multiple layers of front-end infrastructure&lt;/figcaption&gt;
  9543. &lt;/figure&gt;
  9544.  
  9545. &lt;p&gt;Our current front-end infrastructure consists of &lt;a href=&quot;/blog/frontend/2015/11/05/rollup-shared-ui-components.html&quot;&gt;UI
  9546. components&lt;/a&gt;, a &lt;a href=&quot;https://medium.com/adroll-design/the-journey-to-design-consistency-ee26bef2fd88&quot;&gt;UX
  9547. pattern
  9548. library&lt;/a&gt;
  9549. and various JavaScript packages for internationalization (i18n), analytics and
  9550. A/B testing. We consider all these projects “internally open sourced”. This
  9551. means we encourage our engineers to use contributions to these projects as a way
  9552. of sharing their knowledge across team boundaries. Ultimately, we believe
  9553. &lt;a href=&quot;https://en.wikipedia.org/wiki/The_Cathedral_and_the_Bazaar&quot;&gt;leveraging a wider range of ideas makes software more
  9554. robust&lt;/a&gt;.&lt;/p&gt;
  9555.  
  9556. &lt;p&gt;If one of our vertical product teams wants to propose a change to the
  9557. infrastructure, they are expected to gather feedback from other teams that might
  9558. be affected by it. In the simplest case, this could mean making a style in a UI
  9559. component customizable through a property. In more advanced cases, this could
  9560. mean building an entirely new shared library (such as an API client) to abstract
  9561. out complexity from other engineering teams.&lt;/p&gt;
  9562.  
  9563. &lt;p&gt;While this model allows each team to build the exact solution they need, it’s
  9564. also important to have people whose sole purpose is to think about how
  9565. everything fits together. Without proper ownership and oversight, shared
  9566. infrastructure tends to become a complex mess of patches that feels more like a
  9567. bottleneck than a helpful tool.&lt;/p&gt;
  9568.  
  9569. &lt;h2 id=&quot;the-frontend-core-team&quot;&gt;The Frontend Core Team&lt;/h2&gt;
  9570.  
  9571. &lt;p&gt;Since 2015, each new product and internal tool we have built has been implemented
  9572. as a &lt;a href=&quot;https://en.wikipedia.org/wiki/Microservices&quot;&gt;microservice&lt;/a&gt;. This loosely
  9573. coupled architecture has let us iterate on new functionality faster, but it has
  9574. also fragmented our approach to UI development. Initially, a group of senior
  9575. engineers was able to build a foundation for our current front-end
  9576. infrastructure, but it was still unclear who should act as the first point of
  9577. contact for prioritizing tasks and triaging issues.&lt;/p&gt;
  9578.  
  9579. &lt;p&gt;In 2016 we formalized our approach to maintaining front-end infrastructure by
  9580. founding a team we call Frontend Core. Today, this team works closely with our
  9581. front-end engineers, UI designers and product leaders to make sure our web
  9582. applications are built consistently across all teams. Most importantly, the
  9583. Frontend Core team oversees and maintains all of our collaborative front-end
  9584. projects.&lt;/p&gt;
  9585.  
  9586. &lt;p&gt;The rest of this post covers the UI development guidelines set forth by our
  9587. Frontend Core team in five sections. Each section explains our general approach
  9588. to the topic and provides a list of practical tips for adopting similar
  9589. practices in other organizations:&lt;/p&gt;
  9590.  
  9591. &lt;ul&gt;
  9592.  &lt;li&gt;&lt;a href=&quot;#remove-bottlenecks-for-iteration&quot;&gt;Remove Bottlenecks for Iteration&lt;/a&gt;&lt;/li&gt;
  9593.  &lt;li&gt;&lt;a href=&quot;#communicate-across-team-boundaries&quot;&gt;Communicate Across Team Boundaries&lt;/a&gt;&lt;/li&gt;
  9594.  &lt;li&gt;&lt;a href=&quot;#look-for-common-problems-and-solutions&quot;&gt;Look for Common Problems and Solutions&lt;/a&gt;&lt;/li&gt;
  9595.  &lt;li&gt;&lt;a href=&quot;#lower-the-barrier-to-success&quot;&gt;Lower the Barrier to Success&lt;/a&gt;&lt;/li&gt;
  9596.  &lt;li&gt;&lt;a href=&quot;#keep-on-experimenting&quot;&gt;Keep on Experimenting&lt;/a&gt;&lt;/li&gt;
  9597. &lt;/ul&gt;
  9598.  
  9599. &lt;h2 id=&quot;remove-bottlenecks-for-iteration&quot;&gt;Remove Bottlenecks for Iteration&lt;/h2&gt;
  9600.  
  9601. &lt;p&gt;As infrastructure projects become more complex and more teams depend on them, it
  9602. becomes harder to make changes without breaking existing functionality. With
  9603. this in mind, we are constantly streamlining the development workflow around our
  9604. front-end infrastructure. Our ultimate goal is to make our shared projects a joy
  9605. to work with so that more people will adopt them and keep making improvements.&lt;/p&gt;
  9606.  
  9607. &lt;p&gt;Since the beginning of 2016, we have had 47 internal contributors make changes
  9608. to our front-end infrastructure, resulting in over 500 pull requests. Each
  9609. change has gone through code review and a suite of automated tests to prevent
  9610. unintended side-effects. Our Frontend Core team keeps a close eye on open pull
  9611. requests and makes sure they get reviewed as quickly as possible.&lt;/p&gt;
  9612.  
  9613. &lt;p&gt;To measure improvements in the process, we also track how pull
  9614. request lifetime changes over time. The chart below shows a typical bimodal distribution
  9615. for code reviews: most changes take about a week to review, but there’s a subset
  9616. of quick changes (such as production patches) that are deployed within an hour
  9617. of opening the pull request.&lt;/p&gt;
  9618.  
  9619. &lt;figure&gt;
  9620.  &lt;img src=&quot;/images/post_images/frontend-infra-pr-lifetime.png&quot; alt=&quot;A bar chart of pull request lifetime&quot; /&gt;
  9621.  &lt;figcaption&gt;To get charts like this for your GitHub project, we recommend checking out &lt;a href=&quot;https://github.com/pgjones/push-pull&quot; class=&quot;no-wrap&quot;&gt;this open source tool&lt;/a&gt;&lt;/figcaption&gt;
  9622. &lt;/figure&gt;
  9623.  
  9624. &lt;p&gt;Here’s how you can remove common bottlenecks in your UI development process:&lt;/p&gt;
  9625.  
  9626. &lt;ul&gt;
  9627.  &lt;li&gt;&lt;strong&gt;Make simple changes fast and complex ones possible.&lt;/strong&gt; Production issues can
  9628. often be mitigated with UI changes, so make sure patches can be reviewed and
  9629. released as quickly as possible. Encourage talking through large changes
  9630. before the actual programming work.&lt;/li&gt;
  9631.  &lt;li&gt;&lt;strong&gt;Consider bundling similar JavaScript packages in a monorepo.&lt;/strong&gt; A single code
  9632. repository allows packages to share build tools and test scripts with less
  9633. overhead compared to completely separate projects. See
  9634. &lt;a href=&quot;https://github.com/babel/babel/blob/19c4dd2d8c99b909b02ccca27de7b0b32eb416ea/doc/design/monorepo.md&quot;&gt;Babel&lt;/a&gt;
  9635. for a popular example and &lt;a href=&quot;https://github.com/lerna/lerna&quot;&gt;Lerna&lt;/a&gt; for a
  9636. pre-existing toolkit.&lt;/li&gt;
  9637.  &lt;li&gt;&lt;strong&gt;Use &lt;a href=&quot;https://help.github.com/articles/creating-a-pull-request-template-for-your-repository/&quot;&gt;pull request
  9638. templates&lt;/a&gt;
  9639. and &lt;a href=&quot;https://help.github.com/articles/about-codeowners/&quot;&gt;CODEOWNERS files&lt;/a&gt; for
  9640. GitHub projects.&lt;/strong&gt; PR templates should include a checklist for commonly
  9641. overlooked aspects of UI development such as reusability, translatability and
  9642. style consistency. Use code reviews as a way of transferring knowledge between
  9643. UI teams.&lt;/li&gt;
  9644.  &lt;li&gt;&lt;strong&gt;Set up staging deployments for feature branches.&lt;/strong&gt; Having a live demo means
  9645. the reviewer doesn’t always have to run the code locally.
  9646. &lt;a href=&quot;https://storybook.js.org/&quot;&gt;Storybook&lt;/a&gt; is an excellent tool for building demos
  9647. for React components.&lt;/li&gt;
  9648.  &lt;li&gt;&lt;strong&gt;Don’t over-optimize.&lt;/strong&gt; Wondering how much time you should spend improving
  9649. common tasks? &lt;a href=&quot;https://xkcd.com/1205/&quot;&gt;XKCD has an answer&lt;/a&gt;.&lt;/li&gt;
  9650. &lt;/ul&gt;
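&lt;p&gt;As a rough sketch of the pull request template and CODEOWNERS tip above, the two files might look something like this. The checklist wording, the paths and the &lt;code&gt;@example-org/frontend-core&lt;/code&gt; team handle are hypothetical placeholders, not our actual configuration:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;.github/PULL_REQUEST_TEMPLATE.md (illustrative checklist)

- [ ] Reusability: could this change live in a shared component?
- [ ] Translatability: are all user-facing strings ready for i18n?
- [ ] Style consistency: does the UI follow the shared patterns?
&lt;/code&gt;&lt;/pre&gt;

&lt;pre&gt;&lt;code&gt;# CODEOWNERS (hypothetical paths and team handle)
# Request a review from the infrastructure team for shared UI code
/packages/ui/   @example-org/frontend-core
*.scss          @example-org/frontend-core
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;When several CODEOWNERS patterns match the same file, GitHub applies the last matching one, so more specific rules belong towards the end of the file.&lt;/p&gt;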
  9651.  
  9652. &lt;h2 id=&quot;communicate-across-team-boundaries&quot;&gt;Communicate Across Team Boundaries&lt;/h2&gt;
  9653.  
  9654. &lt;p&gt;Internal open source projects rely heavily on communication for announcing new
  9655. features and breaking changes. One of the most important roles for an
  9656. infrastructure team is to share context and act as a relay between contributors.&lt;/p&gt;
  9657.  
  9658. &lt;p&gt;At AdRoll, the Frontend Core team holds a meeting every two weeks with engineers
  9659. and designers from all of our product teams. In the meeting, teams share updates
  9660. on any major front-end changes they are working on, even if they don’t directly
  9661. affect our infrastructure. A recurring meeting like this has proven to be very
  9662. useful for identifying surprising connections between teams.&lt;/p&gt;
  9663.  
  9664. &lt;p&gt;When a team mentions they are working on a new feature, another team might
  9665. mention that they already have a solution for it. In other cases, we have
  9666. identified pain points in the infrastructure that everyone feels but nobody has
  9667. tackled yet. These discussions often result in new tickets in our Frontend Core
  9668. backlog or updates to the development process.&lt;/p&gt;
  9669.  
  9670. &lt;p&gt;Conversations around front-end infrastructure can be challenging because UI
  9671. development involves so many different specialties. Designers, engineers and
  9672. product managers often use different words for the same things depending on
  9673. their background (e.g. “dropdown” vs. “pull-down menu” vs. “select input”). It
  9674. can also be difficult for a single developer to get a high-level overview of all
  9675. the resources available to them (e.g. global color variables and CSS classes).&lt;/p&gt;
  9676.  
  9677. &lt;p&gt;To establish a common conceptual language and help with discoverability, we
  9678. built a &lt;a href=&quot;http://ux.adroll.com/&quot;&gt;UX pattern library&lt;/a&gt; and released it as a
  9679. resource both inside and outside the company. To complement the pattern library,
  9680. we also have an internal site that lists all of our UI components and lets
  9681. developers compare their functionality over multiple versions (see video below).
  9682. These sites allow everyone in the company to get a complete picture of our
  9683. front-end infrastructure and use explicit URLs when referring to specific
  9684. UI elements.&lt;/p&gt;
  9685.  
  9686. &lt;figure&gt;
  9687.  &lt;div class=&quot;youtube-wrapper&quot;&gt;&lt;iframe class=&quot;youtube-embed&quot; width=&quot;574&quot; height=&quot;323&quot; src=&quot;http://www.youtube.com/embed/enZuX4RuSB8&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;&lt;/div&gt;
  9688.  &lt;figcaption&gt;Our internal site for demoing shared UI components, built using the &lt;a href=&quot;https://github.com/awslabs/aws-js-s3-explorer&quot; class=&quot;no-wrap&quot;&gt;AWS JavaScript S3 Explorer&lt;/a&gt;&lt;/figcaption&gt;
  9689. &lt;/figure&gt;
  9690.  
  9691. &lt;p&gt;Here are some ways you can improve communication around infrastructure:&lt;/p&gt;
  9692.  
  9693. &lt;ul&gt;
  9694.  &lt;li&gt;&lt;strong&gt;Set up a dedicated discussion channel for front-end infrastructure.&lt;/strong&gt; Public
  9695. Slack channels or email lists work well for this purpose. If necessary, set up
  9696. a separate support channel for urgent issues.&lt;/li&gt;
  9697.  &lt;li&gt;&lt;strong&gt;Communicate visually whenever possible.&lt;/strong&gt; Screenshots, screen captures and
  9698. live demos are worth a thousand chat messages.&lt;/li&gt;
  9699.  &lt;li&gt;&lt;strong&gt;Use &lt;a href=&quot;http://semver.org/&quot;&gt;Semantic Versioning&lt;/a&gt; when publishing shared
  9700. packages.&lt;/strong&gt; Proper versioning helps build trust between engineering teams.
  9701. Thinking in terms of breaking changes, new features and bug fixes is a helpful
  9702. framework for updates that affect user-facing functionality.&lt;/li&gt;
  9703.  &lt;li&gt;&lt;strong&gt;Help teams through breaking changes.&lt;/strong&gt; In an internal open source project,
  9704. you have a direct line of communication to every user. Do your best to make
  9705. breaking changes less painful by supporting the teams working on upgrading
  9706. components.&lt;/li&gt;
  9707.  &lt;li&gt;&lt;strong&gt;Be open to feedback and suggestions.&lt;/strong&gt; The infrastructure is your product
  9708. and engineers are your users. Encourage new engineers to question technical
  9709. choices to surface new ideas.&lt;/li&gt;
  9710. &lt;/ul&gt;
  9711.  
  9712. &lt;h2 id=&quot;look-for-common-problems-and-solutions&quot;&gt;Look for Common Problems and Solutions&lt;/h2&gt;
  9713.  
  9714. &lt;p&gt;Even with open and active channels of communication, pain points aren’t always
  9715. surfaced in an explicit way. When one of your coworkers faces an issue, they
  9716. might not be aware that it’s a common problem for others. A good maintainer of
  9717. front-end infrastructure is able to distill the chatter around UI development
  9718. into clear problem statements and work on solutions for the most common ones.&lt;/p&gt;
  9719.  
  9720. &lt;p&gt;Here are some typical messages and their implications from an infrastructure
  9721. point of view:&lt;/p&gt;
  9722.  
  9723. &lt;blockquote&gt;
  9724.  &lt;ul&gt;
  9725.    &lt;li&gt;&lt;em&gt;I wish we had…&lt;/em&gt;&lt;/li&gt;
  9726.    &lt;li&gt;&lt;em&gt;Since we don’t have X, we can’t…&lt;/em&gt;&lt;/li&gt;
  9727.    &lt;li&gt;&lt;em&gt;I’ve been meaning to build…&lt;/em&gt;&lt;/li&gt;
  9728.    &lt;li&gt;&lt;em&gt;Do we have support for X yet?&lt;/em&gt;&lt;/li&gt;
  9729.  &lt;/ul&gt;
  9730.  
  9731.  &lt;p&gt;Messages like these can be a sign that the infrastructure doesn’t support some
  9732. use cases. Reach out to the person or team to understand the context and see
  9733. if they could become a contributor.&lt;/p&gt;
  9734.  
  9735.  &lt;ul&gt;
  9736.    &lt;li&gt;&lt;em&gt;I’m seeing this error when…&lt;/em&gt;&lt;/li&gt;
  9737.    &lt;li&gt;&lt;em&gt;How do you run…&lt;/em&gt;&lt;/li&gt;
  9738.    &lt;li&gt;&lt;em&gt;I couldn’t figure out X&lt;/em&gt;&lt;/li&gt;
  9739.    &lt;li&gt;&lt;em&gt;Do you know what I’m doing wrong?&lt;/em&gt;&lt;/li&gt;
  9740.  &lt;/ul&gt;
  9741.  
  9742.  &lt;p&gt;When people mention errors or ask for help, it can either be a real bug or a
  9743. sign that the project isn’t documented well enough. Help the person resolve
  9744. the issue and see how you can update the instructions in your project
  9745. afterwards.&lt;/p&gt;
  9746.  
  9747.  &lt;ul&gt;
  9748.    &lt;li&gt;&lt;em&gt;It was taking forever so I…&lt;/em&gt;&lt;/li&gt;
  9749.    &lt;li&gt;&lt;em&gt;X is too simple so I had to…&lt;/em&gt;&lt;/li&gt;
  9750.    &lt;li&gt;&lt;em&gt;I couldn’t use that option so…&lt;/em&gt;&lt;/li&gt;
  9751.    &lt;li&gt;&lt;em&gt;This is hacky but…&lt;/em&gt;&lt;/li&gt;
  9752.  &lt;/ul&gt;
  9753.  
  9754.  &lt;p&gt;Software engineers take pride in working around issues. Take note when people
  9755. mention hacky solutions and improve the infrastructure accordingly. Official
  9756. solutions are easier to maintain over time than application-specific hacks.&lt;/p&gt;
  9757.  
  9758.  &lt;ul&gt;
  9759.    &lt;li&gt;&lt;em&gt;Oh, I forgot to run the tests again&lt;/em&gt;&lt;/li&gt;
  9760.    &lt;li&gt;&lt;em&gt;Joe, could you deploy this?&lt;/em&gt;&lt;/li&gt;
  9761.    &lt;li&gt;&lt;em&gt;I’m waiting for Ted to…&lt;/em&gt;&lt;/li&gt;
  9762.    &lt;li&gt;&lt;em&gt;Do you think that’s ready to merge?&lt;/em&gt;&lt;/li&gt;
  9763.  &lt;/ul&gt;
  9764.  
  9765.  &lt;p&gt;Procedural discussion around internal open source projects can be a sign that
  9766. the development process isn’t as streamlined as it could be. Consider
  9767. automating tests and training more people to act as maintainers.&lt;/p&gt;
  9768. &lt;/blockquote&gt;
  9769.  
  9770. &lt;p&gt;Over time, maintainers should automate the day-to-day tasks of their projects as
  9771. much as they can so they can focus on long-term improvements. The tasks that
  9772. can’t be automated should be well documented so they are easily repeatable. For
  9773. example, when reviewing changes to our shared UI components, we always look out
  9774. for changes in global namespaces – CSS classes, global
  9775. &lt;a href=&quot;http://sass-lang.com/&quot;&gt;SASS&lt;/a&gt; variables and &lt;a href=&quot;http://redux.js.org/&quot;&gt;Redux&lt;/a&gt;
  9776. actions.&lt;/p&gt;
  9777.  
  9778. &lt;p&gt;Documentation and guidance can be automated as well: Instead of referring people
  9779. to static documents like wiki pages, design a development workflow that gives
  9780. users enough instructions so they can solve problems on their own. &lt;a href=&quot;https://facebook.github.io/react/blog/2016/07/11/introducing-reacts-error-code-system.html&quot;&gt;React’s
  9781. error code
  9782. system&lt;/a&gt;
  9783. is a great example of self-documenting software: The library itself warns
  9784. developers if they are doing something wrong and points them to a helpful
  9785. documentation page.&lt;/p&gt;
  9786.  
  9787. &lt;p&gt;One of the most effective ways of keeping your infrastructure stable is
  9788. abstracting out parts of the codebase that most contributors don’t need to
  9789. modify. This means thinking of your development workflow as a series of &lt;a href=&quot;https://en.wikipedia.org/wiki/Black_box&quot;&gt;black
  9790. boxes&lt;/a&gt;. Just like you don’t typically
  9791. modify the source code of your code editor or web browser, the contributors in
  9792. infrastructure projects shouldn’t need to modify the build process or the test
  9793. harness.&lt;/p&gt;
  9794.  
  9795. &lt;p&gt;For us, one such improvement was automating the publish process for our shared
  9796. UI components. Previously, publishing a new version of a component required
  9797. multiple manual steps and only a few senior engineers had the credentials
  9798. required to run them. The results also varied slightly depending on each
  9799. publisher’s local environment. Once we moved the entire process to a
  9800. &lt;a href=&quot;https://jenkins.io/&quot;&gt;Jenkins&lt;/a&gt; job, we could publish a component just by
  9801. picking its name from a list and entering a version number. This has allowed
  9802. more contributors to publish changes on their own and reduced the workload on
  9803. our Frontend Core team.&lt;/p&gt;
  9804.  
  9805. &lt;figure&gt;
  9806.  &lt;img src=&quot;/images/post_images/frontend-infra-jenkins-publish.png&quot; alt=&quot;A screenshot from Slack showing notifications from Jenkins and GitHub&quot; /&gt;
  9807.  &lt;figcaption&gt;Slack notifications from the Jenkins job that publishes our shared UI components&lt;/figcaption&gt;
  9808. &lt;/figure&gt;
  9809.  
  9810. &lt;p&gt;Here are some of our tips for building self-documenting projects:&lt;/p&gt;
  9811.  
  9812. &lt;ul&gt;
  9813.  &lt;li&gt;&lt;strong&gt;Use type checking to enforce assumptions in your code.&lt;/strong&gt; Our shared UI
  9814. components make heavy use of &lt;a href=&quot;https://facebook.github.io/react/docs/typechecking-with-proptypes.html&quot;&gt;React
  9815. PropTypes&lt;/a&gt;.
  9816. We’re also experimenting with &lt;a href=&quot;https://www.typescriptlang.org/&quot;&gt;TypeScript&lt;/a&gt;.&lt;/li&gt;
  9817.  &lt;li&gt;&lt;strong&gt;Test and lint your code extensively.&lt;/strong&gt; Make sure tests and linters are run
  9818. automatically against every proposed change. Linters like
  9819. &lt;a href=&quot;https://eslint.org/&quot;&gt;ESLint&lt;/a&gt; and
  9820. &lt;a href=&quot;https://github.com/brigade/scss-lint&quot;&gt;scss-lint&lt;/a&gt; can even help improve the
  9821. runtime performance of your applications.&lt;/li&gt;
  9822.  &lt;li&gt;&lt;strong&gt;Log instructions and next steps in command-line tools.&lt;/strong&gt; For example, when
  9823. starting a development server, make sure the local URL is shown in the console
  9824. output.&lt;/li&gt;
  9825.  &lt;li&gt;&lt;strong&gt;Send notifications from build systems.&lt;/strong&gt; Contributors should be notified if
  9826. they break a build or a deployment. Maintainers should be cc’d so they can
  9827. spot recurring issues.&lt;/li&gt;
  9828.  &lt;li&gt;&lt;strong&gt;Write actionable error messages.&lt;/strong&gt; Don’t expect everyone to be familiar with
  9829. the latest front-end tools. Error pages in internal services should give users
  9830. ways to debug the issue.&lt;/li&gt;
  9831. &lt;/ul&gt;
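&lt;p&gt;As a rough sketch of the tips on logging next steps and writing actionable error messages, a command-line helper might fail like this. The example is in Python for brevity; the &lt;code&gt;load_config&lt;/code&gt; helper and the file names are hypothetical and not part of our tooling.&lt;/p&gt;

```python
import os


def load_config(path):
    """Read a config file, failing with a message that says what to do next."""
    if not os.path.exists(path):
        # Hypothetical file names: the point is that the error names the
        # problem and the next step instead of printing a bare traceback.
        raise SystemExit(
            f"Config file not found: {path}\n"
            "Next step: copy config.example.json to that path and fill in your settings."
        )
    with open(path, encoding="utf-8") as handle:
        return handle.read()
```

&lt;p&gt;The same idea applies to any tool: state what went wrong, then tell the user what to try next.&lt;/p&gt;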
  9832.  
  9833. &lt;h2 id=&quot;lower-the-barrier-to-success&quot;&gt;Lower the Barrier to Success&lt;/h2&gt;
  9834.  
  9835. &lt;p&gt;Success in UI development doesn’t just mean giving end users access to new
  9836. functionality; it also means that new features have to blend in seamlessly with
  9837. everything else in the product. Bad practices in UI development directly result
  9838. in a bad user experience. This puts a lot of pressure on front-end engineers and
  9839. forces them to think of shared infrastructure earlier in a company’s lifetime
  9840. compared to their back-end oriented colleagues.&lt;/p&gt;
  9841.  
  9842. &lt;p&gt;When an engineer builds something on top of existing infrastructure, they should
  9843. feel like they’re standing on the shoulders of giants – like they can achieve
  9844. something they didn’t know they could. Front-end infrastructure should enable
  9845. every engineer to build beautiful, user-friendly features that can be tested,
  9846. monitored and deployed easily.&lt;/p&gt;
  9847.  
  9848. &lt;p&gt;In order to remain useful, infrastructure has to grow with the company. In our
  9849. experience, the best way to guarantee that a project survives over time is to
  9850. focus on developer engagement. This is why we believe the open source process is
  9851. such a good match with software infrastructure: When people know they had a part
  9852. in building a project, they are more likely to care about it (also known as the
  9853. &lt;a href=&quot;https://en.wikipedia.org/wiki/IKEA_effect&quot;&gt;IKEA effect&lt;/a&gt;).&lt;/p&gt;
  9854.  
  9855. &lt;figure&gt;
  9856.  &lt;img src=&quot;/images/post_images/frontend-infra-ikea-instructions.jpg&quot; alt=&quot;An excerpt of IKEA&apos;s assembly instructions&quot; /&gt;
  9857.  &lt;figcaption&gt;IKEA helps people succeed with their famous assembly instructions&lt;/figcaption&gt;
  9858. &lt;/figure&gt;
  9859.  
  9860. &lt;p&gt;Here’s how you can keep developers engaged in collaborative projects:&lt;/p&gt;
  9861.  
  9862. &lt;ul&gt;
  9863.  &lt;li&gt;&lt;strong&gt;Be transparent about decision making.&lt;/strong&gt; Your fellow engineers should know
  9864. why something is built the way it is and why you chose a certain technology
  9865. over another.&lt;/li&gt;
  9866.  &lt;li&gt;&lt;strong&gt;Help people learn more about the infrastructure and how it is used.&lt;/strong&gt;
  9867. Encourage people to read design documents and source code. Open up access to
  9868. usage metrics and site analytics.&lt;/li&gt;
  9869.  &lt;li&gt;&lt;strong&gt;Make sure maintainers follow the same guidelines as everyone else.&lt;/strong&gt; Nobody
  9870. should be above code reviews or testing guidelines. When exceptions have to be
  9871. made, be open about the reasons behind them.&lt;/li&gt;
  9872.  &lt;li&gt;&lt;strong&gt;Highlight achievements on a personal level.&lt;/strong&gt; Celebrate wins from engineers
  9873. who make their first front-end contributions. Let people show their own work
  9874. in meetings through presentations and screencasts.&lt;/li&gt;
  9875.  &lt;li&gt;&lt;strong&gt;Let others lead and take responsibility.&lt;/strong&gt; Having contributors you can trust
  9876. is necessary for scaling the development process. Delegate the ownership of
  9877. specialized UI components to the teams that built them.&lt;/li&gt;
  9878. &lt;/ul&gt;
  9879.  
  9880. &lt;h2 id=&quot;keep-on-experimenting&quot;&gt;Keep on Experimenting&lt;/h2&gt;
  9881.  
  9882. &lt;p&gt;Almost everything we have learned about front-end infrastructure has been a
  9883. result of controlled experimentation. We’re constantly trying out promising new
  9884. technologies and improvements in our development process. Whenever we identify
  9885. different approaches to the same problem, we try out the most promising ones and
  9886. review the results afterwards.&lt;/p&gt;
  9887.  
  9888. &lt;p&gt;It is good to remember that even the best infrastructure comes with some
  9889. overhead. Every piece of shared code will eventually break or become outdated.
  9890. An important part of the experimentation mindset is questioning the choices you
  9891. have made and knowing when standardization is not the right approach.&lt;/p&gt;
  9892.  
  9893. &lt;figure&gt;
  9894.  &lt;img src=&quot;/images/post_images/frontend-infra-xkcd-1319-adapted.png&quot; alt=&quot;XKCD comic #1319 with &apos;automation&apos; changed to &apos;infrastructure&apos;&quot; /&gt;
  9895.  &lt;figcaption&gt;Adapted from &lt;a href=&quot;https://xkcd.com/1319/&quot;&gt;XKCD #1319 &quot;Automation&quot;&lt;/a&gt;&lt;/figcaption&gt;
  9896. &lt;/figure&gt;
  9897.  
  9898. &lt;p&gt;Some of our ongoing experiments in front-end infrastructure are:&lt;/p&gt;
  9899.  
  9900. &lt;ul&gt;
  9901.  &lt;li&gt;Replacing Browserify and Gulp with Webpack as a standard front-end build
  9902. process&lt;/li&gt;
  9903.  &lt;li&gt;Integrating &lt;a href=&quot;http://redux.js.org/&quot;&gt;Redux&lt;/a&gt; actions into our shared React
  9904. components&lt;/li&gt;
  9905.  &lt;li&gt;Extending our &lt;a href=&quot;http://ux.adroll.com/&quot;&gt;public style guide&lt;/a&gt;&lt;/li&gt;
  9906.  &lt;li&gt;Unifying usage analytics, exception tracking and A/B testing using a custom
  9907. JavaScript library&lt;/li&gt;
  9908.  &lt;li&gt;Standardizing integration testing with Enzyme (as
  9909. &lt;a href=&quot;https://speakerdeck.com/jessicagrist/redux-apps-with-enzyme&quot;&gt;presented&lt;/a&gt; at
  9910. the ReactJS SF Meetup on July 13th)&lt;/li&gt;
  9911.  &lt;li&gt;Building a &lt;a href=&quot;http://yeoman.io/&quot;&gt;Yeoman&lt;/a&gt; generator for bootstrapping web
  9912. applications&lt;/li&gt;
  9913.  &lt;li&gt;Using &lt;a href=&quot;https://storybook.js.org/&quot;&gt;Storybook&lt;/a&gt; for the live demos in our shared
  9914. UI component library&lt;/li&gt;
  9915. &lt;/ul&gt;
  9916.  
  9917. &lt;hr /&gt;
  9918.  
  9919. &lt;p&gt;Thanks for reading! We hope this post is helpful for people who are looking to
  9920. improve their UI development processes. If you would like to hear more about a
  9921. specific topic or share your experiences on maintaining front-end
  9922. infrastructure, don’t hesitate to post a comment below.&lt;/p&gt;
  9923.  
  9924. &lt;p&gt;PS: &lt;a href=&quot;https://www.adroll.com/about/careers&quot;&gt;We are hiring&lt;/a&gt; both remote and local
  9925. engineers!&lt;/p&gt;
  9926. </description>
  9927.    </item>
  9928.    
  9929.    
  9930.    
  9931.    <item>
  9932.      <title>TrailDB 0.6 Released</title>
  9933.      <link>https://tech.nextroll.com/blog/data/2017/05/15/traildb-0.6.html</link>
  9934.      <pubDate>Mon, 15 May 2017 00:00:00 -0700</pubDate>
  9935.      <author></author>
  9936.      <guid isPermaLink="false">https://tech.nextroll.com/blog/data/2017/05/15/traildb-0.6</guid>
  9937.      <description>&lt;p&gt;Today, almost a year since &lt;a href=&quot;http://tech.adroll.com/blog/data/2016/05/24/traildb-open-sourced.html&quot;&gt;the initial TrailDB open-source release&lt;/a&gt;,
  9938. we are happy to announce the next major version of TrailDB, 0.6. The
  9939. release is long overdue: it is packed with major new features
  9940. (and some minor bug fixes) that have been driven both by increasing
  9941. internal use of TrailDB at AdRoll, as well as requests and contributions
  9942. by the community.&lt;/p&gt;
  9943.  
  9944. &lt;p&gt;You can download and install the latest version by following
  9945. the &lt;a href=&quot;http://traildb.io/docs/getting_started/&quot;&gt;getting started&lt;/a&gt; guide.&lt;/p&gt;
  9946.  
  9947. &lt;h3 id=&quot;traildb-in-action&quot;&gt;TrailDB in Action&lt;/h3&gt;
  9948.  
  9949. &lt;p&gt;First, let’s start with some highlights about the usage of TrailDB.
  9950. At AdRoll, the amount of data stored in TrailDBs has been growing
  9951. exponentially. It took two years to reach the first trillion events
  9952. stored. Now we are routinely creating new TrailDBs storing trillions
  9953. of events every week. Besides the scale, TrailDBs are powering even
  9954. more critical services at AdRoll, including various machine learning
  9955. pipelines, large-scale data analysis, and our new &lt;a href=&quot;https://www.adroll.com/resources/guides-and-reports/past-present-future-of-attribution&quot;&gt;attribution engine&lt;/a&gt;.&lt;/p&gt;
  9956.  
  9957. &lt;p&gt;We want TrailDB to be a small, battle-hardened piece of
  9958. software which works with different languages and environments without
  9959. too much trouble. We have been preaching this gospel at a number of
  9960. events. For instance, see our presentation at &lt;a href=&quot;https://www.youtube.com/watch?v=qK43uLbnFtg&quot;&gt;an SF Data Mining meetup&lt;/a&gt; or
  9961. &lt;a href=&quot;https://www.youtube.com/watch?v=-oPFxSwn0lM&quot;&gt;a TrailDB tutorial at the PyData SF 2016 conference&lt;/a&gt;:&lt;/p&gt;
  9962.  
  9963. &lt;iframe src=&quot;//slides.com/villetuulos/traildb-tutorial-pydata-2016/embed&quot; width=&quot;576&quot; height=&quot;420&quot; scrolling=&quot;no&quot; frameborder=&quot;0&quot; webkitallowfullscreen=&quot;&quot; mozallowfullscreen=&quot;&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;
  9964.  
  9965. &lt;p&gt;Most important, we are excited to see TrailDB gaining traction
  9966. outside AdRoll. Thank you for all the raised issues, suggestions, and
  9967. contributions, such as the nascent language bindings for &lt;a href=&quot;https://github.com/poynt/traildb-node&quot;&gt;NodeJS&lt;/a&gt;. Recently, we adopted community-contributed &lt;a href=&quot;https://github.com/traildb/traildb-rust&quot;&gt;Rust bindings&lt;/a&gt;
  9968. under the official TrailDB organization in GitHub. If you have any
  9969. feedback or ideas about TrailDB, get in touch on &lt;a href=&quot;https://gitter.im/traildb/traildb&quot;&gt;the TrailDB Gitter channel&lt;/a&gt;.&lt;/p&gt;
  9970.  
  9971. &lt;h3 id=&quot;major-new-features&quot;&gt;Major New Features&lt;/h3&gt;
  9972.  
  9973. &lt;p&gt;You can see the full list of changes since the 0.5 release in &lt;a href=&quot;https://github.com/traildb/traildb/blob/master/CHANGELOG.md&quot;&gt;the
  9974. changelog&lt;/a&gt;. Here are some highlights:&lt;/p&gt;
  9975.  
  9976. &lt;h4 id=&quot;trck---a-highly-optimized-query-language-for-traildb&quot;&gt;Trck - a highly optimized query language for TrailDB&lt;/h4&gt;
  9977.  
  9978. &lt;p&gt;When we made the initial open-source release, we hinted, “We have
  9979. a number of tools built on top of TrailDB which make computing various
  9980. user-level metrics easier. We are planning to open-source some of
  9981. these tools in the future.” This finally happened in March, when we
  9982. open-sourced &lt;a href=&quot;http://tech.adroll.com/blog/data-science/2017/03/21/trck.html&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;trck&lt;/code&gt;, a highly optimized query language for TrailDB&lt;/a&gt;.&lt;/p&gt;
  9983.  
  9984. &lt;p&gt;In brief, if you have ever had a need to run queries on user-level funnels
  9985. like “how many users have first done A, then B within T seconds”, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;trck&lt;/code&gt; is
  9986. the query language for you. Almost all
  9987. systems powered by TrailDB at AdRoll use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;trck&lt;/code&gt; in one form or another.
  9988. It is hard to overstate its usefulness when your data is shaped like
  9989. TrailDB. To learn more, read the separate &lt;a href=&quot;http://tech.adroll.com/blog/data-science/2017/03/21/trck.html&quot;&gt;blog post about
  9990. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;trck&lt;/code&gt;&lt;/a&gt;
  9991. and its &lt;a href=&quot;https://github.com/traildb/trck/blob/master/README.md&quot;&gt;documentation&lt;/a&gt;.&lt;/p&gt;
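&lt;p&gt;To make the funnel question concrete, here is a minimal sketch in plain Python. It is not &lt;code&gt;trck&lt;/code&gt; syntax and does not use the TrailDB API; the in-memory &lt;code&gt;trails&lt;/code&gt; dictionary, the event names, and both helper functions are assumptions made for illustration only.&lt;/p&gt;

```python
from typing import Dict, List, Tuple

Event = Tuple[int, str]  # (timestamp in seconds, event name)


def did_a_then_b(events: List[Event], first: str, then: str, window: int) -> bool:
    """Return True if `then` occurs at most `window` seconds after some `first`."""
    last_first = None  # timestamp of the most recent `first` event seen so far
    for timestamp, name in sorted(events):
        if name == first:
            last_first = timestamp
        elif name == then and last_first is not None:
            if window >= timestamp - last_first:
                return True
    return False


def count_funnel(trails: Dict[str, List[Event]], first: str, then: str, window: int) -> int:
    """Count the users whose trail contains `first` followed by `then` in time."""
    return sum(1 for events in trails.values() if did_a_then_b(events, first, then, window))


# Hypothetical toy data: three users, keyed by a made-up identifier.
trails = {
    "user-1": [(100, "view"), (130, "click")],  # click 30 s after view
    "user-2": [(100, "view"), (500, "click")],  # click arrives too late
    "user-3": [(100, "click"), (120, "view")],  # events in the wrong order
}
print(count_funnel(trails, "view", "click", 60))  # prints 1
```

&lt;p&gt;A query language such as &lt;code&gt;trck&lt;/code&gt; exists so that questions like this do not have to be hand-written as loops for every new funnel.&lt;/p&gt;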
  9992.  
  9993. &lt;h4 id=&quot;filters-views-and-multi-cursors&quot;&gt;Filters, Views, and Multi-Cursors&lt;/h4&gt;
  9994.  
  9995. &lt;p&gt;Most new features in this release are related to creating &lt;a href=&quot;http://traildb.io/docs/technical_overview/#return-a-subset-of-events-with-event-filters&quot;&gt;event
  9996. filters&lt;/a&gt;
  9997. that select a subset of features or trails from TrailDB. You
  9998. can filter events by defining &lt;a href=&quot;http://traildb.io/docs/api/#filter-events&quot;&gt;a boolean expression over fields&lt;/a&gt;,
  9999. including &lt;a href=&quot;http://traildb.io/docs/api/#tdb_event_filter_add_time_range&quot;&gt;timestamps&lt;/a&gt;. See &lt;a href=&quot;http://slides.com/villetuulos/traildb-tutorial-pydata-2016#/47&quot;&gt;the PyData presentation for examples of filters&lt;/a&gt;.&lt;/p&gt;
  10000.  
  10001. &lt;p&gt;You can set filters to cover the whole TrailDB with &lt;a href=&quot;http://traildb.io/docs/api/#tdb_set_opt&quot;&gt;tdb_set_opt&lt;/a&gt; which, in effect, creates &lt;a href=&quot;http://traildb.io/docs/technical_overview/#return-a-subset-of-events-with-event-filters&quot;&gt;a view over the TrailDB&lt;/a&gt;
  10002. that can be &lt;a href=&quot;http://traildb.io/docs/technical_overview/#create-traildb-extracts-materialized-views&quot;&gt;materialized&lt;/a&gt;. You can also set the filters to cover only
  10003. individual trails, allowing fine-grained &lt;a href=&quot;http://traildb.io/docs/technical_overview/#whitelist-or-blacklist-trails-a-view-over-a-subset-of-trails&quot;&gt;whitelisting and blacklisting
  10004. of trails&lt;/a&gt;. Or you can &lt;a href=&quot;http://traildb.io/docs/api/#tdb_cursor_set_event_filter&quot;&gt;attach filters only to an individual cursor&lt;/a&gt;.&lt;/p&gt;
  10005.  
  10006. &lt;p&gt;&lt;a href=&quot;http://traildb.io/docs/api/#join-trails-with-multi-cursors&quot;&gt;Multi-cursors&lt;/a&gt; allow iterating over separate trails (users) as if they
  10007. were a single trail. Multi-cursors even work across multiple separate
  10008. TrailDBs. This feature was motivated by the fact that multiple UUIDs
  10009. may correspond to the same logical user and we want to query all events
  10010. related to the user, even if they were stored as separate trails.&lt;/p&gt;
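&lt;p&gt;Conceptually, a multi-cursor behaves like a merge of several time-sorted event streams. The sketch below illustrates that idea with the Python standard library only; it does not call the TrailDB API, and the trail contents and the &lt;code&gt;merged_events&lt;/code&gt; helper are made up for the example.&lt;/p&gt;

```python
import heapq


def merged_events(*trails):
    """Yield events from several time-sorted trails in one global time order."""
    # Each trail is assumed to be sorted by timestamp already, so a lazy
    # k-way merge is enough; nothing is re-sorted or copied.
    return heapq.merge(*trails, key=lambda event: event[0])


# Hypothetical trails for one logical user seen under two identifiers.
cookie_trail = [(100, "view"), (300, "purchase")]
mobile_trail = [(200, "click")]

joined = list(merged_events(cookie_trail, mobile_trail))
print(joined)  # [(100, 'view'), (200, 'click'), (300, 'purchase')]
```

&lt;p&gt;Because the merge is lazy, the combined stream can be consumed event by event, as if it were a single trail.&lt;/p&gt;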
  10011.  
  10012. &lt;h4 id=&quot;command-line-tool-improvements&quot;&gt;Command line tool improvements&lt;/h4&gt;
  10013.  
  10014. &lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tdb&lt;/code&gt; command line tool has improved in this release. Here are
  10015. the highlights:&lt;/p&gt;
  10016.  
  10017. &lt;ul&gt;
  10018.  &lt;li&gt;
  10019.    &lt;p&gt;Specify an event filter with the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--filter&lt;/code&gt; flag. You can speed
  10020. up filtered queries significantly by creating an index with
  10021. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tdb index&lt;/code&gt;.&lt;/p&gt;
  10022.  &lt;/li&gt;
  10023.  &lt;li&gt;
  10024.    &lt;p&gt;New &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tdb merge&lt;/code&gt; command for merging TrailDBs, even
  10025. when they have mismatching sets of fields.&lt;/p&gt;
  10026.  &lt;/li&gt;
  10027.  &lt;li&gt;
  10028.    &lt;p&gt;You can return a subset of trails with the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--uuids&lt;/code&gt; filter.&lt;/p&gt;
  10029.  &lt;/li&gt;
  10030. &lt;/ul&gt;
  10031.  
  10032. &lt;p&gt;Last but not least, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tdb&lt;/code&gt; functionality is now automatically tested
  10033. by Travis for every pull request.&lt;/p&gt;
  10034.  
  10035. &lt;h3 id=&quot;experimental-features&quot;&gt;Experimental Features&lt;/h3&gt;
  10036.  
  10037. &lt;p&gt;We want to keep the core TrailDB very stable and robust. At the same
  10038. time, it is fun and beneficial to experiment with new directions which
  10039. might find their way to the core eventually.&lt;/p&gt;
  10040.  
  10041. &lt;p&gt;A good example of this is a feature that allows you to &lt;a href=&quot;http://tech.adroll.com/blog/data/2016/11/29/traildb-mmap-s3.html&quot;&gt;query TrailDBs
  10042. directly from Amazon S3 without downloading them locally&lt;/a&gt;. This is
  10043. possible thanks to a relatively new feature in the Linux kernel,
  10044. user-space page fault handling, which allows us to download only
  10045. parts of TrailDB on demand with minimal changes to the TrailDB
  10046. codebase. This feature can reduce query latencies significantly if
  10047. your application needs to access only a subset of trails, events, or
  10048. fields.&lt;/p&gt;
  10049.  
  10050. &lt;p&gt;Another experimental feature is &lt;a href=&quot;https://github.com/traildb/reel&quot;&gt;Reel, an AWK-like query language for
  10051. TrailDB&lt;/a&gt;. As mentioned above, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;trck&lt;/code&gt; is our trusted workhorse for
  10052. expressing user-level queries. Reel was motivated by a particularly
  10053. complex query that needed to be executed over a trillion events. Although
  10054. it is not quite as mature as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;trck&lt;/code&gt;, you can easily embed and extend it
  10055. for your own use cases.&lt;/p&gt;
  10056.  
  10057. &lt;h3 id=&quot;future-roadmap&quot;&gt;Future Roadmap&lt;/h3&gt;
  10058.  
  10059. &lt;p&gt;TrailDB 0.6 is a robust data backend for applications that need to
  10060. execute complex computation over discrete events over time. As we
  10061. have emphasized before, we take the stability of the C API, ABI, and
  10062. especially the on-disk format very seriously. You can use the 0.6
  10063. release to read TrailDBs created with any previous version of the
  10064. software. This should hold true for any future version of TrailDB as
  10065. well.&lt;/p&gt;
  10066.  
  10067. &lt;p&gt;Our next big focus after this release is to optimize creation of
  10068. TrailDBs. As mentioned above, AdRoll creates multi-trillion event
  10069. TrailDBs weekly. Even when using spot instances on AWS, it costs
  10070. thousands of dollars to create these files based on raw log files. There
  10071. are some easy optimizations that are targeted for the next 0.7 release
  10072. which should drastically lower the cost of creating massive TrailDBs.&lt;/p&gt;
  10073.  
  10074. &lt;p&gt;Meanwhile, we hope that you enjoy the 0.6 release. If you have any
  10075. questions, comments, or contributions, you can reach us at
  10076. &lt;a href=&quot;https://gitter.im/traildb/traildb&quot;&gt;the TrailDB Gitter channel&lt;/a&gt;.&lt;/p&gt;
  10077.  
  10078. </description>
  10079.    </item>
  10080.    
  10081.    
  10082.    
  10083.    <item>
  10084.      <title>trck: querying discrete time series with state machines</title>
  10085.      <link>https://tech.nextroll.com/blog/data-science/2017/03/21/trck.html</link>
  10086.      <pubDate>Tue, 21 Mar 2017 00:00:00 -0700</pubDate>
  10087.      <author></author>
  10088.      <guid isPermaLink="false">https://tech.nextroll.com/blog/data-science/2017/03/21/trck</guid>
  10089.      <description>&lt;p&gt;Last year we released &lt;a href=&quot;http://tech.adroll.com/blog/data/2016/05/24/traildb-open-sourced.html&quot;&gt;TrailDB&lt;/a&gt;, a library we use extensively at AdRoll to efficiently store event data. Today we’re open sourcing &lt;a href=&quot;https://github.com/traildb/trck&quot;&gt;trck&lt;/a&gt;, a query engine complementary to TrailDB that we use to analyze trillions of discrete events in TrailDBs every day.&lt;/p&gt;
  10090.  
  10091. &lt;p&gt;As you may remember, the TrailDB data schema is very simple:&lt;/p&gt;
  10092.  
  10093. &lt;p&gt;&lt;img src=&quot;/images/post_images/trck_traildb_datamodel.png&quot; alt=&quot;traildb_datamodel&quot; /&gt;&lt;/p&gt;
  10094.  
  10095. &lt;p&gt;Every user (or “trail”) has a sequence of timestamped events associated with it. Every event has a number of fields, and the set of fields is fixed per TrailDB.&lt;/p&gt;
  10096.  
  10097. &lt;p&gt;For example, a dataset of Wikipedia edits may contain a user’s IP address and page title as fields:&lt;/p&gt;
  10098.  
  10099. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;c&quot;&gt;# tdb dump -j -i wikipedia-history-small.tdb | head&lt;/span&gt;
  10100. &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;uuid&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;86f7f07096e5a9f01509b2b2e9380000&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;time&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;1152738820&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;user&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;ip&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;81.152.155.52&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;title&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;ZZZap!&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
  10101. &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;uuid&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;86f7f07096e5a9f01509b2b2e9380000&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;time&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;1152739029&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;user&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;ip&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;81.152.155.52&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;title&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;ZZZap!&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
  10102. &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;uuid&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;86f7f07096e5a9f01509b2b2e9380000&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;time&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;1152739044&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;user&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;ip&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;81.152.155.52&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;title&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;ZZZap!&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
  10103. &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;uuid&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;86f7f07096e5a9f01509b2b2e9380000&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;time&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;1152739092&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;user&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;ip&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;81.152.155.52&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;title&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;ZZZap!&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
  10104. &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;uuid&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;86f7f07096e5a9f01509b2b2e9380000&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;time&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;1152739159&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;user&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;ip&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;81.152.155.52&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;title&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;ZZZap!&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
  10105. &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;uuid&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;86f7f07096e5a9f01509b2b2e9380000&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;time&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;1152739280&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;user&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;ip&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;81.152.155.52&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;title&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;ZZZap!&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
  10106. &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;uuid&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;2e0774e93410d504bc1720b8ad3d0000&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;time&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;1159025033&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;user&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;ip&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;219.110.177.197&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;title&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;Ip dip&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
  10107. &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;uuid&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;237589722f980f576dd156a29d440000&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;time&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;1109300460&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;user&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;ip&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;134.53.178.231&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;title&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;Joss sticks&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
  10108. &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;uuid&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;c0ced428660ed821de06446f10720000&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;time&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;1294656212&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;user&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;ip&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;180.193.58.138&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;title&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;Indian Rebellion of 1857&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
  10109. &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;uuid&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;221aeaa8b065796bc568e026c1b60000&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;time&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;1273111183&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;user&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;ip&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;70.111.4.54&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;title&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;Dwight-Englewood School&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  10110.  
  10111. &lt;p&gt;Right now, if you want to query this data, you can use the &lt;a href=&quot;https://github.com/traildb/traildb-python&quot;&gt;Python&lt;/a&gt; or &lt;a href=&quot;https://github.com/traildb/traildb-r&quot;&gt;R bindings&lt;/a&gt; for TrailDB. For example, to count the number of user sessions in a TrailDB, we could write this in Python:&lt;/p&gt;
  10112.  
  10113. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;sys&lt;/span&gt;
  10114. &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;traildb&lt;/span&gt;
  10115.  
  10116. &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;__name__&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;__main__&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  10117.    &lt;span class=&quot;n&quot;&gt;db&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;traildb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;TrailDB&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sys&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;argv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
  10118.    &lt;span class=&quot;n&quot;&gt;sessions&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
  10119.    
  10120.    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cookie&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;trail&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;db&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;trails&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
  10121.        &lt;span class=&quot;n&quot;&gt;last_timestamp&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
  10122.        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;event&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;trail&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  10123.            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;last_timestamp&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;30&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;60&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  10124.                &lt;span class=&quot;n&quot;&gt;sessions&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
  10125.    
  10126.            &lt;span class=&quot;n&quot;&gt;last_timestamp&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;
  10127.    
  10128.    &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sessions&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  10129.  
  10130. &lt;p&gt;Here, a user session is defined as a series of events no more than 30 minutes apart. If we run this script on a sample database from the &lt;a href=&quot;http://traildb.io/docs/tutorial/&quot;&gt;TrailDB tutorial&lt;/a&gt;, it processes &lt;a href=&quot;http://traildb.io/data/wikipedia-history-small.tdb&quot;&gt;wikipedia-history-small.tdb&lt;/a&gt; in about 100 seconds on my laptop.&lt;/p&gt;
  10131.  
  10132. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;time &lt;/span&gt;python example.py wikipedia-history-small.tdb
  10133. 1774765
  10134.  
  10135. real 1m40.645s
  10136. user 1m31.032s
  10137. sys 0m2.963s&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  10138.  
  10139. &lt;p&gt;It doesn’t sound like a lot, but that database is pretty tiny: it contains 6.5M events and 450K trails, and takes about 100MB on disk. We routinely use multi-gigabyte TrailDBs containing billions and trillions of events.&lt;/p&gt;
  10140.  
  10141. &lt;p&gt;At large scale, Python struggles to process all the data quickly enough. The bottleneck is not so much the Python interpreter itself: even if you use &lt;a href=&quot;https://pypy.org/&quot;&gt;PyPy&lt;/a&gt; or a completely different language, there is still substantial overhead just from decoding and marshalling event data, since it is so granular.&lt;/p&gt;
  10142.  
  10143. &lt;p&gt;That made us realize that it would be nice to have a way to query TrailDBs that bypasses all that marshalling. After all, many queries are not so complicated as to require a full-blown Turing-complete language to express them. Conceptually, something like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;grep&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;awk&lt;/code&gt;, or a regular expression engine would do, except that it has to work on structured events with fields and have a concept of time.&lt;/p&gt;
  10144.  
  10145. &lt;h4 id=&quot;trck-regular-expressions-for-traildbs&quot;&gt;trck: “regular expressions” for TrailDBs&lt;/h4&gt;
  10146.  
  10147. &lt;p&gt;One thing to note is that many queries on event-based data have much in common. Usually you want to keep some kind of minimal state per trail (that’d be &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;last_timestamp&lt;/code&gt; in the above example) and then depending on what events you see down the line, you take some kind of action, such as incrementing a session counter.&lt;/p&gt;
  10148.  
  10149. &lt;p&gt;That sounds very similar to regular expressions, or, speaking more generally, finite state machines. For example, the above script can be expressed as a state machine:&lt;/p&gt;
  10150.  
  10151. &lt;p&gt;&lt;img src=&quot;/images/post_images/trck_statem.png&quot; alt=&quot;statem&quot; /&gt;&lt;/p&gt;
  10152.  
  10153. &lt;p&gt;We start at the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;start&lt;/code&gt; state. Then, when we see an event in the trail, we transition to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sess&lt;/code&gt; state, and on any following event we keep coming back to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sess&lt;/code&gt;. If we sit in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sess&lt;/code&gt; for 30 minutes and nothing happens, we transition back to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;start&lt;/code&gt; (and increment the session counter).&lt;/p&gt;
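&lt;p&gt;As a rough illustration, the two-state machine can be simulated in a few lines of plain Python. This is only a sketch under stated assumptions, not how the compiled matcher works: the function name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;count_sessions&lt;/code&gt; is hypothetical, timestamps are assumed to be sorted Unix seconds, and a timeout that is still pending when the trail ends is assumed to fire.&lt;/p&gt;

```python
# Minimal sketch of the start/sess state machine for a single trail.
# Assumptions: timestamps are sorted Unix seconds, and a timeout that is
# still pending when the trail ends is allowed to fire.
TIMEOUT = 30 * 60  # 30 minutes, in seconds


def count_sessions(timestamps):
    state = "start"
    last_seen = None
    count = 0
    for t in timestamps:
        # The timeout edge: sess -> start, incrementing the counter.
        if state == "sess" and t - last_seen >= TIMEOUT:
            count += 1
            state = "start"
        # The event edges: start -> sess and sess -> sess.
        state = "sess"
        last_seen = t
    # Let the pending timeout fire at the end of the trail.
    if state == "sess":
        count += 1
    return count


print(count_sessions([0, 60, 120, 4000, 4100, 9000]))  # prints 3
```

&lt;p&gt;The only per-trail state is the current state name and the last timestamp seen, which is what makes this kind of query cheap to evaluate.&lt;/p&gt;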
  10154.  
  10155. &lt;p&gt;And that’s what our small query language called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;trck&lt;/code&gt; does: you can express your query as a state machine, with some actions, like incrementing a counter, attached to edges.&lt;/p&gt;
  10156.  
  10157. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-erlang&quot; data-lang=&quot;erlang&quot;&gt;&lt;span class=&quot;n&quot;&gt;start&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  10158.    &lt;span class=&quot;k&quot;&gt;receive&lt;/span&gt;
  10159.        &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sess&lt;/span&gt;
  10160.  
  10161. &lt;span class=&quot;n&quot;&gt;sess&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  10162.    &lt;span class=&quot;k&quot;&gt;receive&lt;/span&gt;
  10163.        &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sess&lt;/span&gt;
  10164.    &lt;span class=&quot;k&quot;&gt;after&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;30&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;m&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;yield&lt;/span&gt; &lt;span class=&quot;sc&quot;&gt;$c&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ount&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  10165.  
  10166. &lt;p&gt;Then we can easily compile this to C code that works with TrailDB directly. If we save the above snippet as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;wikipedia_example.tr&lt;/code&gt;, call the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;trck&lt;/code&gt; compiler to produce a binary, and then run it on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;wikipedia-history-small.tdb&lt;/code&gt;:&lt;/p&gt;
  10167.  
  10168. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;./bin/trck &lt;span class=&quot;nt&quot;&gt;-c&lt;/span&gt; wikipedia_example.tr
  10169. Compiling wikipedia_example.tr
  10170. Produced binary &lt;span class=&quot;k&quot;&gt;in &lt;/span&gt;matcher-traildb &lt;span class=&quot;k&quot;&gt;in &lt;/span&gt;0.54 seconds with clang-omp[openmp]
  10171.  
  10172. &lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;time&lt;/span&gt; ./matcher-traildb wikipedia-history-small.tdb 2&amp;gt;/dev/null
  10173. &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$count&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;: 1774765 &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
  10174. real 0m0.813s
  10175. user 0m2.819s
  10176. sys 0m0.179s&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  10177.  
  10178. &lt;p&gt;Note that the evaluation of a state machine can also be trivially parallelized, as there are almost no data dependencies between state machines for different users. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;trck&lt;/code&gt; compiler employs &lt;a href=&quot;http://www.openmp.org/&quot;&gt;OpenMP&lt;/a&gt; for that.&lt;/p&gt;
  10179.  
  10180. &lt;p&gt;As a result, this example produces the same output as the Python script, except that it processes &lt;a href=&quot;http://traildb.io/data/wikipedia-history-small.tdb&quot;&gt;wikipedia-history-small.tdb&lt;/a&gt; in about 800ms, roughly 100x faster. If you run it on a much larger dataset, &lt;a href=&quot;http://traildb.io/data/wikipedia-history.tdb&quot;&gt;wikipedia-history.tdb&lt;/a&gt;, which contains the full edit log of Wikipedia for fifteen years (5GB, 600M events), it takes about 90 seconds on my laptop and scales very well with the number of cores. On a reasonably sized 16-core EC2 instance, the same query runs in 40 seconds.&lt;/p&gt;
  10181.  
  10182. &lt;p&gt;There is a lot more you can do with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;trck&lt;/code&gt;: for example, in addition to counters it supports set and multiset types. There are also nested timeouts on states and some interesting higher-level optimizations that allow it to skip parts of trails that have no chance of altering the state. Check out the &lt;a href=&quot;https://github.com/traildb/trck&quot;&gt;README.md&lt;/a&gt; in the repository.&lt;/p&gt;
  10183.  
  10184. &lt;h4 id=&quot;implementation&quot;&gt;Implementation&lt;/h4&gt;
  10185.  
  10186. &lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;trck&lt;/code&gt; compiler front end is implemented in Python using &lt;a href=&quot;http://www.dabeaz.com/ply/&quot;&gt;PLY&lt;/a&gt;. It takes the state machine and compiles it to C code, which is then linked with the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;trck&lt;/code&gt; runtime library (written in C99), producing a static binary. It is tested on Linux and OS X, and it is now available under the MIT license on &lt;a href=&quot;https://github.com/traildb/trck&quot;&gt;GitHub&lt;/a&gt;.&lt;/p&gt;
  10187.  
  10188. &lt;p&gt;If you have questions, you’re welcome to join our &lt;a href=&quot;https://gitter.im/traildb/traildb&quot;&gt;TrailDB Gitter channel&lt;/a&gt;.&lt;/p&gt;
  10189. </description>
  10190.    </item>
  10191.    
  10192.    
  10193.    
  10194.    <item>
  10195.      <title>Thompson Sampling and Bayesian Factorization Machines</title>
  10196.      <link>https://tech.nextroll.com/blog/data-science/2017/03/06/thompson-sampling-bayesian-factorization-machines.html</link>
  10197.      <pubDate>Mon, 06 Mar 2017 00:00:00 -0800</pubDate>
  10198.      <author></author>
  10199.      <guid isPermaLink="false">https://tech.nextroll.com/blog/data-science/2017/03/06/thompson-sampling-bayesian-factorization-machines</guid>
  10200.      <description>&lt;p&gt;On the Data Science Engineering team, we are constantly working to improve the
  10201. machine learning systems powering AdRoll’s products. We have recently started
  10202. investigating Thompson sampling and Bayesian Factorization Machines as a way to
  10203. ensure efficient exploration of the ad marketplace. In this post, I will
  10204. illustrate why such exploration is necessary, and then dive into some of the
  10205. math and algorithms required to make it work. Credit for inspiring this project
  10206. goes to the paper &lt;a href=&quot;http://research.cs.rutgers.edu/~lihong/pub/Chapelle12Empirical.pdf&quot;&gt;“An Empirical Evaluation of Thompson Sampling,”&lt;/a&gt;
  10207. by Olivier Chapelle and Lihong Li. We worked through a derivation of their
  10208. approach, and then extended it to work with our &lt;a href=&quot;http://tech.adroll.com/blog/data-science/2015/08/25/factorization-machines.html&quot;&gt;Factorization Machines&lt;/a&gt;
  10209. model.&lt;/p&gt;
  10210.  
  10211. &lt;h2 id=&quot;motivation&quot;&gt;Motivation&lt;/h2&gt;
  10212.  
  10213. &lt;p&gt;One of the central problems we solve at AdRoll is deciding which ad to show to
  10214. a given user browsing a given page on the Web. We have to make these decisions
  10215. in an environment that is constantly changing, as new ads get added to our
  10216. system, old ads get retired, and users appear, disappear, or change their
  10217. browsing behaviors. In mathematical terms, we say that the distribution we are
  10218. trying to model is &lt;em&gt;non-stationary&lt;/em&gt;.&lt;/p&gt;
  10219.  
  10220. &lt;p&gt;This non-stationary regime means that there is always some uncertainty about
  10221. which decision is optimal, and each sequential decision gives us additional
  10222. information that we can use to make better decisions in the future. In other
  10223. words, we have a classical &lt;a href=&quot;https://webdocs.cs.ualberta.ca/~sutton/book/ebook/node7.html&quot;&gt;&lt;em&gt;exploration vs. exploitation tradeoff&lt;/em&gt;&lt;/a&gt;
  10224. from reinforcement learning.&lt;/p&gt;
  10225.  
  10226. &lt;p&gt;To illustrate this, consider a simple example where we have two ads to choose
  10227. from: Ad 1 has a predicted conversion probability of 0.123, and we have shown
  10228. this ad many times before, so we are fairly certain in our estimate. Ad 2 has a
  10229. predicted conversion probability of 0.111, but we’ve only shown it a few times,
  10230. so its true value could be a bit higher or a bit lower. We have two options:&lt;/p&gt;
  10231.  
  10232. &lt;ul&gt;
  10233.  &lt;li&gt;&lt;em&gt;Exploit&lt;/em&gt; what we already know, and choose Ad 1 because it has a higher
  10234. predicted conversion probability.&lt;/li&gt;
  10235.  &lt;li&gt;&lt;em&gt;Explore&lt;/em&gt; and choose Ad 2, in order to get more knowledge about it and
  10236. improve our decisions in the future.&lt;/li&gt;
  10237. &lt;/ul&gt;
  10238.  
  10239. &lt;p&gt;After sufficient exploration, the conversion probability of Ad 2 could turn out
  10240. to be 0.100, in which case Ad 1 is much better and the exploration did not pay
  10241. off. On the other hand, Ad 2’s probability could turn out to be 0.130, in which
  10242. case it is better than Ad 1, and we wouldn’t have discovered this without
  10243. taking a chance on Ad 2. Exploration allows us to make better-informed
  10244. decisions going forward, but there is no way to tell ahead of time whether we
  10245. will find a better ad this way.&lt;/p&gt;
  10246.  
  10247. &lt;p&gt;It is clear that some amount of exploration is necessary, otherwise our system
  10248. would always show old ads known to perform well, and never show any new ads
  10249. that our customers upload. On the other hand, it is also clear that excessive
  10250. exploration is wasteful: once we have done enough exploration to determine that
  10251. Ad 1 is much better than Ad 2, there is no need to continue showing Ad 2
  10252. anymore.&lt;/p&gt;
  10253.  
  10254. &lt;p&gt;One naïve explore/exploit strategy is to always explore on X% of traffic, by
  10255. showing a random ad, and always exploit on the remaining (100 - X)% of traffic,
  10256. by showing the best-known ad. This &lt;a href=&quot;https://webdocs.cs.ualberta.ca/~sutton/book/ebook/node16.html&quot;&gt;epsilon-greedy strategy&lt;/a&gt; is
  10257. simple to implement, but it is hard to tell if it performs too much or too
  10258. little exploration. Ideally, we would like a system that adjusts the amount of
  10259. exploration dynamically, based on how uncertain it is about the value of each
  10260. decision.&lt;/p&gt;
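&lt;p&gt;For reference, the epsilon-greedy rule described above can be sketched as follows. The function name and the list of per-ad estimates are hypothetical and only serve the illustration; this is not our production code.&lt;/p&gt;

```python
import random


def epsilon_greedy(estimates, epsilon, rng=random):
    """Pick an ad index: explore with probability epsilon, else exploit."""
    if rng.random() >= epsilon:
        # Exploit: show the ad with the best current estimate.
        return max(range(len(estimates)), key=lambda i: estimates[i])
    # Explore: show a uniformly random ad.
    return rng.randrange(len(estimates))


# Ad 1 (index 0) and Ad 2 (index 1) from the example, exploring on 10% of traffic.
rng = random.Random(0)
picks = [epsilon_greedy([0.123, 0.111], 0.1, rng) for _ in range(1000)]
print(picks.count(0), picks.count(1))
```

&lt;p&gt;Note that the exploration rate stays fixed no matter how much evidence has accumulated, which is exactly the weakness discussed above.&lt;/p&gt;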
  10261.  
  10262. &lt;p&gt;Now that we understand the problem intuitively, it’s time to formalize it and
  10263. describe our solution.&lt;/p&gt;
  10264.  
  10265. &lt;h2 id=&quot;contextual-bandits&quot;&gt;Contextual Bandits&lt;/h2&gt;
  10266.  
  10267. &lt;p&gt;Our problem can be formulated as a &lt;a href=&quot;https://en.wikipedia.org/wiki/Multi-armed_bandit#Contextual_Bandit&quot;&gt;contextual multi-armed bandit&lt;/a&gt;,
  10268. where we are playing a sequential decision game. At each iteration, we get a
  10269. &lt;em&gt;context vector&lt;/em&gt; \(x\), which is a feature vector containing all the relevant
  10270. information about the user and the page they are visiting. We choose an action
  10271. \(a\) from a set of available actions \(A\), which in our case is the set of
  10272. ads to show. Some time later, we receive a reward \(r\), which is 1 if the user
  10273. converted, and 0 otherwise. Our goal is to choose actions in a way that
  10274. maximizes cumulative reward.&lt;/p&gt;
  10275.  
  10276. &lt;p&gt;One obvious way to approach action selection is to build a predictive model of
  10277. the reward given a context and an action: \(P(r|x,a)\). We can use our existing
  10278. &lt;a href=&quot;http://tech.adroll.com/blog/data-science/2015/08/25/factorization-machines.html&quot;&gt;Factorization Machines&lt;/a&gt; (FM) model to do this, by just merging the context
  10279. features \(x\) and the ad features \(a\) into a single feature vector. At each
  10280. iteration, we would “score” each action \(a\) by computing \(P(r|x,a)\) using
  10281. our model, and then we’d have to decide whether to explore or exploit.&lt;/p&gt;
  10282.  
  10283. &lt;h2 id=&quot;thompson-sampling&quot;&gt;Thompson Sampling&lt;/h2&gt;
  10284.  
  10285. &lt;p&gt;It turns out that by taking a Bayesian approach, we can solve our
  10286. explore/exploit dilemma in a very elegant way. Consider our model from before,
  10287. \(P(r|x,a)\). This model is parameterized by some set of parameters \(\theta\),
  10288. for example, the set of weights in logistic regression or FM. In a non-Bayesian
  10289. setting, we have some optimization algorithm that finds a maximum-likelihood
  10290. estimate for \(\theta\), which is equivalent to minimizing logistic loss on the
  10291. training set. In a Bayesian setting, we place a prior \(P(\theta)\) on the
  10292. parameters, and use Bayes’ Rule to express the posterior after observing some
  10293. training data:&lt;/p&gt;
  10294.  
  10295. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  10296. %&lt;![CDATA[
  10297. P(\theta|x_{1:N},a_{1:N},r_{1:N}) \propto P(\theta) \prod_{i=1}^N P(r_i|x_i,a_i,\theta)
  10298. %]]&gt;
  10299. &lt;/script&gt;
  10300.  
  10301. &lt;p&gt;We could use this to obtain the posterior predictive distribution \(P(r|x,a)\) by
  10302. integrating over the unobserved \(\theta\), but that is not our goal. Instead,
  10303. we will use the simple but powerful idea of &lt;a href=&quot;https://en.wikipedia.org/wiki/Thompson_sampling&quot;&gt;&lt;em&gt;Thompson sampling&lt;/em&gt;&lt;/a&gt; to
  10304. solve our explore/exploit dilemma. This consists of two steps:&lt;/p&gt;
  10305.  
  10306. &lt;ol&gt;
  10307.  &lt;li&gt;Sample a set of parameters according to the posterior after seeing the
  10308. training data: \(\theta&apos; \sim P(\theta|x_{1:N},a_{1:N},r_{1:N})\).&lt;/li&gt;
  10309.  &lt;li&gt;Choose the action that is optimal with respect to the sampled set of
  10310. parameters: \(\hat{a} = \mathop{\arg\max}_a P(r|x,a,\theta&apos;)\).&lt;/li&gt;
  10311. &lt;/ol&gt;
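
&lt;p&gt;The two steps above can be sketched in Python. This is a minimal illustration under assumptions that are not part of the description so far: the model is a linear logistic model, the posterior over each weight is an independent normal distribution given by a mean and a precision, and each ad is a short feature list appended to the context features. All names are hypothetical.&lt;/p&gt;

```python
import math
import random


def sigmoid(f):
    return 1.0 / (1.0 + math.exp(-f))


def thompson_choose(context, ads, means, precisions, rng=random):
    """Return the index of the ad that is best under one posterior sample.

    Assumes a linear logistic model whose weights have independent normal
    posteriors N(means[j], 1 / precisions[j]); each ad is a feature list
    that is simply appended to the context features.
    """
    # Step 1: sample one set of parameters from the (approximate) posterior.
    theta = [rng.gauss(m, 1.0 / math.sqrt(s)) for m, s in zip(means, precisions)]

    # Step 2: score every ad under the sampled parameters, take the arg max.
    def score(ad):
        features = context + ad
        return sigmoid(sum(w * x for w, x in zip(theta, features)))

    return max(range(len(ads)), key=lambda i: score(ads[i]))


# One context feature plus a one-hot ad indicator; low precisions mean
# the choice varies from call to call.
rng = random.Random(1)
ads = [[1.0, 0.0], [0.0, 1.0]]
print([thompson_choose([1.0], ads, [0.0, 0.2, 0.1], [1.0, 4.0, 4.0], rng)
       for _ in range(10)])
```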
  10312.  
  10313. &lt;p&gt;For an intuitive understanding of Thompson sampling, consider what would happen
  10314. if we had very little training data. The posterior over parameters would be
  10315. broad, and the \(\theta&apos;\) samples in each iteration would have high variance.
  10316. Thus, there would be a lot of randomness in the chosen actions \(\hat{a}\),
  10317. which is exactly the definition of exploration. As we collect more training
  10318. data, the posterior over parameters becomes more peaked, and both the samples
  10319. of \(\theta&apos;\) and the chosen actions \(\hat{a}\) become more stable, which is
  10320. exactly the definition of exploitation. To summarize, the system starts out
  10321. with a lot of exploration, and automatically adjusts towards more exploitation
  10322. as it becomes more certain in its belief about the parameters.&lt;/p&gt;
  10323.  
  10324. &lt;p&gt;Notice that in our introductory example, we talked about exploration at the &lt;em&gt;ad
  10325. level&lt;/em&gt;: Ad 1 had a high conversion rate with high certainty, while Ad 2 had a
  10326. lower conversion rate but with a lot of uncertainty. Thompson sampling is a
  10327. little more subtle, because it performs exploration at the &lt;em&gt;parameter level&lt;/em&gt;;
  10328. it will automatically explore to decrease uncertainty about the parameters in
  10329. the model. This is an advantage in our application, since we use the &lt;a href=&quot;https://en.wikipedia.org/wiki/Feature_hashing&quot;&gt;hashing
  10330. trick&lt;/a&gt; to keep our parameter space fixed, while ads appear and
  10331. disappear all the time.&lt;/p&gt;
  10332.  
  10333. &lt;p&gt;Now that we understand how Thompson sampling works, let’s see how to represent
  10334. and update the posterior over parameters in the case of logistic regression and
  10335. factorization machines.&lt;/p&gt;
  10336.  
  10337. &lt;h2 id=&quot;common-notation&quot;&gt;Common Notation&lt;/h2&gt;
  10338.  
  10339. &lt;p&gt;We have a training set of \(N\) examples \((\mathbf{x}_i, y_i)\), where the
  10340. feature vector is a \(D\)-dimensional vector: \(\mathbf{x}_i \in \mathbb{R}^D\)
  10341. and the label is binary: \(y_i \in \{0, 1\}\). We will use \(i\) to index
  10342. examples whenever possible.&lt;/p&gt;
  10343.  
  10344. &lt;p&gt;For a linear model (traditional logistic regression), the parameters \(\theta\)
  10345. are just a weight vector \(\mathbf{w} \in \mathbb{R}^D\). We will use \(j\) to
  10346. index into the weight vector whenever possible.&lt;/p&gt;
  10347.  
  10348. &lt;p&gt;For a factorization machines (FM) model, the parameters \(\theta\) are a vector
  10349. \(\mathbf{w} \in \mathbb{R}^D\) as above, and also a matrix \(\mathbf{V} \in
  10350. \mathbb{R}^{D \times K}\), where \(K\) is the number of factors. We will use
  10351. \(j\) to index into \(\mathbf{w}\) and \(jk\) to index into \(\mathbf{V}\)
  10352. whenever possible (we omit the comma when we have multiple indices).&lt;/p&gt;
  10353.  
  10354. &lt;p&gt;We define \(\mu_i\) to be the probability of a positive label for example
  10355. \(i\):&lt;/p&gt;
  10356.  
  10357. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  10358. %&lt;![CDATA[
  10359. \mu_i ~=~ p(y_i=1 | \mathbf{x}_i) ~=~ \frac{1}{1 + \exp(-F(\mathbf{x}_i))}
  10360. %]]&gt;
  10361. &lt;/script&gt;
  10362.  
  10363. &lt;p&gt;where \(F\) is a &lt;em&gt;kernel&lt;/em&gt;, or &lt;em&gt;model equation&lt;/em&gt;. For the linear case, \(F\) is
  10364. just a dot product of the weights with the features, and for factorization
  10365. machines, \(F\) captures second-order interactions:&lt;/p&gt;
  10366.  
  10367. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  10368. %&lt;![CDATA[
  10369. \begin{align}
  10370. F_\text{linear}(\mathbf{x}_i) &amp;= \sum_{j=1}^D{ w_j x_{ij} } \\
  10371. F_\text{FM}(\mathbf{x}_i) &amp;=
  10372.    \sum_{j=1}^D{ w_j x_{ij} } +
  10373.    \sum_{a=1}^D{ \sum_{b=a+1}^D{ \langle \mathbf{V}_a, \mathbf{V}_b \rangle x_{ia} x_{ib} } }
  10374. \end{align}
  10375. %]]&gt;
  10376. &lt;/script&gt;
  10377.  
  10378. &lt;p&gt;(For simplicity, we omit the bias term in each model.)&lt;/p&gt;
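
&lt;p&gt;As a concrete reading of the two model equations, here is a direct and deliberately unoptimized Python transcription. The function names are ours, and the double loop over feature pairs mirrors the formula rather than any efficient implementation.&lt;/p&gt;

```python
import math


def f_linear(w, x):
    """Linear kernel: dot product of weights and features."""
    return sum(w_j * x_j for w_j, x_j in zip(w, x))


def f_fm(w, V, x):
    """FM kernel: linear term plus pairwise interactions (no bias term)."""
    D = len(x)
    pairwise = 0.0
    for a in range(D):
        for b in range(a + 1, D):
            # Inner product of the factor vectors for features a and b.
            dot = sum(v_a * v_b for v_a, v_b in zip(V[a], V[b]))
            pairwise += dot * x[a] * x[b]
    return f_linear(w, x) + pairwise


def mu(f_value):
    """Probability of a positive label given the kernel output."""
    return 1.0 / (1.0 + math.exp(-f_value))


w = [1.0, 2.0]
V = [[1.0, 0.0], [1.0, 0.0]]  # D = 2 features, K = 2 factors
x = [1.0, 1.0]
print(f_linear(w, x), f_fm(w, V, x), mu(0.0))  # prints 3.0 4.0 0.5
```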
  10379.  
  10380. &lt;p&gt;In both cases, the log-likelihood of the training set is given by:&lt;/p&gt;
  10381.  
  10382. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  10383. %&lt;![CDATA[
  10384. \text{loglik}(\theta) = \sum_{i=1}^N{( y_i \ln \mu_i + (1-y_i) \ln (1-\mu_i) )}
  10385. %]]&gt;
  10386. &lt;/script&gt;
  10387.  
  10388. &lt;p&gt;where the \(\mu_i\) terms use either \(F_\text{linear}\) or \(F_\text{FM}\)
  10389. depending on the model.&lt;/p&gt;
  10390.  
  10391. &lt;h2 id=&quot;the-laplace-approximation&quot;&gt;The Laplace Approximation&lt;/h2&gt;
  10392.  
  10393. &lt;p&gt;For a Bayesian treatment of logistic regression, we want to put a prior on the
  10394. parameters \(\theta\) and compute the posterior given observed data. We choose
  10395. a simple prior where the parameters are independent, and each parameter is
  10396. drawn from a normal distribution with its own mean \(m\) and precision \(s\):&lt;/p&gt;
  10397.  
  10398. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  10399. %&lt;![CDATA[
  10400. \begin{align}
  10401. w_j &amp;\sim \mathcal{N}(m_j, s_j^{-1}) \\
  10402. v_{jk} &amp;\sim \mathcal{N}(m_{jk}, s_{jk}^{-1})
  10403. \end{align}
  10404. %]]&gt;
  10405. &lt;/script&gt;
  10406.  
  10407. &lt;p&gt;Using Bayes’ Rule, the posterior is proportional to the product of the prior and the likelihood:&lt;/p&gt;
  10408.  
  10409. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  10410. %&lt;![CDATA[
  10411. p(\mathbf{w} | \mathbf{x}_{1..N}, y_{1..N}) ~\propto~
  10412. p(\mathbf{w}) \cdot p(y_{1..N} | \mathbf{w}, \mathbf{x}_{1..N})
  10413. %]]&gt;
  10414. &lt;/script&gt;
  10415.  
  10416. &lt;p&gt;Unfortunately, this posterior is not a normal distribution. But we can
  10417. approximate it with a normal distribution by using the Laplace approximation
  10418. (see section 4.4 in Bishop’s &lt;a href=&quot;https://www.amazon.com/Pattern-Recognition-Learning-Information-Statistics/dp/0387310738&quot;&gt;“Pattern Recognition and Machine
  10419. Learning”&lt;/a&gt; textbook). That way, the approximated posterior will have
  10420. the same form as the prior, allowing us to iteratively update our belief about
  10421. the parameters.&lt;/p&gt;
  10422.  
  10423. &lt;p&gt;The Laplace approximation is a technique for approximating an arbitrary
  10424. distribution \(p(z) = c^{-1} f(z)\) (where \(f\) is an arbitrary function and
  10425. \(c\) is a normalization term) with a normal distribution \(q(z) =
  10426. \mathcal{N}(z_0, A^{-1})\) where \(z_0\) is a mode of \(f(z)\), and \(A\) is
  10427. the negative second derivative of \(\ln f(z)\), evaluated at \(z_0\):&lt;/p&gt;
  10428.  
  10429. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  10430. %&lt;![CDATA[
  10431. \begin{align}
  10432. \frac{\mathrm{d} f(z)}{\mathrm{d} z} &amp;\Bigr|_{z=z_0} = 0 \\
  10433. A = \frac{-\mathrm{d}^2 \ln f(z)}{\mathrm{d} z^2} &amp;\Bigr|_{z=z_0}
  10434. \end{align}
  10435. %]]&gt;
  10436. &lt;/script&gt;
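
&lt;p&gt;To see the recipe in action on a case that can be worked by hand, take the unnormalized Beta density \(f(z) = z^{a-1}(1-z)^{b-1}\) with \(a, b &amp;gt; 1\). This example is ours and is not part of the model in this post. Setting the derivative of \(\ln f(z)\) to zero gives the mode \(z_0 = (a-1)/(a+b-2)\), and the negative second derivative of \(\ln f(z)\) is \(A = (a-1)/z_0^2 + (b-1)/(1-z_0)^2\). A short Python sketch:&lt;/p&gt;

```python
def laplace_beta(a, b):
    """Laplace approximation of f(z) = z**(a-1) * (1-z)**(b-1), a, b > 1.

    Returns (z0, A): the mode of f and the negative second derivative of
    ln f at the mode, so that q(z) = N(z0, 1 / A).
    """
    z0 = (a - 1.0) / (a + b - 2.0)
    A = (a - 1.0) / z0 ** 2 + (b - 1.0) / (1.0 - z0) ** 2
    return z0, A


z0, A = laplace_beta(3.0, 5.0)
print(z0, A, 1.0 / A)  # mode 1/3, precision about 27, variance about 0.037
```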
  10437.  
  10438. &lt;p&gt;The first step is to find a mode of the posterior distribution. We can do this
  10439. by finding a maximum a posteriori (MAP) estimate of the parameters. This is
  10440. equivalent to maximizing the log-posterior:&lt;/p&gt;
  10441.  
  10442. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  10443. %&lt;![CDATA[
  10444. \text{logpost}(\theta) = \ln p(\theta) + \text{loglik}(\theta)
  10445. %]]&gt;
  10446. &lt;/script&gt;
  10447.  
  10448. &lt;p&gt;The second step is to find the \(A\) term, which is the negative second
  10449. derivative of \(\text{logpost}(\theta)\) with respect to each parameter, evaluated at the
  10450. MAP estimate. (We keep things simple by forcing our parameters to be independent.)&lt;/p&gt;
  10451.  
  10452. &lt;p&gt;Before making these calculations concrete for the linear and FM kernels, we
  10453. show a general result that will greatly simplify those calculations. Consider
  10454. the partial derivative of \(\text{loglik}(\theta)\) with respect to some
  10455. variable \(z\):&lt;/p&gt;
  10456.  
  10457. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  10458. %&lt;![CDATA[
  10459. \frac{\partial}{\partial z} \text{loglik}(\theta) ~=~
  10460. \sum_{i=1}^N{
  10461.    \left( \frac{y_i}{\mu_i} - \frac{1-y_i}{1-\mu_i} \right)
  10462.    \frac{\partial \mu_i}{\partial F(\mathbf{x}_i)}
  10463.    \frac{\partial F(\mathbf{x}_i)}{\partial z}
  10464. }
  10465. %]]&gt;
  10466. &lt;/script&gt;
  10467.  
  10468. &lt;p&gt;Note that \(\mu_i\) is a logistic function of \(F(\mathbf{x}_i)\), with the
  10469. useful property that \(\partial \mu_i / \partial F(\mathbf{x}_i) = \mu_i (1 -
  10470. \mu_i)\). Therefore our derivative simplifies to:&lt;/p&gt;
  10471.  
  10472. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  10473. %&lt;![CDATA[
  10474. \frac{\partial}{\partial z} \text{loglik}(\theta) ~=~
  10475. \sum_{i=1}^N{ (y_i - \mu_i) \frac{\partial F(\mathbf{x}_i)}{\partial z} }
  10476. %]]&gt;
  10477. &lt;/script&gt;
  10478.  
  10479. &lt;p&gt;We now take the second derivative:&lt;/p&gt;
  10480.  
  10481. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  10482. %&lt;![CDATA[
  10483. \frac{\partial^2}{\partial z^2} \text{loglik}(\theta) ~=~
  10484. \sum_{i=1}^N{\left(
  10485.    - \mu_i (1 - \mu_i) \left(\frac{\partial F(\mathbf{x}_i)}{\partial z}\right)^2
  10486.    + (y_i - \mu_i) \frac{\partial^2 F(\mathbf{x}_i)}{\partial z^2}
  10487. \right)}
  10488. %]]&gt;
  10489. &lt;/script&gt;
  10490.  
  10491. &lt;p&gt;To make this concrete for the linear and FM kernels, we only need to evaluate
  10492. the terms \(\frac{\partial F(\mathbf{x}_i)}{\partial z}\) and
  10493. \(\frac{\partial^2 F(\mathbf{x}_i)}{\partial z^2}\) for the relevant model
  10494. equation \(F\) and parameter \(z\), and plug them into the equation above.&lt;/p&gt;
  10495.  
  10496. &lt;h2 id=&quot;bayesian-logistic-regression&quot;&gt;Bayesian Logistic Regression&lt;/h2&gt;
  10497.  
  10498. &lt;p&gt;For logistic regression with the traditional linear kernel, the log-posterior
  10499. is:&lt;/p&gt;
  10500.  
  10501. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  10502. %&lt;![CDATA[
  10503. \begin{align}
  10504. \text{logpost}(\mathbf{w}) &amp;~=~ \ln p(\mathbf{w}) + \text{loglik}(\mathbf{w}) \nonumber \\
  10505.    &amp;~=~
  10506.    \text{const}
  10507.    - \frac{1}{2} \sum_{j=1}^D{ s_j (w_j - m_j)^2 }
  10508.    + \sum_{i=1}^N{( y_i \ln \mu_i + (1-y_i) \ln (1-\mu_i) )}
  10509. \tag{1}
  10510. \end{align}
  10511. %]]&gt;
  10512. &lt;/script&gt;
  10513.  
  10514. &lt;p&gt;With a bit of work, the negative second derivative of the log-posterior turns out to be:&lt;/p&gt;
  10515.  
  10516. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  10517. %&lt;![CDATA[
  10518. \begin{align}
  10519. \label{a-lr}
  10520. A_j = - \frac{\partial^2}{\partial w_j^2} \text{logpost}(\mathbf{w})
  10521. &amp;~=~
  10522.    - \frac{\partial^2}{\partial w_j^2} \ln p(\mathbf{w})
  10523.    - \frac{\partial^2}{\partial w_j^2} \text{loglik}(\mathbf{w}) \nonumber \\
  10524. &amp;~=~
  10525.    s_j + \sum_{i=1}^N{ \mu_i (1 - \mu_i) x_{ij}^2 }
  10526. \tag{2}
  10527. \end{align}
  10528. %]]&gt;
  10529. &lt;/script&gt;
  10530.  
  10531. &lt;p&gt;We now have everything we need to express the Bayesian Logistic Regression
  10532. algorithm. It maintains the following invariant: after processing \(t\) batches,
  10533. the weights are distributed as \(w_j \sim \mathcal{N}(m_j^{(t)}, 1/s_j^{(t)})\),
  10534. which is an approximation of the true posterior given all the data observed so
  10535. far. In the algorithm below, we update the means \(m_j\) and precisions \(s_j\) in
  10536. place, so we omit the superscripts \((t)\).&lt;/p&gt;
  10537.  
  10538. &lt;hr /&gt;
  10539.  
  10540. &lt;p&gt;&lt;strong&gt;Algorithm: Bayesian Logistic Regression&lt;/strong&gt;&lt;/p&gt;
  10541.  
  10542. &lt;ul&gt;
  10543.  &lt;li&gt;Initialize the prior on each weight \(w_j\) with \(m_j = 0\), \(s_j = \lambda\).&lt;/li&gt;
  10544.  &lt;li&gt;For each new batch of training data \((\mathbf{x}_i, y_i)\) for \(i = 1 .. N\):
  10545.    &lt;ul&gt;
  10546.      &lt;li&gt;Find \(\hat{\mathbf{w}}\) maximizing equation (1) by numerical optimization.&lt;/li&gt;
  10547.      &lt;li&gt;Compute \(A_j\) for each weight according to equation (2).&lt;/li&gt;
  10548.      &lt;li&gt;Update the weight distribution: \(m_j \gets \hat{w}_j\) and \(s_j \gets A_j\).&lt;/li&gt;
  10549.    &lt;/ul&gt;
  10550.  &lt;/li&gt;
  10551. &lt;/ul&gt;
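
&lt;p&gt;The batch update above can be sketched in a few lines of Python. This is a minimal sketch, not the production implementation: it assumes plain fixed-step gradient ascent as a stand-in for the numerical optimizer (the post does not say which optimizer is used), and a score \(F(\mathbf{x}_i) = \sum_j w_j x_{ij}\) with no separate bias term. The names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;blr_update&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;steps&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lr&lt;/code&gt; are illustrative.&lt;/p&gt;

```python
import math


def sigmoid(f):
    # Logistic function mapping a score F(x) to a probability mu.
    return 1.0 / (1.0 + math.exp(-f))


def blr_update(X, y, m, s, steps=500, lr=0.05):
    """One batch update: returns the new means and precisions.

    X is a list of feature vectors, y a list of 0/1 labels, and
    m, s are the current prior means and precisions per weight.
    """
    D = len(m)
    w = list(m)  # start the search at the prior mean
    for _ in range(steps):
        # Gradient of equation (1): the prior term, plus the
        # likelihood term sum_i (y_i - mu_i) x_ij.
        grad = [-s[j] * (w[j] - m[j]) for j in range(D)]
        for xi, yi in zip(X, y):
            mu = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)))
            for j in range(D):
                grad[j] += (yi - mu) * xi[j]
        # Assumed optimizer: fixed-step gradient ascent.
        w = [w[j] + lr * grad[j] for j in range(D)]
    # Equation (2): A_j = s_j + sum_i mu_i (1 - mu_i) x_ij^2 at the mode.
    A = list(s)
    for xi in X:
        mu = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)))
        for j in range(D):
            A[j] += mu * (1.0 - mu) * xi[j] ** 2
    return w, A
```

&lt;p&gt;Calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;blr_update&lt;/code&gt; once per batch and feeding the returned means and precisions back in as the next prior reproduces the loop in the algorithm. Each precision can only grow, since every term added in equation (2) is non-negative.&lt;/p&gt;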
  10552.  
  10553. &lt;hr /&gt;
  10554.  
  10555. &lt;p&gt;Our priors have a natural interpretation. By starting with the prior
  10556. \(w_j \sim \mathcal{N}(0, \lambda^{-1})\), our objective function in equation
  10557. (1) is exactly the same as for (non-Bayesian) Logistic Regression with an L2
  10558. regularizer \(\lambda\).&lt;/p&gt;
  10559.  
  10560. &lt;p&gt;The algorithm above matches algorithm 3 derived by Chapelle and Li,
  10561. confirming that we have derived the Laplace approximation correctly. The only
  10562. difference is one of notation: our labels are in \(\{0, 1\}\) while their labels
  10563. are in \(\{-1, 1\}\), so their objective function has a slightly different form.&lt;/p&gt;
  10564.  
  10565. &lt;h2 id=&quot;bayesian-factorization-machines&quot;&gt;Bayesian Factorization Machines&lt;/h2&gt;
  10566.  
  10567. &lt;p&gt;For logistic regression with the FM kernel, the log-posterior is:&lt;/p&gt;
  10568.  
  10569. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  10570. %&lt;![CDATA[
  10571. \begin{align}
  10572. \text{logpost}(\mathbf{w}, \mathbf{V}) &amp;~=~ \ln p(\mathbf{w}, \mathbf{V}) + \text{loglik}(\mathbf{w}, \mathbf{V}) \nonumber \\
  10573.    &amp;~=~
  10574.    \text{const}
  10575.    - \frac{1}{2} \sum_{j=1}^D{ s_j (w_j - m_j)^2 }
  10576.    - \frac{1}{2} \sum_{j=1}^D \sum_{k=1}^K{ s_{jk} (v_{jk} - m_{jk})^2 } \nonumber \\
  10577.    &amp;\phantom{~=~ const}
  10578.    + \sum_{i=1}^N{( y_i \ln \mu_i + (1-y_i) \ln (1-\mu_i) )}
  10579. \tag{3}
  10580. \end{align}
  10581. %]]&gt;
  10582. &lt;/script&gt;
  10583.  
  10584. &lt;p&gt;Since our parameters are the weights \(w_j\) and the \(\mathbf{V}\) matrix
  10585. entries \(v_{jk}\), we need second derivatives with respect to both of these in
  10586. order to use the Laplace approximation.&lt;/p&gt;
  10587.  
  10588. &lt;p&gt;The negative second derivative of the log-posterior w.r.t. \(w_j\) has the same form as
  10589. for the linear kernel, except that the \(\mu_i\) terms now use the FM kernel:&lt;/p&gt;
  10590.  
  10591. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  10592. %&lt;![CDATA[
  10593. \begin{align}
  10594. A_j = - \frac{\partial^2}{\partial w_j^2} \text{logpost}(\mathbf{w}, \mathbf{V})
  10595. &amp;~=~
  10596.    - \frac{\partial^2}{\partial w_j^2} \ln p(\mathbf{w}, \mathbf{V})
  10597.    - \frac{\partial^2}{\partial w_j^2} \text{loglik}(\mathbf{w}, \mathbf{V}) \nonumber \\
  10598. &amp;~=~
  10599.    s_j + \sum_{i=1}^N{ \mu_i (1 - \mu_i) x_{ij}^2 }
  10600. \tag{4}
  10601. \end{align}
  10602. %]]&gt;
  10603. &lt;/script&gt;
  10604.  
  10605. &lt;p&gt;The negative second derivative of the log-posterior w.r.t. \(v_{jk}\) turns out to be a
  10606. bit more complicated, but still easily computed with a single additional pass
  10607. through the data:&lt;/p&gt;
  10608.  
  10609. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  10610. %&lt;![CDATA[
  10611. \begin{align}
  10612. A_{jk} = - \frac{\partial^2}{\partial v_{jk}^2} \text{logpost}(\mathbf{w}, \mathbf{V})
  10613. &amp;~=~
  10614.    - \frac{\partial^2}{\partial v_{jk}^2} \ln p(\mathbf{w}, \mathbf{V})
  10615.    - \frac{\partial^2}{\partial v_{jk}^2} \text{loglik}(\mathbf{w}, \mathbf{V}) \nonumber \\
  10616. &amp;~=~
  10617.    s_{jk} + \sum_{i=1}^N{ \mu_i (1 - \mu_i) \left(
  10618.        \frac{\partial F_\text{FM}(\mathbf{x}_i)}{\partial v_{jk}} \right)^2 }
  10619. \tag{5}
  10620. \end{align}
  10621. %]]&gt;
  10622. &lt;/script&gt;
  10623.  
  10624. &lt;p&gt;where&lt;/p&gt;
  10625.  
  10626. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  10627. %&lt;![CDATA[
  10628. \begin{equation}
  10629. \frac{\partial}{\partial v_{jk}} F_\text{FM}(\mathbf{x}_i) ~=~
  10630.    \left( \sum_{j&apos;=1}^D{ v_{j&apos;k} x_{ij&apos;} } \right) x_{ij} - v_{jk} x_{ij}^2
  10631. \end{equation}
  10632. %]]&gt;
  10633. &lt;/script&gt;
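
&lt;p&gt;Note that the \(v_{jk} x_{ij}^2\) contribution inside the sum cancels against the subtracted term, so this first derivative does not depend on \(v_{jk}\) itself. Hence \(\partial^2 F_\text{FM}(\mathbf{x}_i) / \partial v_{jk}^2 = 0\), which is why equation (5) has no \((y_i - \mu_i)\) term. The Python sketch below illustrates this, assuming the standard factorization machine model equation (a bias, a linear term, and pairwise interactions weighted by dot products of the rows of \(\mathbf{V}\)). The function names are illustrative.&lt;/p&gt;

```python
def fm_score(x, w0, w, V):
    """F_FM(x) for one example, assuming the standard FM model equation."""
    D, K = len(x), len(V[0])
    linear = w0 + sum(w[j] * x[j] for j in range(D))
    pairwise = 0.0
    for k in range(K):
        # Sum over pairs, rewritten as half of
        # (square of the sum) minus (sum of the squares).
        total = sum(V[j][k] * x[j] for j in range(D))
        squares = sum((V[j][k] * x[j]) ** 2 for j in range(D))
        pairwise += 0.5 * (total ** 2 - squares)
    return linear + pairwise


def fm_grad_v(x, V, j, k):
    """dF/dv_jk = (sum_j' v_j'k x_j') x_j - v_jk x_j^2."""
    total = sum(V[jp][k] * x[jp] for jp in range(len(x)))
    return total * x[j] - V[j][k] * x[j] ** 2
```

&lt;p&gt;Because \(F_\text{FM}\) is affine in each individual \(v_{jk}\), a central finite difference of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;fm_score&lt;/code&gt; should agree with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;fm_grad_v&lt;/code&gt; up to rounding error.&lt;/p&gt;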
  10634.  
  10635. &lt;p&gt;We show the Bayesian Factorization Machines algorithm below. It maintains the
  10636. following invariant: after processing \(t\) batches, the weights are
  10637. distributed as \(w_j \sim \mathcal{N}(m_j^{(t)}, 1/s_j^{(t)})\) and the
  10638. elements of \(\mathbf{V}\) are distributed as \(v_{jk} \sim
  10639. \mathcal{N}(m_{jk}^{(t)}, 1/s_{jk}^{(t)})\), which is an approximation of the
  10640. true posterior given all the data observed so far. In the algorithm below, we
  10641. update the means \(m_j\), \(m_{jk}\) and precisions \(s_j\), \(s_{jk}\) in
  10642. place, so we omit the superscripts \((t)\).&lt;/p&gt;
  10643.  
  10644. &lt;hr /&gt;
  10645.  
  10646. &lt;p&gt;&lt;strong&gt;Algorithm: Bayesian Factorization Machines&lt;/strong&gt;&lt;/p&gt;
  10647.  
  10648. &lt;ul&gt;
  10649.  &lt;li&gt;Initialize the prior on each weight \(w_j\) with \(m_j = 0\), \(s_j = \lambda_\mathbf{w}\).&lt;/li&gt;
  10650.  &lt;li&gt;Initialize the prior on each \(\mathbf{V}\) element \(v_{jk}\) with \(m_{jk} = 0\), \(s_{jk} = \lambda_\mathbf{V}\).&lt;/li&gt;
  10651.  &lt;li&gt;For each new batch of training data \((\mathbf{x}_i, y_i)\) for \(i = 1 .. N\):
  10652.    &lt;ul&gt;
  10653.      &lt;li&gt;Find \(\hat{\mathbf{w}}\), \(\hat{\mathbf{V}}\) maximizing equation (3) by numerical optimization.&lt;/li&gt;
  10654.      &lt;li&gt;Compute \(A_j\) for each weight according to equation (4).&lt;/li&gt;
  10655.      &lt;li&gt;Compute \(A_{jk}\) for each element of \(\mathbf{V}\) according to equation (5).&lt;/li&gt;
  10656.      &lt;li&gt;Update the weight distribution: \(m_j \gets \hat{w}_j\) and \(s_j \gets A_j\).&lt;/li&gt;
  10657.      &lt;li&gt;Update the \(\mathbf{V}\) matrix distribution: \(m_{jk} \gets \hat{v}_{jk}\) and \(s_{jk} \gets A_{jk}\).&lt;/li&gt;
  10658.    &lt;/ul&gt;
  10659.  &lt;/li&gt;
  10660. &lt;/ul&gt;
  10661.  
  10662. &lt;hr /&gt;
  10663.  
  10664. &lt;p&gt;As for the linear kernel, our priors have a natural interpretation. By
  10665. starting with the priors \(w_j \sim \mathcal{N}(0, \lambda_\mathbf{w}^{-1})\)
  10666. and \(v_{jk} \sim \mathcal{N}(0, \lambda_\mathbf{V}^{-1})\), our objective
  10667. function in equation (3) is exactly the same as for (non-Bayesian)
  10668. Factorization Machines with L2 regularizers \(\lambda_\mathbf{w}\) on
  10669. \(\mathbf{w}\) and \(\lambda_\mathbf{V}\) on \(\mathbf{V}\).&lt;/p&gt;
  10670.  
  10671. &lt;h2 id=&quot;final-words&quot;&gt;Final Words&lt;/h2&gt;
  10672.  
  10673. &lt;p&gt;We have implemented Thompson sampling and Bayesian Factorization Machines, and
  10674. we are testing these methods to efficiently navigate between exploration and
  10675. exploitation. We can fine-tune the desired amount of exploration by scaling the
  10676. linear precisions \(s_j\) and FM precisions \(s_{jk}\) by two respective
  10677. constants. This might be useful, for example, if we change how much training
  10678. data we use or how many free parameters are in our model, and we want to
  10679. preserve a certain amount of exploration. Ultimately, these techniques help us
  10680. ensure that we are using our customers’ budgets in the most efficient way
  10681. possible.&lt;/p&gt;
  10682.  
  10683. &lt;p&gt;I’d like to return to our motivating example and show how Thompson sampling
  10684. works in such a scenario. We evaluate two candidate ads, showing their
  10685. non-Thompson conversion probability in red, and a histogram of 10,000
  10686. Thompson-sampled conversion probabilities in blue.&lt;/p&gt;
  10687.  
  10688. &lt;p&gt;&lt;img src=&quot;/images/post_images/thompson-query-1.png&quot; alt=&quot;hll-perf&quot; /&gt;&lt;/p&gt;
  10689.  
  10690. &lt;p&gt;&lt;img src=&quot;/images/post_images/thompson-query-2.png&quot; alt=&quot;hll-perf&quot; /&gt;&lt;/p&gt;
  10691.  
  10692. &lt;p&gt;The predicted conversion rate is 0.120 for the first ad and 0.080 for the
  10693. second ad.  So a system with no exploration would always pick the first ad.
  10694. But the Thompson samples show that there is a lot more uncertainty about the
  10695. conversion rate of the second ad. (The standard deviation is 0.045 for the
  10696. first ad and 0.093 for the second ad.) This means that Thompson sampling will
  10697. pick the second ad some fraction of the time, thus performing exploration. Once
  10698. the variances shrink enough that the choice of best ad is unambiguous, Thompson
  10699. sampling will automatically start exploiting that ad.&lt;/p&gt;
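
&lt;p&gt;The selection step described above can be sketched in Python as follows. This is a simplified illustration that assumes the linear kernel and the independent Gaussian posterior \(w_j \sim \mathcal{N}(m_j, 1/s_j)\). With the FM kernel, the entries of \(\mathbf{V}\) would be sampled in the same way. The function name and the representation of each ad as a feature vector are assumptions made for the example.&lt;/p&gt;

```python
import math
import random


def thompson_pick(ads, m, s, rng):
    """Return the index of the ad with the highest sampled probability.

    ads is a list of feature vectors, m and s are the posterior means
    and precisions, and rng is a random.Random instance.
    """
    # Draw one weight vector from the approximate posterior.
    # The standard deviation is 1 / sqrt(precision).
    w = [rng.gauss(m[j], 1.0 / math.sqrt(s[j])) for j in range(len(m))]
    # Score every candidate ad with the same sampled weights.
    probs = [
        1.0 / (1.0 + math.exp(-sum(wj * xj for wj, xj in zip(w, x))))
        for x in ads
    ]
    return max(range(len(ads)), key=probs.__getitem__)
```

&lt;p&gt;Repeating the call draws fresh weights each time, so an ad with a lower mean but a wider posterior is still chosen some fraction of the time. As the precisions grow, the samples concentrate around the means and the choice becomes deterministic.&lt;/p&gt;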
  10700.  
  10701. &lt;p&gt;This has been a really fun project, and the combination of math and engineering
  10702. involved is a good illustration of the type of work that we do on the Data
  10703. Science Engineering team. If this piques your interest, check out our &lt;a href=&quot;https://www.adroll.com/about/careers&quot;&gt;open
  10704. positions&lt;/a&gt;!&lt;/p&gt;
  10705.  
  10706. </description>
  10707.    </item>
  10708.    
  10709.    
  10710.    
  10711.    <item>
  10712.      <title>Querying Data in Amazon S3 Directly with User-Space Page Fault Handling</title>
  10713.      <link>https://tech.nextroll.com/blog/data/2016/11/29/traildb-mmap-s3.html</link>
  10714.      <pubDate>Tue, 29 Nov 2016 00:00:00 -0800</pubDate>
  10715.      <author></author>
  10716.      <guid isPermaLink="false">https://tech.nextroll.com/blog/data/2016/11/29/traildb-mmap-s3</guid>
  10717.      <description>&lt;p&gt;&lt;em&gt;tl;dr&lt;/em&gt; AdRoll’s real-time bidding system produces tens of billions
  10718. of events daily. We store these events in &lt;a href=&quot;http://traildb.io&quot;&gt;TrailDBs&lt;/a&gt;,
  10719. a fast embedded database that &lt;a href=&quot;http://tech.adroll.com/blog/data/2016/05/24/traildb-open-sourced.html&quot;&gt;we recently
  10720. open-sourced&lt;/a&gt;.
  10721. By leveraging &lt;a href=&quot;https://lwn.net/Articles/636226/&quot;&gt;a cool new feature in the Linux kernel&lt;/a&gt;,
  10722. you can now query TrailDBs directly in &lt;a href=&quot;https://aws.amazon.com/s3/&quot;&gt;Amazon S3&lt;/a&gt; without
  10723. having to store the database locally, like here:&lt;/p&gt;
  10724.  
  10725. &lt;script type=&quot;text/javascript&quot; src=&quot;https://asciinema.org/a/2v7egtu98dxx60224iqqcfhx6.js&quot; id=&quot;asciicast-2v7egtu98dxx60224iqqcfhx6&quot; async=&quot;&quot; data-font-size=&quot;80%&quot;&gt;&lt;/script&gt;
  10726.  
  10727. &lt;h2 id=&quot;tens-of-trillions-of-events-in-amazon-s3&quot;&gt;Tens of Trillions of Events in Amazon S3&lt;/h2&gt;
  10728.  
  10729. &lt;p&gt;We have been very happy with &lt;a href=&quot;http://tech.adroll.com/blog/data/2015/09/22/data-pipelines-docker.html&quot;&gt;our data processing
  10730. pipeline&lt;/a&gt;,
  10731. built on top of Docker, Luigi, Spot Instances, and Amazon
  10732. S3. It allows us to process hundreds of terabytes of data stored in TrailDBs
  10733. cost-efficiently with low operational overhead.&lt;/p&gt;
  10734.  
  10735. &lt;p&gt;A typical job at AdRoll downloads a set of TrailDB shards from S3,
  10736. processes them, and pushes results back to S3. This model works fine
  10737. for batch jobs that need to query a large amount of data, similar to a
  10738. full table scan in RDBMS. Over time, our TrailDB shards have grown in
  10739. size (more rows) and they include a wider variety of information (more
  10740. columns). Consequently, an increasing number of queries need to access
  10741. only a subset of rows and columns.&lt;/p&gt;
  10742.  
  10743. &lt;p&gt;Having to download gigabytes of unneeded data to find the proverbial
  10744. needle in a haystack, expressed as a highly selective query, is quite
  10745. expensive. A more efficient solution would download only the specific
  10746. bytes that matter for the query. Traditional data warehouses address
  10747. this issue by carefully managing both storage and query optimization.
  10748. In contrast, our TrailDB-on-AWS architecture unbundles storage from
  10749. querying for ease of scalability and operation, similar to many other
  10750. big data systems.&lt;/p&gt;
  10751.  
  10752. &lt;p&gt;However, we can still leverage a traditional database technique to solve
  10753. the problem, namely &lt;a href=&quot;http://stackoverflow.com/questions/4401910/mysql-what-is-a-page&quot;&gt;page management&lt;/a&gt;.
  10754. Instead of downloading a full shard, we can fetch only the pages that are
  10755. required to execute the query.&lt;/p&gt;
  10756.  
  10757. &lt;h2 id=&quot;user-space-page-fault-handling&quot;&gt;User-Space Page Fault Handling&lt;/h2&gt;
  10758.  
  10759. &lt;p&gt;Cache and page managers are &lt;a href=&quot;https://madusudanan.com/blog/understanding-postgres-caching-in-depth&quot;&gt;very non-trivial subsystems of modern
  10760. databases&lt;/a&gt;.
  10761. The topic is an inherently complex one but it is
  10762. further complicated by &lt;a href=&quot;https://lwn.net/Articles/580542/&quot;&gt;the complex interplay between the
  10763. kernel and the database&lt;/a&gt;.
  10764. After all, the kernel has &lt;a href=&quot;https://www.kernel.org/doc/gorman/pdf/understand.pdf&quot;&gt;a very non-trivial subsystem of its
  10765. own&lt;/a&gt; for page
  10766. management that doesn’t always work in perfect harmony
  10767. with its database counterpart.&lt;/p&gt;
  10768.  
  10769. &lt;p&gt;It is a valid question whether the database even needs a virtual memory
  10770. system of its own, instead of just letting &lt;a href=&quot;http://varnish-cache.org/docs/trunk/phk/notes.html&quot;&gt;the kernel take care
  10771. of the job&lt;/a&gt;.
  10772. In practice, it is not easy to make the kernel-managed
  10773. paging perform as well as a custom subsystem given how
  10774. much &lt;a href=&quot;http://oldblog.antirez.com/post/what-is-wrong-with-2006-programming.html&quot;&gt;the database knows about the data and its access
  10775. patterns&lt;/a&gt;
  10776. in contrast to the kernel.&lt;/p&gt;
  10777.  
  10778. &lt;p&gt;Enter a very promising new feature in the Linux kernel, &lt;a href=&quot;https://lwn.net/Articles/636226/&quot;&gt;user-space page
  10779. fault handling&lt;/a&gt;. This feature allows
  10780. a user-space application, such as the TrailDB library, to handle page
  10781. faults, that is, a certain piece of data being missing from the process’
  10782. address space, as it best sees fit. In particular, this feature allows
  10783. us to fetch page-size (4KB) chunks of data from S3 with practically no
  10784. impact to the existing codebase.&lt;/p&gt;
  10785.  
  10786. &lt;p&gt;While user-space page fault handling might not be flexible enough
  10787. yet to replace Postgres’ venerable page management subsystem, it
  10788. allowed us to implement this key piece of functionality in TrailDB
  10789. in less than 2000 lines of code while working harmoniously
  10790. with the kernel. As demonstrated by this case, it is extremely
  10791. powerful to be able to embed &lt;a href=&quot;https://en.wikipedia.org/wiki/Separation_of_mechanism_and_policy&quot;&gt;application-specific policies in kernel
  10792. mechanisms&lt;/a&gt;
  10793. that take care of heavy lifting such as page fault handling, and, in other
  10794. cases, &lt;a href=&quot;https://github.com/libfuse/libfuse&quot;&gt;file systems&lt;/a&gt;,
  10795. and &lt;a href=&quot;http://lwn.net/Articles/603983&quot;&gt;packet processing&lt;/a&gt;.&lt;/p&gt;
  10796.  
  10797. &lt;p&gt;The API for user-space page fault handling is surprisingly
  10798. straightforward. If you are curious, you can try
  10799. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;userfaultfd()&lt;/code&gt; yourself by following this &lt;a href=&quot;http://noahdesu.github.io/2016/10/10/userfaultfd-hello-world.html&quot;&gt;hello world
  10800. tutorial&lt;/a&gt;.&lt;/p&gt;
  10801.  
  10802. &lt;h2 id=&quot;traildb-external-data-architecture&quot;&gt;TrailDB External Data Architecture&lt;/h2&gt;
  10803.  
  10804. &lt;p&gt;The solution implemented in TrailDB is fully generic and allows data to
  10805. be fetched from any source, not just Amazon S3. The block servers that
  10806. handle interfacing with the data sources are pluggable and implemented
  10807. outside the TrailDB library, so they can be conveniently implemented in
  10808. any programming language.&lt;/p&gt;
  10809.  
  10810. &lt;p&gt;Interfacing with S3 is implemented by a block server written in Go,
  10811. &lt;a href=&quot;https://github.com/tuulos/traildb-s3-server&quot;&gt;traildb-s3-server&lt;/a&gt;. In the
  10812. diagram below, this is the “server process.” The TrailDB application,
  10813. “client process,” communicates with the server using simple requests sent
  10814. over TCP.&lt;/p&gt;
  10815.  
  10816. &lt;p&gt;The purple box is an unmodified application using the TrailDB library.
  10817. The blue boxes are responsible for paging and caching. The yellow box is
  10818. an external data source, in our case Amazon S3.&lt;/p&gt;
  10819.  
  10820. &lt;p&gt;Let’s walk through a page fault situation, that is, what happens when TrailDB
  10821. tries to access a piece of external data like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;s3://traildb.io/data/wikipedia-small.tdb&lt;/code&gt;
  10822. that is managed by the user-space page fault handler:&lt;/p&gt;
  10823.  
  10824. &lt;p&gt;&lt;img src=&quot;/images/post_images/s3mmap-arch.png&quot; style=&quot;display: block; width: 100%; margin-left: auto; margin-right: auto&quot; /&gt;&lt;/p&gt;
  10825.  
  10826. &lt;h4 id=&quot;1-page-fault-generated&quot;&gt;1. Page fault generated&lt;/h4&gt;
  10827.  
  10828. &lt;p&gt;If &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tdb_open()&lt;/code&gt; is called with a non-local URL, e.g. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;s3://&lt;/code&gt;, memory
  10829. regions that are normally memory-mapped with a file are mapped as
  10830. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MAP_ANONYMOUS&lt;/code&gt; and the region is registered to the user-space page
  10831. fault handler.&lt;/p&gt;
  10832.  
  10833. &lt;p&gt;When the region is accessed by TrailDB as usual, a page fault is
  10834. generated which blocks the app. Instead of the page fault being handled
  10835. by the kernel, it sends a message to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;userfaultfd&lt;/code&gt; file descriptor
  10836. indicating what address was requested.&lt;/p&gt;
  10837.  
  10838. &lt;p&gt;&lt;strong&gt;source&lt;/strong&gt; &lt;a href=&quot;https://github.com/tuulos/traildb/blob/userfault/src/tdb_external.c&quot;&gt;tdb_external.c&lt;/a&gt;&lt;/p&gt;
  10839.  
  10840. &lt;h4 id=&quot;2-user-space-page-fault-handler-triggered&quot;&gt;2. User-space page fault handler triggered&lt;/h4&gt;
  10841.  
  10842. &lt;p&gt;When &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tdb_open()&lt;/code&gt; was called with a non-local URL, a separate thread
  10843. was also launched to handle page faults. The thread uses &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;poll()&lt;/code&gt; to wait for
  10844. incoming messages in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;userfaultfd&lt;/code&gt; file descriptor.&lt;/p&gt;
  10845.  
  10846. &lt;p&gt;When a message is received, the virtual address requested is translated
  10847. to an offset in the TrailDB file. Next, the handler checks if the offset
  10848. corresponds to the latest block received from the server. If it does,
  10849. we already have the required data which can be copied back to the app.&lt;/p&gt;
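
&lt;p&gt;The address translation and the latest-block check can be illustrated with a short Python sketch. It is not taken from the TrailDB sources: it assumes that the registered region starts at file offset 0 and is page-aligned, and the function and parameter names are hypothetical.&lt;/p&gt;

```python
def fault_offset(fault_addr, region_start, page_size=4096):
    # Round the faulting address down to a page boundary, then make it
    # relative to the start of the registered memory region.
    page_addr = (fault_addr // page_size) * page_size
    return page_addr - region_start


def in_latest_block(offset, block_start, block_len, page_size=4096):
    # True when the whole page lies inside the most recently
    # received block, so no new request to the server is needed.
    last = offset + page_size - 1
    block = range(block_start, block_start + block_len)
    return offset in block and last in block
```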
  10850.  
  10851. &lt;p&gt;&lt;strong&gt;source&lt;/strong&gt; &lt;a href=&quot;https://github.com/tuulos/traildb/blob/userfault/src/tdb_external_pagefault.c&quot;&gt;tdb_external_pagefault.c&lt;/a&gt;&lt;/p&gt;
  10852.  
  10853. &lt;p&gt;If the offset doesn’t resolve to the latest block, we must request a
  10854. new block from the block server. A simple message is prepared which
  10855. contains the URL of the TrailDB, an offset, and the minimum number of
  10856. bytes requested, typically 4KB. This message is then sent over a TCP
  10857. connection to the server.&lt;/p&gt;
  10858.  
  10859. &lt;p&gt;&lt;strong&gt;source&lt;/strong&gt; &lt;a href=&quot;https://github.com/tuulos/traildb/blob/userfault/src/tdb_external_comm.c&quot;&gt;tdb_external_comm.c&lt;/a&gt;&lt;/p&gt;
  10860.  
  10861. &lt;h4 id=&quot;3-block-server-handles-the-request&quot;&gt;3. Block server handles the request&lt;/h4&gt;
  10862.  
  10863. &lt;p&gt;The block server is expected to return bytes at the requested offset.&lt;/p&gt;
  10864.  
  10865. &lt;p&gt;For efficiency reasons, our
  10866. &lt;a href=&quot;https://github.com/tuulos/traildb-s3-server&quot;&gt;traildb-s3-server&lt;/a&gt; caches
  10867. blocks received from S3 locally. When a request is received, first we
  10868. need to check if the block exists in the local cache, possibly as a
  10869. sub-range of a previously cached larger block.&lt;/p&gt;
  10870.  
  10871. &lt;p&gt;&lt;strong&gt;source&lt;/strong&gt; &lt;a href=&quot;https://github.com/tuulos/traildb-s3-server/blob/master/blockcache.go&quot;&gt;blockcache.go&lt;/a&gt;.&lt;/p&gt;
  10872.  
  10873. &lt;p&gt;If the block is found, the server sends a response that includes a
  10874. local path and an offset where the client can find the requested block.
  10875. Currently we assume that both the client and the server share a common
  10876. filesystem.&lt;/p&gt;
  10877.  
  10878. &lt;p&gt;&lt;strong&gt;source&lt;/strong&gt; &lt;a href=&quot;https://github.com/tuulos/traildb-s3-server/blob/master/s3tdb.go&quot;&gt;s3tdb.go&lt;/a&gt;.&lt;/p&gt;
  10879.  
  10880. &lt;h4 id=&quot;4-block-server-fetches-a-block-from-s3&quot;&gt;4. Block server fetches a block from S3&lt;/h4&gt;
  10881.  
  10882. &lt;p&gt;If the block is not found, we must fetch it from its original source,
  10883. in this case from Amazon S3. This is accomplished using the standard
  10884. &lt;a href=&quot;https://aws.amazon.com/sdk-for-go/&quot;&gt;AWS SDK for Go&lt;/a&gt;, which, besides
  10885. downloading data, handles authentication transparently, in our case
  10886. using an instance-specific IAM role.&lt;/p&gt;
  10887.  
  10888. &lt;p&gt;S3 supports standard HTTP range requests, which allows us to request
  10889. an exact range of bytes from S3. Since there is quite a high constant
  10890. overhead and a monetary price associated with each GET request in S3, it
  10891. makes sense to request a larger &lt;strong&gt;block&lt;/strong&gt; instead of the single 4KB
  10892. &lt;strong&gt;page&lt;/strong&gt; that was originally requested by the client. You can read more
  10893. about the effect of the block size below.&lt;/p&gt;
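
&lt;p&gt;As an illustration, the Python sketch below maps a requested page offset to the value of an HTTP Range header that covers a whole block. It assumes that blocks are aligned to multiples of the block size, which the post does not state, and the function name is hypothetical. HTTP byte ranges are inclusive at both ends.&lt;/p&gt;

```python
def block_range_header(offset, block_size):
    # Align the requested offset down to the start of its block
    # (assumed alignment: multiples of block_size).
    start = (offset // block_size) * block_size
    # Byte ranges are inclusive, so the last byte is start + size - 1.
    end = start + block_size - 1
    return "bytes={}-{}".format(start, end)
```

&lt;p&gt;With a 16KB block size, a page fault at offset 5000 would translate to a request for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bytes=0-16383&lt;/code&gt; under these assumptions, so neighboring pages in the same block need no further GET requests.&lt;/p&gt;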
  10894.  
  10895. &lt;p&gt;After a successful download, the block is added to the cache, and a response
  10896. is sent to the client as in Step 3.&lt;/p&gt;
  10897.  
  10898. &lt;p&gt;&lt;strong&gt;source&lt;/strong&gt; &lt;a href=&quot;https://github.com/tuulos/traildb-s3-server/blob/master/s3tdb.go&quot;&gt;s3tdb.go&lt;/a&gt;.&lt;/p&gt;
  10899.  
  10900. &lt;h4 id=&quot;5-the-requested-block-is-memory-mapped-by-the-client&quot;&gt;5. The requested block is memory-mapped by the client&lt;/h4&gt;
  10901.  
  10902. &lt;p&gt;The page fault handler receives a path and an offset to the requested
  10903. block from the server. The block is memory mapped by the handler.&lt;/p&gt;
  10904.  
  10905. &lt;p&gt;&lt;strong&gt;source&lt;/strong&gt; &lt;a href=&quot;https://github.com/tuulos/traildb/blob/userfault/src/tdb_external_pagefault.c&quot;&gt;tdb_external_pagefault.c&lt;/a&gt;&lt;/p&gt;
  10906.  
  10907. &lt;h4 id=&quot;6-the-requested-page-is-delivered&quot;&gt;6. The requested page is delivered&lt;/h4&gt;
  10908.  
  10909. &lt;p&gt;The page fault handler thread uses &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;UFFDIO_COPY&lt;/code&gt; to copy the requested
  10910. page back to the app’s address space atomically. Once this call
  10911. finishes, the app thread unblocks and it proceeds to process data as
  10912. usual.&lt;/p&gt;
  10913.  
  10914. &lt;p&gt;&lt;strong&gt;source&lt;/strong&gt; &lt;a href=&quot;https://github.com/tuulos/traildb/blob/userfault/src/tdb_external_pagefault.c&quot;&gt;tdb_external_pagefault.c&lt;/a&gt;&lt;/p&gt;
  10915.  
  10916. &lt;h2 id=&quot;performance&quot;&gt;Performance&lt;/h2&gt;
  10917.  
  10918. &lt;p&gt;As described above, handling a single page fault may require up to six
  10919. steps which involve a separate thread, a TCP connection, a server,
  10920. and an HTTPS request to S3. Is this going to be too slow for realistic
  10921. workloads?&lt;/p&gt;
  10922.  
  10923. &lt;p&gt;Below we have benchmarked the S3-direct solution versus the baseline approach
  10924. that involves downloading the full TrailDB file first to a local disk. The
  10925. test data is &lt;a href=&quot;http://traildb.io/data/wikipedia-history-small.tdb&quot;&gt;a 100MB snapshot of Wikipedia edit history&lt;/a&gt;, as seen in the &lt;a href=&quot;http://traildb.io/docs/tutorial/&quot;&gt;TrailDB tutorial&lt;/a&gt;.
  10926. This TrailDB contains 6.3M trails. The benchmarks were performed on an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;r3.8xlarge&lt;/code&gt;
  10927. EC2 instance in the same region where the data is located in S3.&lt;/p&gt;
  10928.  
  10929. &lt;h4 id=&quot;sequential-access&quot;&gt;Sequential Access&lt;/h4&gt;
  10930.  
  10931. &lt;p&gt;First, consider a simple case where you want to access the first X trails
  10932. in the TrailDB. In the chart below the green line shows the constant
  10933. cost of the baseline: it takes about 1.6 seconds to download the full
  10934. TrailDB. After downloading it, there is practically no cost for accessing
  10935. the first 0-2000 trails.&lt;/p&gt;
  10936.  
  10937. &lt;p&gt;&lt;img src=&quot;/images/post_images/s3mmap-sequential.png&quot; style=&quot;display: block; width: 100%; margin-left: auto; margin-right: auto&quot; /&gt;&lt;/p&gt;
  10938.  
  10939. &lt;p&gt;The orange line shows the cost for performing the same operation using
  10940. S3 directly with a 16MB block size. Accessing the first 0-2000 trails
  10941. involves downloading a single 16MB block, which takes about 500ms.&lt;/p&gt;
  10942.  
  10943. &lt;p&gt;The blue line shows the cost for using S3 directly with a 16KB block
  10944. size. The small block size yields the lowest latency to the first
  10945. results: You can get results for the first 0-10 trails in about 100ms,
  10946. which is a 16x speedup compared to the baseline! However, the more
  10947. trails are processed the more small blocks need to be fetched. The
  10948. overhead adds up so that at about 200 trails it is cheaper to use
  10949. the baseline approach instead.&lt;/p&gt;
  10950.  
  10951. &lt;h4 id=&quot;random-access&quot;&gt;Random Access&lt;/h4&gt;
  10952.  
  10953. &lt;p&gt;It is expected that sequential access favors large block sizes. However,
  10954. the original needle in a haystack question is more about random access.&lt;/p&gt;
  10955.  
  10956. &lt;p&gt;Like above, the baseline approach first downloads the full TrailDB and
  10957. then accesses X trails randomly. Again, the cost to download dominates the
  10958. total cost since a small 100MB TrailDB fits in memory easily.&lt;/p&gt;
  10959.  
  10960. &lt;p&gt;&lt;img src=&quot;/images/post_images/s3mmap-random.png&quot; style=&quot;display: block; width: 100%; margin-left: auto; margin-right: auto&quot; /&gt;&lt;/p&gt;
  10961.  
  10962. &lt;p&gt;The break-even point is about 20-30 trails. If you need to access
  10963. fewer trails than this, it is faster to use S3 directly. Note that the
  10964. break-even point heavily depends on the size of the TrailDB. With many
  10965. of our production TrailDBs that can exceed a terabyte in size, the
  10966. break-even point is much higher.&lt;/p&gt;
  10967.  
  10968. &lt;p&gt;Another thing to note above is how close the orange 16MB line is to the
  10969. baseline. The overhead of S3 direct is about 10-15%.&lt;/p&gt;
  10970.  
  10971. &lt;h4 id=&quot;overhead-of-user-space-page-fault-handling&quot;&gt;Overhead of User-Space Page Fault Handling&lt;/h4&gt;
  10972.  
  10973. &lt;p&gt;If we extended the X axis above to cover all the 6.3M trails, all blocks
  10974. of the test TrailDB would end up being cached locally. After this, there
  10975. is no need to download anything from S3 and the difference between the
  10976. S3 direct approach and the baseline should equal just the overhead of
  10977. user-space page fault handling.&lt;/p&gt;
  10978.  
  10979. &lt;p&gt;As mentioned in Step 2 above, if the latest block received contains
  10980. the data we need, we can skip the following Steps 2-5 which involve
  10981. a context switch to another process and sending a message over TCP,
  10982. amongst other things. Hence, the fewer blocks we need to request, the
  10983. less overhead there is.&lt;/p&gt;
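
&lt;p&gt;To make the effect of block size concrete, here is a toy model in Python. It is not the TrailDB implementation: it assumes a 4KB page size, one page fault per page during a sequential scan, and that only the latest block received is checked before a request goes to the block server. The function name is hypothetical.&lt;/p&gt;

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def count_block_requests(db_bytes, block_bytes):
    """Count block-server round trips for a sequential scan, one fault per page."""
    latest_block = None
    requests = 0
    for offset in range(0, db_bytes, PAGE_SIZE):
        block = offset // block_bytes
        if block != latest_block:
            # The faulting page is not in the latest block: full round trip.
            requests += 1
            latest_block = block
        # Otherwise the latest block already holds the page and the
        # remaining steps can be skipped.
    return requests


if __name__ == "__main__":
    db_bytes = 100 * 2**20  # the 100MB test TrailDB, taken as 100 * 2**20 bytes
    for block_bytes in (4 * 2**10, 16 * 2**10, 16 * 2**20):
        print(block_bytes, count_block_requests(db_bytes, block_bytes))
```

&lt;p&gt;Under these assumptions a 16KB block size needs 6400 round trips for the full scan, while a 16MB block size needs only 7, which is why the per-request overhead matters most at small block sizes.&lt;/p&gt;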
  10984.  
  10985. &lt;p&gt;We evaluate this overhead of the fully cached case as a function of the
  10986. block size versus the blue baseline. The operation performed is a full scan over
  10987. all the trails:&lt;/p&gt;
  10988.  
  10989. &lt;p&gt;&lt;img src=&quot;/images/post_images/s3mmap-overhead.png&quot; style=&quot;display: block; width: 100%; margin-left: auto; margin-right: auto&quot; /&gt;&lt;/p&gt;
  10990.  
  10991. &lt;p&gt;With a small block size, 4-16KB, practically every page fault involves
  10992. all the steps, except the S3 download in this case. With a larger block
  10993. size, the overhead drops to the 10-15% that we saw above. Interestingly,
  10994. when the block size approaches the size of the full DB, S3 direct
  10995. becomes faster than the baseline. This is caused by the extra overhead
  10996. of having to launch a separate process to download the data in the
  10997. baseline case.&lt;/p&gt;
  10998.  
  10999. &lt;p&gt;Overall, user-space page fault handling has a surprisingly low overhead
  11000. compared to page fault handling in the kernel. The 10-15% overhead shown
  11001. above is caused by our particular implementation, not &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;userfaultfd()&lt;/code&gt;
  11002. &lt;em&gt;per se&lt;/em&gt;. This could be optimized further if needed.&lt;/p&gt;
  11003.  
  11004. &lt;h4 id=&quot;s3-bandwidth&quot;&gt;S3 Bandwidth&lt;/h4&gt;
  11005.  
  11006. &lt;p&gt;Of course, fetching data from S3 is by far the most expensive part
  11007. of the process. We want to evaluate if our way of downloading data from
  11008. S3 performs well.&lt;/p&gt;
  11009.  
  11010. &lt;p&gt;We leverage the standard &lt;a href=&quot;https://aws.amazon.com/sdk-for-go/&quot;&gt;AWS SDK for
  11011. Go&lt;/a&gt; for downloading,
  11012. which should be reasonably performant. It is widely known
  11013. that &lt;a href=&quot;http://blog.zachbjornson.com/2015/12/29/cloud-storage-performance.html&quot;&gt;aggregate
  11014. S3 bandwidth increases with more concurrent
  11015. downloaders&lt;/a&gt;. The current implementation treats each &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tdb&lt;/code&gt; handle as
  11016. a separate thread both on the client and the server side. Hence, you
  11017. should be able to increase bandwidth by either opening multiple &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tdb&lt;/code&gt;
  11018. handles to a single TrailDB and sharding it by row, or by handling
  11019. multiple TrailDBs in parallel.&lt;/p&gt;
  11020.  
  11021. &lt;p&gt;The benchmark supports this hypothesis:&lt;/p&gt;
  11022.  
  11023. &lt;p&gt;&lt;img src=&quot;/images/post_images/s3mmap-bandwidth.png&quot; style=&quot;display: block; width: 100%; margin-left: auto; margin-right: auto&quot; /&gt;&lt;/p&gt;
  11024.  
  11025. &lt;p&gt;You can nearly saturate the S3 bandwidth on a single instance with
  11026. about 20 parallel handles. Based on our previous findings, the maximum
  11027. bandwidth achievable between a single EC2 instance and S3 is about
  11028. 4Gbit/sec. In the above case, we use a single SSD as a backing store so
  11029. the 3.5Gbit/sec peak may be bottlenecked by the SATA interface. In other
  11030. words, using S3 directly can give you the same bandwidth as a local
  11031. SSD.&lt;/p&gt;
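
&lt;p&gt;The idea of sharding a single TrailDB by row across parallel handles can be sketched as follows. This is a minimal illustration in Python, assuming trails are addressed by a contiguous index. The names are hypothetical, and scan_shard is only a placeholder for a worker that would open its own handle; no TrailDB or AWS calls are made.&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor


def shard_ranges(num_trails, num_shards):
    """Split [0, num_trails) into contiguous, near-equal (start, end) ranges."""
    base, extra = divmod(num_trails, num_shards)
    # The first `extra` shards take one additional trail each.
    sizes = [base + 1] * extra + [base] * (num_shards - extra)
    ranges, start = [], 0
    for size in sizes:
        ranges.append((start, start + size))
        start += size
    return ranges


def scan_shard(bounds):
    """Placeholder worker: a real one would open its own handle and scan its rows."""
    start, end = bounds
    return end - start  # here we only count the trails in the shard


def parallel_scan(num_trails, num_handles):
    """Process every shard on its own thread and combine the per-shard results."""
    with ThreadPoolExecutor(max_workers=num_handles) as pool:
        return sum(pool.map(scan_shard, shard_ranges(num_trails, num_handles)))


if __name__ == "__main__":
    print(shard_ranges(10, 3))
    print(parallel_scan(6300000, 20))
```

&lt;p&gt;Each shard maps to one handle, so the number of shards is the knob that corresponds to the number of parallel handles in the benchmark.&lt;/p&gt;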
  11032.  
  11033. &lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
  11034.  
  11035. &lt;p&gt;The benchmarks show that using S3 directly with user-space page fault
  11036. handling is an excellent solution to our original problem with highly
  11037. selective queries. We were positively surprised that the overhead
  11038. of this solution is low even for the full sequential
  11039. scan case, which suggests that all workloads could benefit from this
  11040. solution.&lt;/p&gt;
  11041.  
  11042. &lt;p&gt;Finding a needle in a haystack is a perfect exemplar for using S3
  11043. directly. It turns out this approach has a number of other benefits as
  11044. well:&lt;/p&gt;
  11045.  
  11046. &lt;ul&gt;
  11047.  &lt;li&gt;Much lower time to first results, depending on the TrailDB size.
  11048. This is especially useful for ad hoc queries, which can be interrupted
  11049. if the first results are not satisfactory.&lt;/li&gt;
  11050.  &lt;li&gt;Makes it possible to process TrailDBs that are larger than local disk
  11051. space. This is a potential killer feature, since this allows you to
  11052. use instance types with little or no local disk space which in turn
  11053. means more opportunities for cost optimization.&lt;/li&gt;
  11054.  &lt;li&gt;The separate block server further clarifies the separation between
  11055. the storage and the query layer, opening up interesting new use cases.
  11056. The block server can be developed quickly outside of the core
  11057. TrailDB codebase.&lt;/li&gt;
  11058. &lt;/ul&gt;
  11059.  
  11060. &lt;p&gt;Future work includes dynamic block sizes, smarter prefetching and eviction
  11061. policies for the block server, and support for new data sources besides
  11062. Amazon S3. Contributions are welcome!&lt;/p&gt;
  11063.  
  11064. &lt;h4 id=&quot;give-it-a-try&quot;&gt;Give it a try!&lt;/h4&gt;
  11065.  
  11066. &lt;p&gt;Although this work has not been merged into the TrailDB master branch as of the
  11067. writing of this blog article, you should be able to give it a try easily:&lt;/p&gt;
  11068.  
  11069. &lt;ol&gt;
  11070.  &lt;li&gt;
  11071.    &lt;p&gt;Make sure that your kernel supports &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;userfaultfd&lt;/code&gt;. You need Linux kernel
  11072. 4.3 or newer with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;USERFAULTFD&lt;/code&gt; enabled. With many distributions, you can
  11073. check this by running &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;grep USERFAULTFD /boot/config*&lt;/code&gt;
  11074. which should return &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CONFIG_USERFAULTFD=y&lt;/code&gt; if the feature is enabled in
  11075. your kernel. For instance, &lt;a href=&quot;https://cloud-images.ubuntu.com/locator/ec2/&quot;&gt;Ubuntu Yakkety Yak (16.10)&lt;/a&gt;
  11076. works out of the box.&lt;/p&gt;
  11077.  &lt;/li&gt;
  11078.  &lt;li&gt;
  11079.    &lt;p&gt;Git clone the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;userfault&lt;/code&gt; branch of TrailDB:
  11080. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;git clone -b userfault https://github.com/tuulos/traildb&lt;/code&gt;
  11081. and follow the &lt;a href=&quot;http://traildb.io/docs/getting_started/#install-on-linux&quot;&gt;installation instructions&lt;/a&gt;.&lt;/p&gt;
  11082.  &lt;/li&gt;
  11083.  &lt;li&gt;
  11084.    &lt;p&gt;Git clone the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;traildb-s3-server&lt;/code&gt; repository:
  11085. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;git clone https://github.com/tuulos/traildb-s3-server&lt;/code&gt;
  11086. and run &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;go build&lt;/code&gt; as usual.&lt;/p&gt;
  11087.  &lt;/li&gt;
  11088. &lt;/ol&gt;
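
&lt;p&gt;The kernel check in the first step can also be scripted. The sketch below, in Python, assumes that your distribution ships its kernel configuration as plain text files matching /boot/config*, as in the grep command above; the function names are hypothetical.&lt;/p&gt;

```python
import glob


def userfaultfd_enabled(config_text):
    """Return True if the kernel config text contains CONFIG_USERFAULTFD=y."""
    for line in config_text.splitlines():
        if line.strip() == "CONFIG_USERFAULTFD=y":
            return True
    return False


def check_boot_configs(pattern="/boot/config*"):
    """Map each matching kernel config file to whether userfaultfd is enabled."""
    results = {}
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8", errors="replace") as handle:
            results[path] = userfaultfd_enabled(handle.read())
    return results


if __name__ == "__main__":
    for path, enabled in check_boot_configs().items():
        print(path, "enabled" if enabled else "not enabled")
```

&lt;p&gt;If no config file is found, the script prints nothing, in which case you need to consult your distribution’s documentation instead.&lt;/p&gt;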
  11089.  
  11090. &lt;p&gt;After this, you can start &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;./traildb-s3-server&lt;/code&gt; and use e.g. the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tdb&lt;/code&gt; command line
  11091. tool to access data directly from S3:&lt;/p&gt;
  11092.  
  11093. &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;tdb dump -i s3://my-bucket/my.tdb
  11094. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  11095.  
  11096. &lt;p&gt;If you have any trouble with installation or you have any other questions or feedback,
  11097. please join us at &lt;a href=&quot;https://gitter.im/traildb/traildb&quot;&gt;our TrailDB Gitter channel&lt;/a&gt;.&lt;/p&gt;
  11098.  
  11099. </description>
  11100.    </item>
  11101.    
  11102.    
  11103.    
  11104.    <item>
  11105.      <title>How to Create a Style Guide: Start with a UI Framework</title>
  11106.      <link>https://tech.nextroll.com/blog/product/2016/07/29/how-to-create-a-style-guide.html</link>
  11107.      <pubDate>Fri, 29 Jul 2016 00:00:00 -0700</pubDate>
  11108.      <author></author>
  11109.      <guid isPermaLink="false">https://tech.nextroll.com/blog/product/2016/07/29/how-to-create-a-style-guide</guid>
  11110.      <description>&lt;p&gt;&lt;small class=&quot;muted&quot;&gt;&lt;em&gt;Originally published on Medium on &lt;a href=&quot;https://medium.com/adroll-design/how-to-create-a-style-guide-start-with-a-ui-framework-2d7ea456fa33&quot;&gt;July 13, 2016&lt;/a&gt;.&lt;/em&gt;&lt;/small&gt;&lt;/p&gt;
  11111.  
  11112. &lt;p&gt;&lt;small&gt;&lt;em&gt;Update May 16th, 2017: Our style guide is now published at &lt;a href=&quot;http://ux.adroll.com&quot;&gt;ux.adroll.com&lt;/a&gt;. You can read more about the latest updates &lt;a href=&quot;https://medium.com/adroll-design/the-journey-to-design-consistency-ee26bef2fd88&quot;&gt;on our UX blog&lt;/a&gt;.&lt;/em&gt;&lt;/small&gt;&lt;/p&gt;
  11113.  
  11114. &lt;h2 id=&quot;qa-with-adrolls-ux-designer-on-why-we-did-it-and-what-we-learned&quot;&gt;Q&amp;amp;A with AdRoll’s UX Designer on why we did it and what we learned.&lt;/h2&gt;
  11115.  
  11116. &lt;blockquote&gt;
  11117. Hi! I’m Arya Srinivasan, a UX Researcher at AdRoll. I sat down with Mason Lee, a UX Designer working on AdRoll’s native ads API product, to talk about his work developing AdRoll’s style guide.
  11118. &lt;/blockquote&gt;
  11119.  
  11120. &lt;h4 id=&quot;to-kick-things-off-why-did-you-start-the-style-guide-project-what-was-the-problem-that-needed-solving&quot;&gt;To kick things off, why did you start the style guide project? What was the problem that needed solving?&lt;/h4&gt;
  11121.  
  11122. &lt;p&gt;&lt;strong&gt;Mason&lt;/strong&gt;: The problem was design inconsistency, both across products and within a single product. For example, a button that should look the same everywhere but actually varies in color, font weight, and border style.&lt;/p&gt;
  11123.  
  11124. &lt;center&gt;
  11125. &lt;img alt=&quot;Messy Buttons&quot; src=&quot;/images/post_images/ux-blog-post-old-buttons.png&quot; /&gt;
  11126. &lt;small class=&quot;caption&quot;&gt;
  11127. Buttons look different in individual designers’ mocks and our applications.
  11128. &lt;/small&gt;
  11129. &lt;/center&gt;
  11130.  
  11131. &lt;p&gt;AdRoll’s rapid growth meant that we were focused on speed. We’re now a larger company with multiple products, so as a designer I believe it’s important for us to emphasize consistency in how we present our products: through the design.&lt;/p&gt;
  11132.  
  11133. &lt;p&gt;To focus on design, we first needed to fix existing inconsistencies. UX designers here typically focus on one or two products, so in order to get our team to think about the design across all products, I set up a weekly “UI Smackdown” meeting to discuss UI guidelines.&lt;/p&gt;
  11134.  
  11135. &lt;p&gt;In each meeting, we looked at design inconsistencies to decide on a single design. After a few meetings, designers still asked me about the correct color or padding, etc. We needed a central document with all of the answers, so I built our UI Framework in Sketch as a resource for designers. Whenever we realize there’s a missing component or want to include a new component, we discuss it and add it to the master UI Framework file.&lt;/p&gt;
  11136.  
  11137. &lt;h4 id=&quot;you-mentioned-you-want-adroll-to-be-a-design-centric-companywhat-do-you-mean-by-that&quot;&gt;You mentioned you want AdRoll to be a design-centric company. What do you mean by that?&lt;/h4&gt;
  11138.  
  11139. &lt;p&gt;&lt;strong&gt;Mason&lt;/strong&gt;: I want AdRoll to put design at the forefront so that it’s a competitive differentiator — recognized by customers as a well-designed product that really solves their needs.&lt;/p&gt;
  11140.  
  11141. &lt;h4 id=&quot;what-were-your-immediate-goals-for-the-style-guide-do-you-have-a-longer-term-vision-for-this-project&quot;&gt;What were your immediate goals for the style guide? Do you have a longer term vision for this project?&lt;/h4&gt;
  11142.  
  11143. &lt;p&gt;&lt;strong&gt;Mason&lt;/strong&gt;: My short-term goal is to achieve design consistency between designers by standardizing our UI components. I want designers to speak the same language when talking design. For example, for a modal or dropdown, engineers build what the designer specifies. If the design elements differ between designers, engineers are going to build the same element in different ways.&lt;/p&gt;
  11144.  
  11145. &lt;p&gt;My mid-term goal is to have this style defined in our code in “RollUp,” AdRoll’s internal UI component library. If we have a predefined style sheet, all our engineers need to do is copy it. Designers and engineers can speak the same language.&lt;/p&gt;
  11146.  
  11147. &lt;h4 id=&quot;did-you-run-into-any-problems-while-creating-the-style-guide-how-did-you-solve-them&quot;&gt;Did you run into any problems while creating the style guide? How did you solve them?&lt;/h4&gt;
  11148.  
  11149. &lt;p&gt;&lt;strong&gt;Mason&lt;/strong&gt;: One of the biggest hurdles was getting the buy-in from people across product teams. To get everyone involved, I set up a meeting with a clear list of agenda items to cover. I presented design inconsistencies, such as varying dropdown menus between products. Providing visual evidence triggers conversations, and in the end, people care about their product and want consistency.&lt;/p&gt;
  11150.  
  11151. &lt;p&gt;Another challenge was deciding on the rules. When talking about standardizing a component, it should be applicable anywhere, in every context. You have to think of every edge case. The component has to be flexible, but at the same time feature-complete enough so it’s easily usable, consumable, and applicable.&lt;/p&gt;
  11152.  
  11153. &lt;center&gt;
  11154. &lt;img alt=&quot;Geotargeting UI&quot; src=&quot;/images/post_images/geo-gif.gif&quot; /&gt;
  11155. &lt;small class=&quot;caption&quot;&gt;
  11156. Here’s an example of our style guide’s flexibility. Our initial decision for the padding in this geotargeting dropdown was too big, so we revised the style guide to account for this use case.
  11157. &lt;/small&gt;
  11158. &lt;/center&gt;
  11159.  
  11160. &lt;center&gt;
  11161. &lt;img alt=&quot;Geotargeting UI Before and After&quot; src=&quot;/images/post_images/geo-ui-before-after.png&quot; /&gt;
  11162. &lt;small class=&quot;caption&quot;&gt;
  11163. Before (left), After (right)
  11164. &lt;/small&gt;
  11165. &lt;/center&gt;
  11166.  
  11167. &lt;p&gt;I actually want to call out one more challenge! Naming can be difficult. As I said before, I want designers and engineers to speak the same language, but this needs to be done carefully. For something as simple as a dropdown, we actually have several variations (one with checkboxes, another with checkboxes and a text block, and another is a standard dropdown menu). How do we name three different dropdowns so there’s a universal understanding of which is which?&lt;/p&gt;
  11168.  
  11169. &lt;center&gt;
  11170. &lt;img alt=&quot;AdRoll Dropdown Variations&quot; src=&quot;/images/post_images/dropdowns.png&quot; /&gt;
  11171. &lt;/center&gt;
  11172.  
  11173. &lt;p&gt;The semantics are challenging; our naming needs to make sense. We used some cool collaboration tools to reach consensus when deciding on a name. For example, &lt;a href=&quot;https://wake.com/&quot;&gt;Wake&lt;/a&gt; helped us collect all open questions and issues, discuss solutions, monitor the UI Smackdown decisions, and continue the conversation with the larger product team through an integration with Slack.&lt;/p&gt;
  11174.  
  11175. &lt;center&gt;
  11176. &lt;img alt=&quot;Wake and Slack&quot; src=&quot;/images/post_images/wake-slack.png&quot; /&gt;
  11177. &lt;small class=&quot;caption&quot;&gt;
  11178. How we used Wake to discuss UI inconsistencies and collaborate on component rules.
  11179. &lt;/small&gt;
  11180. &lt;/center&gt;
  11181.  
  11182. &lt;h4 id=&quot;is-there-anything-unique-about-adrolls-ui-that-you-had-to-consider-when-creating-the-style-guide&quot;&gt;Is there anything unique about AdRoll’s UI that you had to consider when creating the style guide?&lt;/h4&gt;
  11183.  
  11184. &lt;p&gt;&lt;strong&gt;Mason&lt;/strong&gt;: Our dashboard is very data-heavy. In addition, the campaign creation flow gives advertisers a bunch of levers to pull. In order to meet the needs of less-experienced advertisers, we aim to have effective default settings. In our products, the components have complex functions but look simple and are easy to use.&lt;/p&gt;
  11185.  
  11186. &lt;h4 id=&quot;are-there-some-things-you-wish-you-knew-when-you-started-creating-the-style-guide&quot;&gt;Are there some things you wish you knew when you started creating the style guide?&lt;/h4&gt;
  11187.  
  11188. &lt;p&gt;&lt;strong&gt;Mason&lt;/strong&gt;: I wish I had a deeper understanding of how all our products work from the start. For instance, we share how our respective products work in our weekly design critique meeting, so I know how SendRoll (our email retargeting product) works on the surface, but I don’t have the deep knowledge of SendRoll that its designer does. I think that nuanced understanding of the product definitely helps when working on the style guide, because I then have a better understanding of all of the potential use cases.&lt;/p&gt;
  11189.  
  11190. &lt;h4 id=&quot;so-whats-the-best-way-to-achieve-that-common-understanding-of-a-designers-process-and-their-product&quot;&gt;So what’s the best way to achieve that common understanding of a designer’s process and their product?&lt;/h4&gt;
  11191.  
  11192. &lt;p&gt;&lt;strong&gt;Mason&lt;/strong&gt;: Even though we’re really focused on shipping our products, we do a good job of sharing our design process in our weekly design critique meeting. I think we can be better about closing the loop after each meeting — how did the designer incorporate the feedback from the meeting? Once the product is shipped and used by our advertisers, we could also share how advertisers are using the products based on the insights from analytics.&lt;/p&gt;
  11193.  
  11194. &lt;h4 id=&quot;were-there-any-resources-you-referred-to-while-working-on-this-project&quot;&gt;Were there any resources you referred to while working on this project?&lt;/h4&gt;
  11195.  
  11196. &lt;p&gt;&lt;strong&gt;Mason&lt;/strong&gt;: I read Atomic Design by Brad Frost, researched online, and talked to other UX designers at MeetUps. If you think that a particular company practices good design, then it’s pretty likely they have talked about their style guide somewhere online.&lt;/p&gt;
  11197.  
  11198. &lt;h4 id=&quot;whats-the-status-of-our-style-guide&quot;&gt;What’s the status of our style guide?&lt;/h4&gt;
  11199.  
  11200. &lt;p&gt;&lt;strong&gt;Mason&lt;/strong&gt;: I’ve captured and revisited all the UI elements that we use in our different products and grouped them into foundations, components, patterns, and pages, which will serve as a source of truth for our UI designs.&lt;/p&gt;
  11201.  
  11202. &lt;p&gt;You can check out the foundation and component UI elements on &lt;a href=&quot;https://dribbble.com/shots/2833155-AdRoll-UI-Framework&quot;&gt;Dribbble&lt;/a&gt;. If you’re familiar with Atomic Design, I grouped the “atom” and “molecule” levels into what I call “components.” For example, combining the form title and the input makes it easy for other designers to copy a completed form field.&lt;/p&gt;
  11203.  
  11204. &lt;h3 id=&quot;thanks-for-reading&quot;&gt;Thanks for reading!&lt;/h3&gt;
  11205. &lt;p&gt;Look out for these upcoming topics as we develop our style guide:&lt;/p&gt;
  11206.  
  11207. &lt;ul&gt;
  11208.  &lt;li&gt;How a UI Framework simplifies collaboration&lt;/li&gt;
  11209.  &lt;li&gt;Developing new stylesheets based on the UI Framework&lt;/li&gt;
  11210.  &lt;li&gt;How to build a Style Guide website&lt;/li&gt;
  11211.  &lt;li&gt;The journey to finding our Voice and Tone&lt;/li&gt;
  11212. &lt;/ul&gt;
  11213. </description>
  11214.    </item>
  11215.    
  11216.    
  11217.    
  11218.    <item>
  11219.      <title>Netflix and The Art of Product Evolution</title>
  11220.      <link>https://tech.nextroll.com/blog/product/2016/07/05/netflix-art-product-evolution.html</link>
  11221.      <pubDate>Tue, 05 Jul 2016 00:00:00 -0700</pubDate>
  11222.      <author></author>
  11223.      <guid isPermaLink="false">https://tech.nextroll.com/blog/product/2016/07/05/netflix-art-product-evolution</guid>
  11224.      <description>&lt;p&gt;There is a &lt;a href=&quot;http://www.nytimes.com/2016/06/19/magazine/can-netflix-survive-in-the-new-world-it-created.html?_r=0&quot;&gt;terrific article&lt;/a&gt; in last week’s Sunday &lt;em&gt;Times&lt;/em&gt; outlining Netflix’s role in changing the way we consume media. The central question of the piece is whether Netflix will dominate the world it has created. Nimble competitors like Amazon, Hulu, and the major studios (once threatened by streaming) are each leveraging unique assets to disrupt legacy media models.&lt;/p&gt;
  11225.  
  11226. &lt;p&gt;From the article, we at AdRoll gleaned a broader framework for how products can evolve gracefully. Much as Netflix evolved from sending DVDs in the mail to streaming syndicated shows to producing original content, AdRoll is poised to transition from retargeting ads to bigger marketing solutions. With that leap comes a fair amount of risk. What next-generation products should we choose? How do we keep our most passionate customers engaged? How much do we continue to invest in the core? Here are five learnings from the Netflix story:&lt;/p&gt;
  11227.  
  11228. &lt;ol&gt;
  11229.  &lt;li&gt;
  11230.    &lt;p&gt;&lt;strong&gt;Innovate incrementally&lt;/strong&gt;. The Netflix of 1998—sending DVDs in the mail—was not in a position to start negotiating content deals. From a technical standpoint, Netflix lacked the major distribution networks. From a branding standpoint, Netflix was still analogous to Blockbuster. Streaming was the intermediate step that allowed Netflix to transform their brand and build the requisite technology infrastructure to take on the studios.&lt;/p&gt;
  11231.  
  11232.    &lt;p&gt;In a similar manner, AdRoll is a trusted name in retargeting. Building on our customer needs and their data assets led us to &lt;a href=&quot;https://www.adroll.com/product/prospecting&quot;&gt;Prospecting&lt;/a&gt; first and &lt;a href=&quot;https://www.adroll.com/product/sendroll&quot;&gt;SendRoll&lt;/a&gt; second. Billboards or TV commercials shouldn’t be next; that’s too far a leap. But web page optimization, CRM, mobile app retargeting, and marketing vendor management are not so far-fetched. With products that align to these areas, AdRoll can leverage our strengths while building new assets, such as a &lt;a href=&quot;https://blog.adroll.com/product/adrolls-cross-device-functionality-unifies-fragmented-customer-journeys&quot;&gt;cross-device graph&lt;/a&gt;. We can evolve our brand beyond display advertising. Most important, we can take our customers along with us slowly into areas that make intuitive sense for them.&lt;/p&gt;
  11233.  &lt;/li&gt;
  11234.  &lt;li&gt;
  11235.    &lt;p&gt;&lt;strong&gt;Understand your customers&lt;/strong&gt;. Investing $100M in &lt;em&gt;House of Cards&lt;/em&gt; was a data-based decision for Netflix. The show aligned nicely to the interests of Netflix viewers—political dramas, Kevin Spacey, and David Fincher’s directorial work. It was a calculated leap.&lt;/p&gt;
  11236.  
  11237.    &lt;p&gt;Analogously, AdRoll customers are keenly interested in attribution and ROI, optimizing their web pages, and managing their marketing vendors better. Our product portfolio iteration should adhere to these themes. We don’t need to build advanced functionality to gauge customer feedback. An MVP offering is enough to learn from our users what resonates most.&lt;/p&gt;
  11238.  &lt;/li&gt;
  11239.  &lt;li&gt;
  11240.    &lt;p&gt;&lt;strong&gt;Reallocate resources asymmetrically&lt;/strong&gt;. Here’s the tricky thing about those red envelopes in the mail (yes, Netflix still offers them): the once-core DVD business for Netflix is shedding subscribers quickly. But from a contribution margin perspective (contribution to profit), DVD rentals are a cash cow for Netflix. DVDs help them fund the streaming business where profit margins are tiny, eroded by content bidding and large upfront fees.&lt;/p&gt;
  11241.  
  11242.    &lt;p&gt;Our situation at AdRoll may be the reverse. Retargeting, prospecting, and email represent huge markets in which we still have plenty of room to grow. Six million customers advertise on Google Search; we only speak to 25,000 of those customers today. Hence, our resources will continue to focus on growing these product lines. AdRoll should continue to treat innovations as mini-startups within the company, arming them with more materials as their proofs of concept become stronger.&lt;/p&gt;
  11243.  &lt;/li&gt;
  11244.  &lt;li&gt;
  11245.    &lt;p&gt;&lt;strong&gt;Hiccups are OK if you confront them quickly&lt;/strong&gt;. When Netflix divorced their streaming and DVD services in 2011—making the two products more costly than before—there was an uproar. The stock dropped 45% in three weeks, so the CEO apologized and reversed course. Good will bounced back, and the stock is up 134% this year.&lt;/p&gt;
  11246.  
  11247.    &lt;p&gt;As AdRoll’s product suite becomes more complex, we will confront very real questions about how to differentiate and price the component offerings. Experiments may help us here to gauge market reaction before wider deployment. Intuitively, we should probably price and package our offering so that customers enjoy economies of scale. (The more products they buy with us, the cheaper each component becomes—basically the opposite of what Netflix did.) But we may not get everything right here initially, and we think that’s OK. Customer passion is a sign that they care.&lt;/p&gt;
  11248.  &lt;/li&gt;
  11249.  &lt;li&gt;
  11250.    &lt;p&gt;&lt;strong&gt;People trump everything else&lt;/strong&gt;. By now the Netflix &lt;a href=&quot;http://www.slideshare.net/reed2001/culture-1798664/61-Two_Types_of_Necessary_Rules1&quot;&gt;culture deck&lt;/a&gt; has become famous. The company screens actively against jerks, pays high for “fully-formed adults,” and demands high performance.&lt;/p&gt;
  11251.  
  11252.    &lt;p&gt;The good news is culture is paramount at AdRoll too. Like Netflix, we look for autonomous, responsible folks and throw tremendous responsibility their way. We believe the next SendRoll will start as the seed of an idea from a Roller’s customer conversation or market research.  That seed germinates in her brain and becomes an experiment. Later, her data look promising so she rallies a team, a sprint, a launch, an iteration.  That only happens when our Roller is so passionate that she drives the idea to fruition and relentlessly improves it. No lazy bones here. People are AdRoll’s biggest product superpower.&lt;/p&gt;
  11253.  &lt;/li&gt;
  11254. &lt;/ol&gt;
  11255. </description>
  11256.    </item>
  11257.    
  11258.    
  11259.    
  11260.    <item>
  11261.      <title>AdRoll and Diversity in Tech: A Q&amp;A with Truc Nguyen and Jessica Grist</title>
  11262.      <link>https://tech.nextroll.com/blog/culture/2016/06/20/code-screening-questions.html</link>
  11263.      <pubDate>Mon, 20 Jun 2016 00:00:00 -0700</pubDate>
  11264.      <author></author>
  11265.      <guid isPermaLink="false">https://tech.nextroll.com/blog/culture/2016/06/20/code-screening-questions</guid>
  11266.      <description>&lt;h2 id=&quot;introduction&quot;&gt;Introduction&lt;/h2&gt;
  11267.  
  11268. &lt;p&gt;&lt;em&gt;&lt;a href=&quot;http://www.codedocumentary.com&quot;&gt;CODE: Debugging the Gender Gap&lt;/a&gt;&lt;/em&gt; is a documentary that exposes the dearth of American female and minority software engineers and explores the reasons for these gaps. &lt;em&gt;CODE&lt;/em&gt; raises the question: “What would society gain from having more women and minorities code?”&lt;/p&gt;
  11269.  
  11270. &lt;p&gt;At AdRoll, we see the value and gains from having more women and minorities coding, which is why we screened this documentary at our headquarters as part of our wider Diversity and Inclusion initiatives. We followed the screening with a panel of female engineering managers—as well as some of our very own Rollers—to discuss not only how the gender gap came to be, but what we can do as individuals and as a collective to debug it.&lt;/p&gt;
  11271.  
  11272. &lt;p&gt;To reproduce some of what we discussed, we conducted a Q&amp;amp;A with &lt;a href=&quot;https://twitter.com/truc_&quot;&gt;Truc Nguyen&lt;/a&gt; and &lt;a href=&quot;https://twitter.com/thehackstress&quot;&gt;Jessica Grist&lt;/a&gt;, two of our Rollers who were panel speakers. Truc is a UX Designer on the SendRoll team; Jessica is a Software Engineer on the Retargeting team.&lt;/p&gt;
  11273.  
  11274. &lt;p&gt;Following the Q&amp;amp;A, you can read a bit more about some of our key learnings and takeaways for AdRoll.&lt;/p&gt;
  11275.  
  11276. &lt;h2 id=&quot;qa-with-our-adroll-mini-panel&quot;&gt;Q&amp;amp;A With Our AdRoll Mini-Panel&lt;/h2&gt;
  11277.  
  11278. &lt;h4 id=&quot;can-you-tell-us-more-about-your-experience-as-a-female-in-tech&quot;&gt;Can you tell us more about your experience as a female in tech?&lt;/h4&gt;
  11279.  
  11280. &lt;p&gt;&lt;strong&gt;Truc&lt;/strong&gt;: I discovered CS in a roundabout way through the social sciences first. Like others who discovered CS later in life, I think I had a narrow understanding of what it was. After taking the intro class, I loved how it introduced me to new ways of deconstructing and solving problems. I hear a lot of horror stories, but in my experience, I never felt like I was explicitly discriminated against as a female. Like Jessica, one way it’s affected me is that I delayed or dismissed considering it as a field of study until later in undergrad. I never thought it was for me because I bought into the perception of CS as a hardcore field you can only do if you’ve been coding since you were twelve and taking apart computers. Luckily, I had a female section leader and a handful of role models (both male and female) who encouraged me to keep pursuing it. As I progressed further, I did notice the gender ratio swaying in one direction.&lt;/p&gt;
  11281.  
  11282. &lt;p&gt;&lt;strong&gt;Jessica&lt;/strong&gt;: Refreshingly, it’s been pretty positive! I started off in a supportive and woman-centered environment at &lt;a href=&quot;https://hackbrightacademy.com/&quot;&gt;Hackbright Academy&lt;/a&gt;, where I was surrounded by incredible female role models, both the established engineers who taught the classes and the other women in my cohort. While my first job searches post-Hackbright were stressful, it seemed to be about par for the course from what I heard about other bootcamp grads’ experiences. At AdRoll, I’ve never felt marginalized or discriminated against because of my gender. I’ve also attended dozens of meetups in San Francisco and never felt unwelcome. However, the problem of underrepresentation of minorities in tech has had a huge impact on my professional life. If I had seen more female role models in tech growing up and in my college years, I probably would have become an engineer much sooner, rather than starting my professional life as a teacher.&lt;/p&gt;
  11283.  
  11284. &lt;h3 id=&quot;what-can-i-do-as-an-individual-to-support-underrepresented-people-in-tech&quot;&gt;What can I do as an individual to support underrepresented people in tech?&lt;/h3&gt;
  11285. &lt;p&gt;&lt;strong&gt;Truc&lt;/strong&gt;: The best thing you can do is treat them as you would any other friend or colleague who is worthy of your respect. Offer your encouragement, support, and praise because you admire them for their good work and their traits, not because they “need it more.”  If I never bring up my underrepresented status, don’t mention it unless it’s relevant to the conversation. Constantly reminding people they’re marginalized in tech could make them more self-conscious. If the issue comes up, be receptive and listen with an open mind and heart. And if you’re not marginalized, you can lend your voice to the conversation and raise awareness the next time the issue comes up.&lt;/p&gt;
  11286.  
  11287. &lt;p&gt;&lt;strong&gt;Jessica&lt;/strong&gt;: Treat them no differently than your other colleagues because in a professional setting, they aren’t different. They’re engineers (or designers, etc.) and so are you, and so is everyone else on your team. If you see anyone treating them differently just because they’re underrepresented, speak up about it. If they want to speak about their experiences as underrepresented people, listen. If they don’t, then don’t push them for it. Don’t expect them to be advocates for every other marginalized person in tech, but don’t be surprised if they want to be that either.&lt;/p&gt;
  11288.  
  11289. &lt;h3 id=&quot;what-can-my-company-do-to-support-underrepresented-people-in-tech&quot;&gt;What can my company do to support underrepresented people in tech?&lt;/h3&gt;
  11290.  
  11291. &lt;p&gt;&lt;strong&gt;Truc&lt;/strong&gt;: It’s two-fold.&lt;/p&gt;
  11292.  
  11293. &lt;p&gt;On the recruiting side, when you get a high volume of applications to comb over, you start to apply filters such as prestigious past experience, schooling, and referrals just to cut down on what you can review. I challenge companies to take the chance and effort to source candidates from different backgrounds and communities including non-traditional candidates. As the CS field can be perceived as male, nerdy, and insular from the outside, the way a company presents itself externally can affect the types of candidates it attracts. Qualified people may self-select or rule themselves out based on the company image or job descriptions that project unconscious bias. As an engineer, it’s really interesting that AdRoll recruiting is so willing to try out a data-driven approach to hiring and to use tools like &lt;a href=&quot;https://textio.com&quot;&gt;textio.com&lt;/a&gt;, which uses natural language processing to make sure your recruiting communications aren’t a turnoff for candidates.&lt;/p&gt;
  11294.  
  11295. &lt;p&gt;Once you’ve started to build a diverse workforce, it’s about creating an inclusive culture where people feel comfortable voicing their honest opinions without judgment. There’s no real playbook for that process; it has to happen somewhat naturally. As a leader, set a good example by showing that D&amp;amp;I is an important issue on your radar. Encourage town halls or breakout sessions where people can learn more and talk openly. This is something we have found to be effective at AdRoll and has been a catalyst for getting initiatives off the ground and into practice.&lt;/p&gt;
  11296.  
  11297. &lt;p&gt;&lt;strong&gt;Jessica&lt;/strong&gt;: It all starts with recruiting. Seek out recruiting events sponsored by organizations for minority engineers (or designers, or whatever your job req is for). Be open to non-traditional backgrounds. If you strictly limit your engineering workforce to people with four-year CS degrees from prestigious universities, your company is not fully supporting the inclusion of underrepresented minorities in tech. There’s no need to lower your skill standards; just don’t assume that people who haven’t gone to the “right” universities can’t be good engineers. Beyond hiring, get your leadership involved in D&amp;amp;I initiatives—and not just leaders from gender and ethnic minorities. Don’t assume that your female leaders, or your leaders from ethnic minorities, will want to be involved in D&amp;amp;I, though encourage them if they do.&lt;/p&gt;
  11298.  
  11299. &lt;h4 id=&quot;how-do-you-address-people-who-think-that-diversity-in-tech-isnt-an-issue&quot;&gt;How do you address people who think that diversity in tech isn’t an issue?&lt;/h4&gt;
  11300.  
  11301. &lt;p&gt;&lt;strong&gt;Truc&lt;/strong&gt;: I’ve heard some dissenting opinions: focusing on “women in tech” marginalizes women in tech; women aren’t interested in STEM fields, so we shouldn’t push them; and that the gender gap no longer exists. I challenge them to try to view the issue through a different lens. Though we’ve made great strides towards equity in the workplace over the past twenty, fifty, etc. years, there’s still a long way to go. As a member of an underrepresented group, it can sometimes feel like you’re a foreigner in another country going into a field, company, or organization where you don’t see much diversity. This lack of diversity in tech goes deeper than race, gender, or visible traits and is still a statistical truth. I try to appeal with &lt;a href=&quot;http://www.scientificamerican.com/article/how-diversity-makes-us-smarter/&quot;&gt;data&lt;/a&gt;, &lt;a href=&quot;http://web.mit.edu/cortiz/www/Diversity/Jayne%20and%20Dipboye%202004.pdf&quot;&gt;figures&lt;/a&gt;, and &lt;a href=&quot;http://www.mckinsey.com/business-functions/organization/our-insights/why-diversity-matters&quot;&gt;research&lt;/a&gt; that show creating a more inclusive workplace benefits everyone.&lt;/p&gt;
  11302.  
  11303. &lt;p&gt;&lt;strong&gt;Jessica&lt;/strong&gt;: This is a really tough one because when you bring up diversity in tech, and then are told that it’s not an issue, that puts you in the position of having to defend and prove your stance. Having to explain your own marginalization and exclusion is especially frustrating for members of marginalized groups. To be honest, I’m not sure of the best way to change these minds.&lt;/p&gt;
  11304.  
  11305. &lt;h4 id=&quot;what-are-the-best-ways-to-raise-awareness-and-combat-apathy&quot;&gt;What are the best ways to raise awareness and combat apathy?&lt;/h4&gt;
  11306.  
  11307. &lt;p&gt;&lt;strong&gt;Truc&lt;/strong&gt;: Heh, having attended many D&amp;amp;I events it can feel like we’re preaching to the choir. I honestly believe most people have good intentions and recognize that D&amp;amp;I is an issue but aren’t sure what actions they can take. If you’re not a manager or you don’t identify as underrepresented, it’s not clear how it relates to you as an individual. Being an ally to an underrepresented minority doesn’t necessarily have to take a lot of work. You don’t have to vocally advocate on behalf of a D&amp;amp;I movement (that’s asking a lot of anyone), but it does take conscious awareness. It means small things, like checking your bias, being more empathetic by trying to imagine what another’s experience is like, or just listening.&lt;/p&gt;
  11308.  
  11309. &lt;p&gt;&lt;strong&gt;Jessica&lt;/strong&gt;: The only answer I have to this is persistence. Apathy is an enormous part of the problem, perhaps even bigger than active opposition to D&amp;amp;I. The majority of people in tech will acknowledge that diversity in tech is a problem, but not that it’s their problem. But if it isn’t their problem, then whose is it? Underrepresented minorities already struggle enough to break into tech; it’s not automatically their responsibility to increase awareness of diversity too. So have your leaders talk about D&amp;amp;I. Have fun D&amp;amp;I events that will appeal to everyone. Make it a consistent and persistent part of your company’s culture.&lt;/p&gt;
  11310.  
  11311. &lt;h2 id=&quot;what-is-adroll-doing-to-address-the-gender-gap&quot;&gt;What is AdRoll doing to address the gender gap?&lt;/h2&gt;
  11312.  
  11313. &lt;p&gt;We don’t have all the answers at AdRoll, but people are making the right efforts and having open conversations. Some of the key takeaways we’ve gotten from our &lt;em&gt;CODE&lt;/em&gt; screening are to:&lt;/p&gt;
  11314.  
  11315. &lt;ul&gt;
  11316.  &lt;li&gt;Be aware of unconscious bias as individuals and as a company. We’ve rolled out unconscious bias training globally to generate more awareness about our own biases.&lt;/li&gt;
  11317.  &lt;li&gt;Build a network of allies (colleagues, managers, etc.) to help advocate or raise issues for underrepresented groups in your company.&lt;/li&gt;
  11318.  &lt;li&gt;Be action-oriented: speak up when you see something you disagree with, and work with teams, managers, and recruiting to make a difference through events, training, and facilitated discussions.&lt;/li&gt;
  11319.  &lt;li&gt;Be engaged, not only internally, but also externally. Partner with other companies and organizations and learn from each other, as well as open up Diversity and Inclusion work to an external audience.&lt;/li&gt;
  11320. &lt;/ul&gt;
  11321.  
  11322. &lt;p&gt;We thoroughly enjoyed opening up this conversation to a broader audience at AdRoll and look forward to future events that continue the discussion and drive momentum forward. If you want to contribute to this discussion, then be sure to come Roll With Us as we hire and build strong teams. You can &lt;a href=&quot;https://www.adroll.com/about/careers/open-positions&quot;&gt;learn more at our career site&lt;/a&gt;.&lt;/p&gt;
  11323. </description>
  11324.    </item>
  11325.    
  11326.    
  11327.    
  11328.    <item>
  11329.      <title>Announcing TrailDB - An Efficient Library for Storing and Processing Event Data</title>
  11330.      <link>https://tech.nextroll.com/blog/data/2016/05/24/traildb-open-sourced.html</link>
  11331.      <pubDate>Tue, 24 May 2016 00:00:00 -0700</pubDate>
  11332.      <author></author>
  11333.      <guid isPermaLink="false">https://tech.nextroll.com/blog/data/2016/05/24/traildb-open-sourced</guid>
  11334.      <description>&lt;p&gt;&lt;a href=&quot;http://traildb.io&quot;&gt;&lt;img src=&quot;/images/post_images/traildb_logo.png&quot; style=&quot;display: block; width: 40%; margin-left: auto; margin-right: auto&quot; /&gt;&lt;/a&gt;&lt;/p&gt;
  11335.  
  11336. &lt;p&gt;&lt;em&gt;tl;dr&lt;/em&gt; Today, we are open-sourcing TrailDB, a core library powering AdRoll. TrailDB makes it fast and fun to handle event data. Find it at &lt;a href=&quot;http://traildb.io&quot;&gt;traildb.io&lt;/a&gt;.&lt;/p&gt;
  11337.  
  11338. &lt;h2 id=&quot;problem-event-data&quot;&gt;Problem: Event Data&lt;/h2&gt;
  11339.  
  11340. &lt;p&gt;Imagine that you have a large amount of event data that looks like this:&lt;/p&gt;
  11341.  
  11342. &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;2016-05-02T22:48:38 user023 view features
  11343. 2016-05-02T22:49:01 user301 click graph
  11344. 2016-05-02T23:03:02 user023 view pricing
  11345. 2016-05-02T23:15:45 user187 submit signup
  11346. 2016-05-02T23:35:23 user521 view about
  11347. 2016-05-02T23:58:11 user004 click graph
  11348. 2016-05-03T00:02:09 user023 submit signup
  11349. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  11350.  
  11351. &lt;p&gt;Data like these could be generated by any web application: each event
  11352. has a timestamp, a user identifier, a token describing the action, and
  11353. its target.&lt;/p&gt;
  11354.  
  11355. &lt;p&gt;One could store the events in a relational database, in a table &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;events&lt;/code&gt;
  11356. with the above four columns. The table makes it easy to query aggregate
  11357. statistics, like signups by day:&lt;/p&gt;
  11358.  
  11359. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot; data-lang=&quot;sql&quot;&gt;&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;date_trunc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;day&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;timestamp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  11360.    &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;events&lt;/span&gt;
  11361.    &lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;target&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;signup&apos;&lt;/span&gt;
  11362.    &lt;span class=&quot;k&quot;&gt;GROUP&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;BY&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  11363.  
  11364. &lt;p&gt;Aggregate queries like this are the bread and butter of data
  11365. analytics. There are many excellent, scalable databases such
  11366. as &lt;a href=&quot;http://aws.amazon.com/redshift&quot;&gt;Amazon Redshift&lt;/a&gt; or
  11367. &lt;a href=&quot;http://prestodb.io&quot;&gt;Presto&lt;/a&gt; that can handle such queries with
  11368. ease.&lt;/p&gt;
  11369.  
  11370. &lt;h4 id=&quot;user-level-analytics&quot;&gt;User-level Analytics&lt;/h4&gt;
  11371.  
  11372. &lt;p&gt;There is another class of queries that are considerably harder to express
  11373. in SQL. Notice how in the example data above, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;user023&lt;/code&gt; first comes to the
  11374. site, views a page about product features, then views a page about pricing,
  11375. and finally signs up a little over an hour after the first event.&lt;/p&gt;
  11376.  
  11377. &lt;p&gt;Let’s say you wanted to count the number of users
  11378. who sign up less than an hour after their first
  11379. event. One way of doing this in SQL is to use
  11380. &lt;a href=&quot;http://www.postgresql.org/docs/current/static/functions-window.html&quot;&gt;window
  11381. functions&lt;/a&gt;. Window functions are a powerful feature of
  11382. SQL but unfortunately &lt;a href=&quot;http://tapoueh.org/blog/2013/08/20-Window-Functions&quot;&gt;they can be quite hard to
  11383. grasp&lt;/a&gt;.&lt;/p&gt;
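&lt;p&gt;For comparison, the same question can be sketched in a few lines of plain Python over the sample events above. This is only an illustrative sketch, not TrailDB’s API: the names &lt;code&gt;parse_events&lt;/code&gt; and &lt;code&gt;fast_signups&lt;/code&gt; are hypothetical, and it assumes that all events fit in memory and that a signup is an event whose action is &lt;code&gt;submit&lt;/code&gt; and whose target is &lt;code&gt;signup&lt;/code&gt;.&lt;/p&gt;

```python
from collections import defaultdict
from datetime import datetime, timedelta

# The sample events from the beginning of the post.
EVENTS = """\
2016-05-02T22:48:38 user023 view features
2016-05-02T22:49:01 user301 click graph
2016-05-02T23:03:02 user023 view pricing
2016-05-02T23:15:45 user187 submit signup
2016-05-02T23:35:23 user521 view about
2016-05-02T23:58:11 user004 click graph
2016-05-03T00:02:09 user023 submit signup
"""


def parse_events(text):
    """Yield (timestamp, user, action, target) tuples, one per non-empty line."""
    for line in text.splitlines():
        if line.strip():
            ts, user, action, target = line.split()
            yield datetime.fromisoformat(ts), user, action, target


def fast_signups(events, limit=timedelta(hours=1)):
    """Return users whose first signup came less than `limit` after their first event."""
    # Group events into one trail per user.
    trails = defaultdict(list)
    for ts, user, action, target in events:
        trails[user].append((ts, action, target))

    result = []
    for user, trail in trails.items():
        trail.sort()  # order each user's events by time
        first_seen = trail[0][0]
        signup_times = [
            ts for ts, action, target in trail
            if (action, target) == ("submit", "signup")
        ]
        # Keep the user only if a signup exists and it happened within the limit.
        if signup_times and limit > signup_times[0] - first_seen:
            result.append(user)
    return sorted(result)


print(fast_signups(parse_events(EVENTS)))  # prints ['user187']
```

&lt;p&gt;Grouping the events by user first is what makes the per-user condition easy to state. With the sample data, only &lt;code&gt;user187&lt;/code&gt; qualifies, because the signup is that user’s first event; &lt;code&gt;user023&lt;/code&gt; takes more than an hour.&lt;/p&gt;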
  11384.  
  11385. &lt;p&gt;Over the past years at AdRoll, we have seen a growing number of use
  11386. cases for user-level analytics. The use cases vary from computing
  11387. metrics like bounce rate to fraud detection and feature extraction for
  11388. machine learning. Theoretically, all these use cases could be solved
  11389. with sufficiently advanced SQL queries, but doing so is hardly practical.
  11390. Alternatively, the queries could be expressed as Hadoop or Spark jobs.&lt;/p&gt;
  11391.  
  11392. &lt;p&gt;None of these solutions felt sufficiently flexible for all use cases.
  11393. Relational databases are not well-suited for computationally demanding
  11394. jobs: none of the existing relational databases can scale automatically
  11395. and smoothly based on the query load. Frameworks like Hadoop or Spark
  11396. impose their own constraints on how to express algorithms, and what
  11397. languages, libraries, and infrastructure can be used.&lt;/p&gt;
  11398.  
  11399. &lt;h2 id=&quot;solution-traildb&quot;&gt;Solution: TrailDB&lt;/h2&gt;
  11400.  
  11401. &lt;p&gt;TrailDB is a library implemented in C, which is optimized for storing
  11402. and querying series of events at the user level. Its &lt;a href=&quot;http://traildb.io/docs/technical_overview/#data-model&quot;&gt;data model&lt;/a&gt; is specifically designed for use cases like those described above:&lt;/p&gt;
  11403.  
  11404. &lt;p&gt;&lt;img src=&quot;/images/post_images/traildb_datamodel.png&quot; style=&quot;display: block; width: 80%; margin-left: auto; margin-right: auto&quot; /&gt;&lt;/p&gt;
  11405.  
  11406. &lt;p&gt;TrailDB’s secret sauce is &lt;a href=&quot;http://traildb.io/docs/technical_overview/#internals&quot;&gt;data
  11407. compression&lt;/a&gt;. It
  11408. leverages the predictability of time-based data to compress your data to a
  11409. fraction of their original size. In contrast to traditional compression,
  11410. you can query the encoded data directly, decompressing only the parts
  11411. you need.&lt;/p&gt;
  11412.  
  11413. &lt;p&gt;TrailDB provides &lt;a href=&quot;http://traildb.io/docs/api&quot;&gt;a straightforward API&lt;/a&gt;
  11414. for querying events by user efficiently. The API is
  11415. thoroughly performance-driven: Not only does it allow you to
  11416. query events quickly, but it also &lt;a href=&quot;http://traildb.io/docs/technical_overview/#performance-best-practices&quot;&gt;encourages programming
  11417. patterns&lt;/a&gt;
  11418. that make the surrounding application fast.&lt;/p&gt;
  11419.  
  11420. &lt;h4 id=&quot;designed-for-the-cloud&quot;&gt;Designed for the Cloud&lt;/h4&gt;
  11421.  
  11422. &lt;p&gt;TrailDBs are immutable files which your application can create and
  11423. access using the TrailDB library. This is a deliberate design choice:
  11424. compressed, immutable files like these are a perfect match for modern,
  11425. distributed, elastic cloud environments. You can store TrailDBs as you
  11426. see fit; we find that Amazon S3 provides virtually unlimited throughput
  11427. for your data, and you can process them with an auto-scaling cluster of
  11428. (spot) instances which provides virtually unlimited computing power.&lt;/p&gt;
  11429.  
  11430. &lt;p&gt;TrailDB by itself is a simple library, decoupled from infrastructure,
  11431. which is designed to do one thing well: compress event data and
  11432. provide high-performance access to them. It integrates seamlessly into
  11433. elastic data pipelines as described in a previous blog article,
  11434. &lt;a href=&quot;http://tech.adroll.com/blog/data/2015/09/22/data-pipelines-docker.html&quot;&gt;Petabyte-Scale
  11435. Data Pipelines with Docker, Luigi and Elastic Spot
  11436. Instances&lt;/a&gt;.&lt;/p&gt;
  11437.  
  11438. &lt;p&gt;All together, our elastic stack for processing event data in AWS looks
  11439. like this:&lt;/p&gt;
  11440.  
  11441. &lt;div style=&quot;margin-top: 5px; font-size: 80%; border-radius: 5px; margin-left: 5em; margin-right: 5em; text-align: center; color: #333; border: 1px solid #ccc; padding: 5px; background: #eee&quot;&gt;
  11442.    Job scheduler, &lt;a href=&quot;http://tech.adroll.com/blog/data/2015/10/15/luigi.html&quot;&gt;Quentin&lt;/a&gt;, executes jobs in an auto-scaling cluster
  11443. &lt;/div&gt;
  11444. &lt;div style=&quot;margin-top: 5px; font-size: 80%; border-radius: 5px; margin-left: 5em; margin-right: 5em; text-align: center; color: #333; border: 1px solid #ccc; padding: 5px; background: #eee&quot;&gt;
  11445.    &lt;a href=&quot;http://github.com/spotify/luigi&quot;&gt;Luigi&lt;/a&gt; manages dependencies between jobs
  11446. &lt;/div&gt;
  11447. &lt;div style=&quot;margin-top: 5px; font-size: 80%; border-radius: 5px; margin-left: 5em; margin-right: 5em; text-align: center; color: #333; border: 1px solid #ccc; padding: 5px; background: #eee&quot;&gt;
  11448.    Language-agnostic jobs in &lt;a href=&quot;http://docker.io&quot;&gt;Docker containers&lt;/a&gt; query TrailDBs
  11449. &lt;/div&gt;
  11450. &lt;div style=&quot;margin-top: 5px; margin-bottom: 2em; font-size: 80%; border-radius: 5px; margin-left: 5em; margin-right: 5em; text-align: center; color: #333; border: 1px solid #ccc; padding: 5px; background: #eee&quot;&gt;
  11451.    Event data are stored in &lt;a href=&quot;http://traildb.io&quot;&gt;TrailDBs&lt;/a&gt; in S3
  11452. &lt;/div&gt;
  11453.  
  11454. &lt;p&gt;This stack allows us to perform a massive amount of computation
  11455. cost-efficiently without any performance bottlenecks. S3 has proven
  11456. to be able to provide amazing aggregate throughput to hundreds of
  11457. concurrent instances, so TrailDBs can be accessed quickly.&lt;/p&gt;
  11458.  
  11459. &lt;p&gt;Thanks to TrailDB’s various language bindings, we can use the
  11460. best tool for each job. Some computationally intensive
  11461. jobs are written in &lt;a href=&quot;http://github.com/traildb/traildb&quot;&gt;C&lt;/a&gt;
  11462. or &lt;a href=&quot;http://github.com/traildb/traildb-d&quot;&gt;D&lt;/a&gt;. Some jobs
  11463. rely on external libraries which are convenient to access
  11464. in &lt;a href=&quot;http://github.com/traildb/traildb-python&quot;&gt;Python&lt;/a&gt; or
  11465. &lt;a href=&quot;http://github.com/traildb/traildb-r&quot;&gt;R&lt;/a&gt;. Naturally, not all
  11466. use cases are batch jobs; we have services written
  11467. in &lt;a href=&quot;http://github.com/traildb/traildb-go&quot;&gt;Go&lt;/a&gt; and even
  11468. &lt;a href=&quot;http://github.com/traildb/traildb-haskell&quot;&gt;Haskell&lt;/a&gt; which access
  11469. TrailDBs.&lt;/p&gt;
  11470.  
  11471. &lt;p&gt;As TrailDB is a library, not a framework, it allows you to structure
  11472. your program in the most idiomatic way, which is great for productivity.
  11473. Docker removes deployment headaches by encapsulating everything in
  11474. well-behaved containers.&lt;/p&gt;
  11475.  
  11476. &lt;p&gt;Thanks to Quentin and auto-scaling groups, we can optimize instance
  11477. types for each workload, some of which are IO-bound while others require
  11478. more CPU. Finally, Luigi makes a job graph of tens of inter-dependent
  11479. jobs manageable.&lt;/p&gt;
  11480.  
  11481. &lt;h2 id=&quot;traildb-at-adroll&quot;&gt;TrailDB at AdRoll&lt;/h2&gt;
  11482.  
  11483. &lt;p&gt;&lt;img src=&quot;/images/post_images/traildb_breadcrumbs.png&quot; style=&quot;display: block; width: 80%; margin-left: auto; margin-right: auto&quot; /&gt;&lt;/p&gt;
  11484.  
  11485. &lt;p&gt;TrailDB has been used in production at AdRoll for about one and a half
  11486. years. During this time, we have stored tens of trillions of events in
  11487. TrailDBs. Today, the TrailDBs are queried by almost a thousand jobs
  11488. daily.&lt;/p&gt;
  11489.  
  11490. &lt;p&gt;As TrailDB is a core component of our data infrastructure, we take
  11491. robustness and testing seriously. The unit test coverage of TrailDB is
  11492. nearly 100%. TrailDB is also hardened by a number of integration tests.
  11493. As a result of long-term production use, we take backwards compatibility
  11494. very seriously: not once in the history of TrailDB have we
  11495. broken backwards compatibility in a way that made old TrailDBs
  11496. unreadable. We intend to keep it this way.&lt;/p&gt;
  11497.  
  11498. &lt;p&gt;We have a number of tools built on top of TrailDB which make computing
  11499. various user-level metrics easier. We are planning to open-source some
  11500. of these tools in the future. AdRoll continues active development of
  11501. TrailDB and we hope that the wider community finds it useful as well.&lt;/p&gt;
  11502.  
  11503. &lt;h2 id=&quot;learn-more&quot;&gt;Learn More&lt;/h2&gt;
  11504.  
  11505. &lt;p&gt;You can find TrailDB with extensive documentation at &lt;a href=&quot;http://traildb.io&quot;&gt;traildb.io&lt;/a&gt;.&lt;/p&gt;
  11506.  
  11507. &lt;p&gt;The quickest way to get started is to read the &lt;a href=&quot;http://traildb.io/docs/getting_started/&quot;&gt;installation instructions&lt;/a&gt; and walk through &lt;a href=&quot;http://traildb.io/docs/tutorial&quot;&gt;the TrailDB tutorial&lt;/a&gt;.&lt;/p&gt;
  11508.  
  11509. &lt;p&gt;If you want to learn more about the background and internals of TrailDB, you can &lt;a href=&quot;https://www.youtube.com/watch?v=ondmDAMWEtg&quot;&gt;watch
  11510. a presentation about TrailDB on YouTube&lt;/a&gt; and
  11511. browse &lt;a href=&quot;http://slides.com/villetuulos/intro-to-traildb#/&quot;&gt;the slides of the presentation here&lt;/a&gt;.&lt;/p&gt;
  11512.  
  11513. &lt;p&gt;If you have any questions, you can &lt;a href=&quot;https://gitter.im/traildb/traildb&quot;&gt;find us in the TrailDB Gitter channel&lt;/a&gt;.&lt;/p&gt;
  11514. </description>
  11515.    </item>
  11516.    
  11517.    
  11518.    
  11519.    <item>
  11520.      <title>Collider: A New Frontier in Testing</title>
  11521.      <link>https://tech.nextroll.com/blog/rtb/2016/04/29/collider.html</link>
  11522.      <pubDate>Fri, 29 Apr 2016 00:00:00 -0700</pubDate>
  11523.      <author></author>
  11524.      <guid isPermaLink="false">https://tech.nextroll.com/blog/rtb/2016/04/29/collider</guid>
  11525.      <description>&lt;h4 id=&quot;introduction&quot;&gt;Introduction&lt;/h4&gt;
  11526.  
  11527. &lt;p&gt;Advertising is a competitive market. This probably isn’t the first time you’ve heard that, but it’s absolutely true. For every move you make, you have to assume that your competitors are getting a step closer to you. Speed and precision are key to the success of the business, but how can we guarantee them under the load of such a large system? Is that pricing strategy better than what we are running right now? Would we improve the performance of our bids, or just burn money? When you handle billions of requests, changing a single digit can become an odyssey that ends extremely badly, costing a significant chunk of revenue.&lt;/p&gt;
  11528.  
  11529. &lt;p&gt;Considering the need and the risk, there is a single solution: iterating fast, but in a controlled manner. Risking all of our revenue on an untested idea would, in the best-case scenario, get our business intelligence team to raise an eyebrow. So why don’t we risk just a fraction of our traffic? Let’s construct an experiment, specify our study group, and fire away. No complicated deployments, no significant risk of the whole system catching on fire, and immediate feedback on how that experiment is performing.&lt;/p&gt;
  11530.  
  11531. &lt;p&gt;Regardless of how cumbersome and slow a regular deployment process is, that’s something to worry about at a later time. Now is the time to do some good, old-fashioned trial and error, but in a healthy, controlled manner.&lt;/p&gt;
  11532.  
  11533. &lt;h4 id=&quot;collider&quot;&gt;Collider&lt;/h4&gt;
  11534.  
  11535. &lt;p&gt;How can we iterate in that swift, controlled manner while taking less than 10 minutes to deploy a new set of tests? Easy: everything comes back to Erlang.&lt;/p&gt;
  11536.  
  11537. &lt;p&gt;If you are unfamiliar with &lt;a href=&quot;http://learnyousomeerlang.com&quot;&gt;Erlang&lt;/a&gt;, you are missing out on a magnificent language with myriad smart features. This functional language was developed with a simple idea in mind: being able to process thousands of requests concurrently. It was a perfect fit for our real-time bidders. These bidders handle hundreds of thousands of requests per second and have to respond within rigid deadlines, so concurrent and lightweight processes are a must. We’ll focus on a couple of smart features: isolation and “hot swap.”&lt;/p&gt;
  11538.  
  11539. &lt;p&gt;Our experiment framework is called &lt;strong&gt;Collider&lt;/strong&gt;. As we mentioned, we want to be able to run these experiments in as much isolation as possible. This framework splits the total traffic received by a single box into a set of slots. Do we think that an experiment needs a big sample to prove its efficacy? We assign a big percentage of our traffic to that experiment, and rest assured that the untouched slots are completely safe and unaffected by any crazy ideas. Erlang guarantees isolation between processes, so even if we deploy a wild piece of code, the system will ensure that only that fraction of traffic is affected. Also, because we enforce strict deadlines internally (we have timeout requirements for third parties), even if an experiment tries to connect to external components, its requests will expire and be cleaned up before they can affect the rest of the system.&lt;/p&gt;
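&lt;p&gt;The slot idea can be illustrated with a minimal sketch. Collider itself is written in Erlang and its internals are not shown in this post, so everything below is an assumption made for illustration: the slot count, the experiment names, and the choice to hash a request ID into a slot are all hypothetical, and the sketch is in Python for brevity.&lt;/p&gt;

```python
import hashlib

NUM_SLOTS = 100  # assumed: traffic on one box is divided into 100 slots

# Hypothetical allocation: each experiment owns a range of slots.
# Slots that no experiment owns keep the default production behavior.
EXPERIMENTS = {
    "new_pricing": range(0, 10),  # 10% of traffic
    "new_model": range(10, 15),   # 5% of traffic
}


def slot_for(request_id):
    """Map a request ID to a stable slot number in [0, NUM_SLOTS)."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SLOTS


def experiment_for_slot(slot):
    """Return the experiment that owns `slot`, or None for untouched slots."""
    for name, slots in EXPERIMENTS.items():
        if slot in slots:
            return name
    return None


def experiment_for(request_id):
    """Pick the experiment (if any) that should handle this request."""
    return experiment_for_slot(slot_for(request_id))
```

&lt;p&gt;In this sketch, giving an experiment a bigger sample only means widening its slot range; requests that land in unowned slots never reach experimental code.&lt;/p&gt;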
  11540.  
  11541. &lt;center&gt;
  11542. &lt;img alt=&quot;Collider Traffic Flow&quot; src=&quot;/images/post_images/collider.png&quot; /&gt;
  11543. &lt;/center&gt;
  11544.  
  11545. &lt;p&gt;And what about that “hot swap” thing? Erlang allows us to hot reload code without any complicated ritual. We simply replace the previous version of our code on the filesystem and, poof!, the code will seamlessly be replaced for the whole runtime. It’s as easy as that. Instead of thinking about bringing down boxes and waiting for a new version to propagate, we simply pack our Collider experiment bundle and fire it off to S3. In no time at all, the boxes will be running the new set of tests and expiring those that are no longer needed. Zero impact, zero downtime, immediate success.&lt;/p&gt;
  11546.  
  11547. &lt;h4 id=&quot;integrations&quot;&gt;Integrations&lt;/h4&gt;
  11548.  
  11549. &lt;p&gt;“Okay, that sounds interesting, but what’s an ‘experiment?’ What can this Collider thing test?” This is one of the best aspects of Collider, as the infrastructure is prepared for handling all types of experiments.&lt;/p&gt;
  11550.  
  11551. &lt;p&gt;You may have heard that real-time bidding is not only about being fast; it’s also about being smart about where to spend money. “How probable is it that this person will click on this ad? Do we need to go all out or will our money be wasted on this potential impression?” Answering these kinds of questions without the appropriate data is a gamble, and that’s why the bidders rely on our BidIQ machine learning system.&lt;/p&gt;
  11552.  
  11553. &lt;p&gt;BidIQ trains on the data we have seen so far, extracting information from select features of our traffic. It then uses these features to return a performance prediction and bid price. But when we want to add new training features to that system, it’s not always easy to see which new criteria may be good indicators of the quality of a bid request. For these scenarios, Collider is a perfect fit, allowing us to do an A/B test in a controlled manner with live traffic. Once the data science peeps are happy with the results, that new model can be moved into production, becoming the default model that predicts prices for all of our traffic.&lt;/p&gt;
  11554.  
  11555. &lt;p&gt;But our integrations don’t end here. For example, when several ads are available to display to the same user, what criteria should we use to decide which ad is best? Or, is this real traffic or fraud? All these variables take place in different parts of our real-time bid flow, but the Collider framework is sufficiently generalized to inject itself into all of these places. We can change the behavior of key parts of our system in a transparent way, and for a small percentage of traffic in a completely isolated way. In order to keep our fingers on the system’s pulse, all results can be monitored on different dashboards with our analytics tools.&lt;/p&gt;
  11556.  
  11557. &lt;p&gt;Even better, when defining the slots for splitting our traffic, they can be reused, allowing us to combine different experimental behaviors across the system. Essentially, we can control whether we’re testing independent or correlated variables. For example, say we have a new, more aggressive BidIQ version and we want to combine it with a new internal auction strategy to select from all available ads. Reusing the same slots on both experiments would achieve the desired effect. In the image below, we can see how distributing slots appropriately will not affect our tests. The comparison of red, green, and white in the top layer is fine because each of those is subjected to proportionally the same amount of blue traffic. Similarly, on the second layer, the blue traffic is subjected to the same amount of red, green, and white traffic from the top layer as the white traffic in its layer.&lt;/p&gt;
  11558.  
  11559. &lt;center&gt;
  11560. &lt;img alt=&quot;Collider Correlation Control&quot; src=&quot;/images/post_images/collider-mix.png&quot; /&gt;
  11561. &lt;/center&gt;
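
&lt;p&gt;To make slot reuse concrete, here is a minimal Python sketch. It is not Collider's actual implementation: the hashing scheme, the 100-slot split, the layer salts, and the names slot_for, in_experiment, correlated and independent are all assumptions made for illustration. The idea is that two experiments sharing a salt and a slot range see exactly the same users, while experiments with different salts split the traffic independently of each other.&lt;/p&gt;

```python
import hashlib

NUM_SLOTS = 100  # assumption: traffic is split into 100 slots per layer


def slot_for(user_id: str, layer_salt: str, num_slots: int = NUM_SLOTS) -> int:
    """Map a user to a slot; the salt decides which experiments share a split."""
    digest = hashlib.sha256(f"{layer_salt}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % num_slots


def in_experiment(user_id: str, layer_salt: str, slots: range) -> bool:
    """True when the user's slot in this layer belongs to the experiment."""
    return slot_for(user_id, layer_salt) in slots


def correlated(user_id: str) -> tuple:
    """Both experiments reuse the same salt and slots, so they move together."""
    new_model = in_experiment(user_id, "shared", range(0, 10))
    new_auction = in_experiment(user_id, "shared", range(0, 10))
    return new_model, new_auction


def independent(user_id: str) -> tuple:
    """Distinct salts give each experiment its own, unrelated 10% of traffic."""
    new_model = in_experiment(user_id, "bidiq-layer", range(0, 10))
    new_auction = in_experiment(user_id, "auction-layer", range(0, 10))
    return new_model, new_auction
```

&lt;p&gt;With correlated, a user who receives the new model always receives the new auction strategy as well. With independent, each experiment's control and treatment groups contain proportionally the same share of the other experiment's traffic, which is the property the image above illustrates.&lt;/p&gt;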
  11562.  
  11563. &lt;h4 id=&quot;can-it-get-any-better&quot;&gt;Can it get any better?&lt;/h4&gt;
  11564.  
  11565. &lt;p&gt;Even with a swift deployment process, flexibility, and monitoring, we still have plenty of exciting features to implement on Collider and its environment.&lt;/p&gt;
  11566.  
  11567. &lt;p&gt;Right now, testing locally can take a while, especially when tests get more complicated or require specific external components available in production. Most of the time is spent setting everything up to replicate production accurately (and with high odds of something missing). One of our short-term goals is to automatically create staging environments where we can push a release of Collider and let it run with synthetic traffic. In this replica of a production environment, the new version would run until a set of health indicators is met. The release would then be ready to be shot into production (with human intervention, of course), across the board.&lt;/p&gt;
  11568.  
  11569. &lt;p&gt;This may look like a contradiction with a tight development cycle, but it allows for less tech-savvy parts of the company to create tests without requiring engineering time to debug them. For that reason, we would like to provide new, simple interfaces to set parameters on tests so anyone with the right permissions can fiddle with them.&lt;/p&gt;
  11570.  
  11571. &lt;p&gt;If working in such a fast-paced environment sounds enticing and you are in love with new and exciting technologies for high traffic volumes, don’t think twice: &lt;a href=&quot;https://www.adroll.com/about/careers&quot;&gt;drop us a line!&lt;/a&gt;&lt;/p&gt;
  11572. </description>
  11573.    </item>
  11574.    
  11575.    
  11576.    
  11577.    <item>
  11578.      <title>The New AdRoll Data Analysis Platform</title>
  11579.      <link>https://tech.nextroll.com/blog/data/2016/04/01/nada.html</link>
  11580.      <pubDate>Fri, 01 Apr 2016 00:00:00 -0700</pubDate>
  11581.      <author></author>
  11582.      <guid isPermaLink="false">https://tech.nextroll.com/blog/data/2016/04/01/nada</guid>
  11583.      <description>
  11584. &lt;h4 id=&quot;introduction&quot;&gt;Introduction&lt;/h4&gt;
  11585.  
  11586. &lt;p&gt;If there is one thing that has been made abundantly clear over the last decade in tech,
  11587. it’s that understanding your data is paramount to achieving your goals. Understandably, many
  11588. startups have sprung up promising to aid every company and their employees with data analytics
  11589. to fully optimize not only their work lives, but their personal lives too. Knowledge is power.&lt;/p&gt;
  11590.  
  11591. &lt;p&gt;Of course, the more power you have, the more knowledge you can gain. After analyzing current market offerings, we at AdRoll Data Science
  11592. found only solutions that sluggishly accelerated our growth to the &lt;a href=&quot;https://en.wikipedia.org/wiki/Technological_singularity&quot;&gt;Singularity™&lt;/a&gt;.
  11593. Some like to say they welcome our new robot overlords, but we have a policy of actively ushering them in. (Note: We
  11594. will soon be adding a lovable automaton, &lt;a href=&quot;https://en.wikipedia.org/wiki/Gort_(The_Day_the_Earth_Stood_Still)&quot;&gt;Gort&lt;/a&gt;,
  11595. to our collection of &lt;a href=&quot;https://www.adroll.com/about/careers&quot;&gt;spirit animals&lt;/a&gt;.) This is why we set to work and
  11596. developed the New AdRoll Data Analytics (NADA) platform.&lt;/p&gt;
  11597.  
  11598. &lt;center&gt;
  11599. &lt;img alt=&quot;Gort in action.&quot; src=&quot;/images/post_images/gort.jpg&quot; /&gt;&lt;br /&gt;
  11600. &lt;i&gt;Gort in action.&lt;/i&gt;
  11601. &lt;/center&gt;
  11602.  
  11603. &lt;h4 id=&quot;technology&quot;&gt;Technology&lt;/h4&gt;
  11604.  
  11605. &lt;p&gt;As some readers of our previous blog posts may know, AdRoll strongly encourages its engineers to explore new
  11606. technologies, and if you’ve been following the Data Science Engineering team in particular, you know that we
  11607. love to get to the metal to eke out as much performance as we can.&lt;/p&gt;
  11608.  
  11609. &lt;p&gt;But this isn’t enough. AdRoll also has a long-standing philosophy of bringing the best to everyone, not just
  11610. those with the time and resources of enterprise businesses. We knew that NADA had to be fast, powerful, and most of all, keep
  11611. us close to our data. But this stuff is only great if the platform is usable, so we knew we wanted a nice,
  11612. clean API. Take a look for yourself at this k-means code:&lt;/p&gt;
  11613.  
  11614. &lt;p&gt;I don’t know about you, but this is just about the clearest code I’ve ever seen. Yes,
  11615. we at AdRoll have become huge fans of &lt;a href=&quot;http://compsoc.dur.ac.uk/whitespace/tutorial.html&quot;&gt;Whitespace&lt;/a&gt;.&lt;/p&gt;
  11616.  
  11617. &lt;p&gt;This may seem like a controversial choice to some, but consider the benefits:&lt;/p&gt;
  11618.  
  11619. &lt;ul&gt;
  11620.  &lt;li&gt;You can code for longer as eye strain is minimized, and with &lt;a href=&quot;https://upload.wikimedia.org/wikipedia/commons/9/91/Sun_three_button_mouse.jpg&quot;&gt;appropriate hardware&lt;/a&gt;, the ergonomics are fantastic.&lt;/li&gt;
  11621.  &lt;li&gt;Rather than slogging through PRs, each diff is a colorful delight that warms the soul. See image below.&lt;/li&gt;
  11622.  &lt;li&gt;In a process of discovery, printing out your code gives no benefit to your legal opposition.&lt;/li&gt;
  11623.  &lt;li&gt;By inputting your data as essentially raw binary, you’re thinking much more like a machine. This makes Gort happy.&lt;/li&gt;
  11624.  &lt;li&gt;You want Gort to be happy.&lt;/li&gt;
  11625. &lt;/ul&gt;
  11626.  
  11627. &lt;p&gt;These are just the benefits of Whitespace in and of itself. Let’s move on to NADA.&lt;/p&gt;
  11628.  
  11629. &lt;center&gt;
  11630. &lt;img alt=&quot;Beautiful diff.&quot; src=&quot;/images/post_images/ws_diff.png&quot; /&gt;&lt;br /&gt;
  11631. &lt;i&gt;Beautiful diff.&lt;/i&gt;
  11632. &lt;/center&gt;
  11633.  
  11634. &lt;h4 id=&quot;nada-features&quot;&gt;NADA Features&lt;/h4&gt;
  11635.  
  11636. &lt;p&gt;The core features of NADA are a boon to any data scientist, statistician, or Swiss army personnel.
  11637. You can regress your convex optimizations, trick your kernels into matrix factorizations, and
  11638. backpropagate your support vector machinations. All of this is so abstracted away, you can be sure
  11639. you won’t miss the random forest for the decision stumps.&lt;/p&gt;
  11640.  
  11641. &lt;p&gt;For example, take a look at this snippet, which convolutes a neural network into a stochastic principal
  11642. component Lp space:&lt;/p&gt;
  11643.  
  11644. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-c&quot; data-lang=&quot;c&quot;&gt;&lt;span class=&quot;n&quot;&gt;start_with_your_neural_net_to_convolute&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;    
  11645.      
  11646.        
  11647.      
  11648.          
  11649.    
  11650.        
  11651.      
  11652. &lt;span class=&quot;n&quot;&gt;stochasticizing&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;          
  11653.    
  11654.    
  11655.      
  11656.      
  11657.          
  11658.      
  11659.      
  11660.      
  11661.      
  11662.      
  11663.      
  11664.      
  11665.      
  11666.      
  11667.      
  11668.          
  11669.    
  11670. &lt;span class=&quot;n&quot;&gt;now_for_components&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;      
  11671.          
  11672.    
  11673.    
  11674.    
  11675.    
  11676.          
  11677.      
  11678.      
  11679. &lt;span class=&quot;n&quot;&gt;output_Lp_space&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;      
  11680.      
  11681.      
  11682.        
  11683.  
  11684.  
  11685.  
  11686. &lt;span class=&quot;n&quot;&gt;repeat_for_recurrent_space&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  11687.  
  11688. &lt;p&gt;So much processing in such a small snippet really shows the expressive power of NADA.
  11689. This expressivity has allowed AdRoll to bootstrap our data science efforts faster
  11690. than we ever thought possible, because NADA is also incredibly efficient and scalable.&lt;/p&gt;
  11691.  
  11692. &lt;p&gt;NADA’s underlying data framework is a parallelized, fault-tolerant, NoSQL, relational
  11693. data structure that guarantees 100% consistency. This plays a key part in our ability
  11694. to rip through tons more data than we have in the past. It’s a unique structure, for sure,
  11695. so there is some ramp up to take full advantage of its features. However, we went ahead
  11696. and implemented more standard APIs to get your feet wet.&lt;/p&gt;
  11697.  
  11698. &lt;p&gt;For example, for our &lt;a href=&quot;https://www.adroll.com/product/prospecting&quot;&gt;Prospecting product&lt;/a&gt; we
  11699. were asked to find out whether certain subsets of our data contained graphs of a particular
  11700. &lt;a href=&quot;https://en.wikipedia.org/wiki/Clique_problem&quot;&gt;size&lt;/a&gt;. Tackling this somewhat naïvely, we have:&lt;/p&gt;
  11701.  
  11702. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-c&quot; data-lang=&quot;c&quot;&gt;&lt;span class=&quot;n&quot;&gt;good&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;      
  11703.          
  11704.      
  11705.          
  11706.      
  11707.        
  11708.      
  11709.        
  11710.  
  11711.  
  11712.  
  11713. &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;but_slow&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  11714.  
  11715. &lt;p&gt;It does all right even with the standard API calls, but that’s mainly a testament
  11716. to NADA’s computational ingenuity. However, as soon as we switch to NADA’s unique
  11717. brand of data structure:&lt;/p&gt;
  11718.  
  11719. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-c&quot; data-lang=&quot;c&quot;&gt;      
  11720.          
  11721.      
  11722.          
  11723.      
  11724.        
  11725.        
  11726.  
  11727.  
  11728.  
  11729. &lt;span class=&quot;n&quot;&gt;Wow&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  11730.  
  11731. &lt;p&gt;“&lt;a href=&quot;https://en.wikipedia.org/wiki/Wow!_signal&quot;&gt;Wow,&lt;/a&gt;” indeed! Even a cursory glance
  11732. reveals the speedup here. We’re expecting a &lt;a href=&quot;http://www.claymath.org/millennium-problems/p-vs-np-problem&quot;&gt;novelty check&lt;/a&gt;
  11733. in the mail soon.&lt;/p&gt;
  11734.  
  11735. &lt;h4 id=&quot;onwards-and-upwards&quot;&gt;Onwards and Upwards!&lt;/h4&gt;
  11736.  
  11737. &lt;p&gt;We’ve only just scratched the surface of NADA here. While NADA is primarily
  11738. a data analytics framework, we have also found it useful in a variety of other
  11739. contexts, such as automatic intrusion detection, improving our WiFi signals,
  11740. supplanting the human race, and defragging our Hadoop clusters. Also, NADA is
  11741. absolutely delightful spread on a &lt;a href=&quot;http://www.businessinsider.com/we-tried-the-fancy-4-toast-san-francisco-is-going-crazy-for-2015-6&quot;&gt;thick piece of toast&lt;/a&gt;
  11742. with a glass of orange juice.&lt;/p&gt;
  11743.  
  11744. &lt;p&gt;So where do we go from here? Naturally, we’ll be open-sourcing NADA shortly.
  11745. Gort is looking forward to seeing what you can do with our framework
  11746. as soon as you get your hands on it. Gort does not like to be disappointed,
  11747. so please, contribute as much NADA as you can.&lt;/p&gt;
  11748.  
  11749. &lt;p&gt;Happy coding!&lt;/p&gt;
  11750.  
  11751. &lt;h4 id=&quot;ps&quot;&gt;P.S.&lt;/h4&gt;
  11752.  
  11753. </description>
  11754.    </item>
  11755.    
  11756.    
  11757.    
  11758.    <item>
  11759.      <title>Bidding Credit Systems
  11760. </title>
  11761.      <link>https://tech.nextroll.com/blog/dev/2016/02/28/bidding-credits.html</link>
  11762.      <pubDate>Sun, 28 Feb 2016 00:00:00 -0800</pubDate>
  11763.      <author></author>
  11764.      <guid isPermaLink="false">https://tech.nextroll.com/blog/dev/2016/02/28/bidding-credits</guid>
  11765.      <description>&lt;p&gt;The project I work on at AdRoll is mature. In a mature project the economics of
  11766. software flip around to where doing something right is more desirable than doing
  11767. something quickly. In 2015 we invested a lot of effort in resolving long-tail
  11768. issues with the bidder system, issues we’d recognized in our failure-mode
  11769. conversations but had left unresolved pending future work. Sometimes, a system
  11770. will evolve away from such things and sometimes you’ll have to put in the time
  11771. and energy to, not repair, but mitigate. Long-tail issues are often intrinsic,
  11772. those fun little sideshows Charles Perrow called Normal Accidents, failures
  11773. inherent to a system that can’t be removed but must be lived with. It’s good
  11774. times.&lt;/p&gt;
  11775.  
  11776. &lt;p&gt;This blog post details one of my favorite resolutions in 2015, the bidding
  11777. credit system.&lt;/p&gt;
  11778.  
  11779. &lt;p&gt;A brief run-down of the bidders. A bid comes in from an “exchange”–these are
  11780. companies like Google, AppNexus or OpenX–and we’re obligated to reply. Replies
  11781. can be either a “no bid”, essentially a bid of $0, or a “bid”, a bid of greater
  11782. than $0. If we win the auction the exchange sends us a specially coded
  11783. additional request, called a “win”. When this arrives, we reduce the amount of
  11784. money the system as a whole has to spend. If a “win” doesn’t arrive, we assume
  11785. that there’s still money to spend and the system does its best to do so.&lt;/p&gt;
  11786.  
  11787. &lt;p&gt;Here’s a fun question: what happens if an actual win gets decoupled from “win”
  11788. events? By protocol contract with the exchanges they’re not supposed to be but,
  11789. you know, things go wrong. What happens?&lt;/p&gt;
  11790.  
  11791. &lt;p&gt;Without a “win”, the bidder used to simply assume that it had money to spend and
  11792. would continue to do so. This meant it would keep spending money that it might
  11793. not have spent had the “wins” been arriving appropriately. Worse, the
  11794. bidders might erroneously spend more money than is in their budget, a worst-case
  11795. scenario we call “overspend”. Overspends are potentially catastrophic as there’s
  11796. no fixed upper bound.&lt;/p&gt;
  11797.  
  11798. &lt;p&gt;Win decoupling took us by surprise in early 2015. Alarms fired alerting us that
  11799. the system’s rate of win had dropped below normal bounds but a side-channel in a
  11800. more delayed notification informed us that the system was still spending money.
  11801. Oops! Manual intervention killed all bidding and communication with the exchange
  11802. in question informed us that a bug in their latest release caused a failure of
  11803. win notifications. This was… not cheap.&lt;/p&gt;
  11804.  
  11805. &lt;p&gt;In the lifetime of the system this decoupling had been seen only once,
  11806. but it was bad enough to need a safety system to stop it from happening again.
  11807. My colleague
  11808. &lt;a href=&quot;http://tech.adroll.com/blog/dev/2015/11/16/count-things-with-aws-lambda-python-and-dynamodb.html&quot;&gt;Mike&lt;/a&gt;
  11809. and I kicked around ideas and settled on a take on a classic feedback control
  11810. system.&lt;/p&gt;
  11811.  
  11812. &lt;p&gt;The algorithm is this:&lt;/p&gt;
  11813.  
  11814. &lt;p&gt;Grant a “bidding credit pool” of some initial capacity C, maximally bounded at
  11815. Cm, some bid rate Br (per second) and some win rate Wr (per second):&lt;/p&gt;
  11816.  
  11817. &lt;ul&gt;
  11818.  &lt;li&gt;Each bid issued by the system to an exchange decrements C by B.&lt;/li&gt;
  11819.  &lt;li&gt;Each win received by the system from an exchange increases C by W.&lt;/li&gt;
  11820.  &lt;li&gt;If at bid time C &amp;lt;= L, a no-bid is automatically issued.&lt;/li&gt;
  11821.  &lt;li&gt;Every X seconds Bmp credits are added to the pool.&lt;/li&gt;
  11822. &lt;/ul&gt;
  11823.  
  11824. &lt;p&gt;The first two points decide for the system how much a losing bid costs. Combine
  11825. that with the third point, it’s possible to set values such that a losing streak
  11826. will cause the pool to exhaust, shutting off bidding. The fourth point protects
  11827. against losing streaks &lt;em&gt;and&lt;/em&gt; allows for automatic recovery in the event of pool
  11828. exhaustion.&lt;/p&gt;
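
&lt;p&gt;The four rules above can be sketched as a small Python class. This is an illustration only, not the bidder's production code: the class name CreditPool, its field names and its method names are hypothetical, and the timer that calls on_tick every X seconds is left out. The assumption that wins and periodic bumps never push the pool above Cm follows from the pool being "maximally bounded at Cm".&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class CreditPool:
    credits: float     # C, current pool level
    cap: float         # Cm, upper bound on the pool
    bid_cost: float    # B, credits removed per bid
    win_reward: float  # W, credits returned per win
    floor: float       # L, at or below this level a no-bid is issued
    bump: float        # Bmp, credits added on each periodic tick

    def may_bid(self) -> bool:
        """Rule 3: bidding is allowed only while C is above L."""
        return self.credits > self.floor

    def on_bid(self) -> bool:
        """Rule 1: charge the pool for a bid; return False for a forced no-bid."""
        if not self.may_bid():
            return False
        self.credits -= self.bid_cost
        return True

    def on_win(self) -> None:
        """Rule 2: a win notification returns credits, capped at Cm."""
        self.credits = min(self.cap, self.credits + self.win_reward)

    def on_tick(self) -> None:
        """Rule 4: called every X seconds by an external timer (not shown)."""
        self.credits = min(self.cap, self.credits + self.bump)
```

&lt;p&gt;With a pool of three credits and a bid cost of one, three bids go out and the fourth becomes a no-bid. A win or the next periodic bump makes bidding possible again, which is the automatic recovery described above.&lt;/p&gt;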
  11829.  
  11830. &lt;p&gt;It seemed to Mike and me that the algorithm would work. Say you have Cm = 100, C
  11831. = 100, Br = 100, Wr = 0.1, B = 1, W = 1, L = 0, X = 5 and Bmp = 100. Then you’ll
  11832. find that per second Br*Wr = 10 and so every second you’ll remove 90 credits
  11833. from C, or, C(t1) = C(t0) - Br(1-Wr).&lt;/p&gt;
  11834.  
  11835. &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;C(0) = 100
  11836. C(1) = 100 - 100(1-0.1) = 10
  11837. C(2) = max(0, 10 - 100(1-0.1)) = 0
  11838. C(3) = 0
  11839. C(4) = 0
  11840. C(5) = 0 + 100 - 100(1-0.1) = 10 = C(1).
  11841. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  11842.  
  11843. &lt;p&gt;It’s a loop. In reality there’d be some variation as Br won’t always be a
  11844. constant but notice that for 3/5 of the time the system is in a state where it
  11845. can’t bid. That’s not good: the whole purpose of the system is to make bids.&lt;/p&gt;
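
&lt;p&gt;The loop can be replayed with a few lines of Python. This sketch is a simplification under stated assumptions: it works with expected values per second (every second Br bids go out and a fraction Wr of them come back as wins), it clamps the pool at L once it runs dry, and it applies the periodic bump at the start of every X-th second. The function name expected_credits and its parameter names are hypothetical.&lt;/p&gt;

```python
def expected_credits(seconds, credits=100.0, cap=100.0, bid_rate=100.0,
                     win_rate=0.1, bid_cost=1.0, win_reward=1.0,
                     floor=0.0, bump_every=5, bump=100.0):
    """Return the expected pool level at the end of each second, starting at t=0."""
    levels = [credits]
    for t in range(1, seconds + 1):
        # Periodic bump: every X seconds add Bmp credits, never exceeding Cm.
        if t % bump_every == 0:
            credits = min(cap, credits + bump)
        # Bidding only happens while the pool is above the no-bid threshold L.
        if credits > floor:
            # Net drain per second: Br bids cost B each, Br*Wr wins return W each.
            drain = bid_rate * (bid_cost - win_rate * win_reward)
            credits = max(floor, credits - drain)
        levels.append(credits)
    return levels


if __name__ == "__main__":
    print(expected_credits(10))
```

&lt;p&gt;With the default parameters, which mirror the example values, the pool falls from 100 to 10 after the first second, is empty by the end of the second, and only comes back to 10 when the bump arrives at the fifth second. From then on, three out of every five seconds start with an empty pool and no bump, so no bids go out at all during them.&lt;/p&gt;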
  11846.  
  11847. &lt;p&gt;Charles Perrow’s analysis in
  11848. &lt;a href=&quot;http://www.huffingtonpost.com/brian-troutwine/peculiar-books-reviewed-c_b_5244093.html&quot;&gt;Normal Accidents&lt;/a&gt;
  11849. argued that by attempting to resolve failure in one subsystem with the addition
  11850. of another you introduce new failures elsewhere. Well, check it out, the bidding
  11851. credit system can be configured to cause a &lt;em&gt;worse&lt;/em&gt; problem than the one we’re
  11852. attempting to fix.&lt;/p&gt;
  11853.  
  11854. &lt;p&gt;Clearly, the trick here is finding the right parameters for the system. By team
  11855. consensus we wanted something that would be slow to shut off bidding, something
  11856. with a gradual wind-down, and quick to start bidding again, something with “get
  11857. up and go”. This bidding credit system is dangerous enough that it can’t be
  11858. released into the world behind a feature gate and fiddled with but we &lt;em&gt;also&lt;/em&gt;
  11859. must have operational experience with it.&lt;/p&gt;
  11860.  
  11861. &lt;p&gt;What to do?&lt;/p&gt;
  11862.  
  11863. &lt;p&gt;We modeled it! What we had on our hands was a simple-to-understand system and a
  11864. search space for ideal configuration parameters. A small bit of Python:&lt;/p&gt;
  11865.  
  11866. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;argparse&lt;/span&gt;
  11867. &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;math&lt;/span&gt;
  11868. &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;random&lt;/span&gt;
  11869. &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;sys&lt;/span&gt;
  11870.  
  11871. &lt;span class=&quot;n&quot;&gt;ONE_DAY_MS&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;86400&lt;/span&gt;
  11872.  
  11873. &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;tri&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  11874.    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;random&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;triangular&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;max&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mode&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  11875.                             &lt;span class=&quot;nb&quot;&gt;min&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mode&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  11876.                             &lt;span class=&quot;n&quot;&gt;mode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  11877.  
  11878. &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;log&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  11879.    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;not&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  11880.        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
  11881.    &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  11882.        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;math&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;log&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  11883.  
  11884. &lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;World&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  11885.    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;__init__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  11886.                 &lt;span class=&quot;n&quot;&gt;seed&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  11887.                 &lt;span class=&quot;n&quot;&gt;initial_credits&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  11888.                 &lt;span class=&quot;n&quot;&gt;win_kill_point&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;43200&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  11889.                 &lt;span class=&quot;n&quot;&gt;win_recover_point&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;53200&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  11890.                 &lt;span class=&quot;n&quot;&gt;credit_cap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  11891.                 &lt;span class=&quot;n&quot;&gt;time_step&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  11892.                 &lt;span class=&quot;n&quot;&gt;time_end_lower_limit&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;86400&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  11893.                 &lt;span class=&quot;n&quot;&gt;bid_request_rate_per_second&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  11894.                 &lt;span class=&quot;n&quot;&gt;win_rate&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  11895.                 &lt;span class=&quot;n&quot;&gt;bid_rate&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  11896.                 &lt;span class=&quot;n&quot;&gt;decriment_per_bid&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  11897.                 &lt;span class=&quot;n&quot;&gt;incriment_per_win&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  11898.                 &lt;span class=&quot;n&quot;&gt;periodic_credit_bump&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  11899.                 &lt;span class=&quot;n&quot;&gt;periodic_credit_bump_amount&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;
  11900.    &lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  11901.        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time_step&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time_step&lt;/span&gt;
  11902.        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time_end_lower_limit&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time_end_lower_limit&lt;/span&gt;
  11903.        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bid_request_rate_per_second&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bid_request_rate_per_second&lt;/span&gt;
  11904.        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;win_rate&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;win_rate&lt;/span&gt;
  11905.        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bid_rate&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bid_rate&lt;/span&gt;
  11906.        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;decriment_per_bid&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;decriment_per_bid&lt;/span&gt;
  11907.        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;incriment_per_win&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;incriment_per_win&lt;/span&gt;
  11908.        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;periodic_credit_bump&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;periodic_credit_bump&lt;/span&gt;
  11909.        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;periodic_credit_bump_amount&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;periodic_credit_bump_amount&lt;/span&gt;
  11910.        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;credit_cap&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;credit_cap&lt;/span&gt;
  11911.        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;win_kill_point&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;win_kill_point&lt;/span&gt;
  11912.        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;win_recover_point&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;win_recover_point&lt;/span&gt;
  11913.  
  11914.        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current_time&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
  11915.        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;outbound_bids&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
  11916.        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;credits&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;initial_credits&lt;/span&gt;
  11917.  
  11918.        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;total_bids&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
  11919.        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;total_wins&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
  11920.  
  11921.        &lt;span class=&quot;n&quot;&gt;random&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;seed&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;seed&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  11922.        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;
  11923.  
  11924.    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;step&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  11925.        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current_time&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time_end_lower_limit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  11926.            &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;
  11927.  
  11928.        &lt;span class=&quot;n&quot;&gt;instantaneous_max_bids&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bid_request_rate_per_second&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time_step&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  11929.        &lt;span class=&quot;n&quot;&gt;time_of_day_adjustment&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;math&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current_time&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ONE_DAY_MS&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;2.0&lt;/span&gt;
  11930.        &lt;span class=&quot;n&quot;&gt;instantaneous_bids&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;instantaneous_max_bids&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time_of_day_adjustment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  11931.  
  11932.        &lt;span class=&quot;c1&quot;&gt;# Determine how many bids we can make given our credit pool and the
  11933. &lt;/span&gt;        &lt;span class=&quot;c1&quot;&gt;# bids coming into the system. Can&apos;t bid more than we get traffic!
  11934. &lt;/span&gt;        &lt;span class=&quot;n&quot;&gt;free_bids&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;math&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ceil&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;credits&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;decriment_per_bid&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  11935.        &lt;span class=&quot;n&quot;&gt;bids_submitted&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;math&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ceil&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;min&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;instantaneous_bids&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;free_bids&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tri&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bid_rate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  11936.  
  11937.        &lt;span class=&quot;c1&quot;&gt;# Based on the outbound bids, determine how many of those signal back as
  11938. &lt;/span&gt;        &lt;span class=&quot;c1&quot;&gt;# wins. (We fake the delay to be one tick. Not realistic but I don&apos;t
  11939. &lt;/span&gt;        &lt;span class=&quot;c1&quot;&gt;# think it matters.)
  11940. &lt;/span&gt;        &lt;span class=&quot;c1&quot;&gt;#
  11941. &lt;/span&gt;        &lt;span class=&quot;c1&quot;&gt;# Between the kill point and the recover point, we have no wins coming into the system.
  11942. &lt;/span&gt;        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;win_kill_point&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current_time&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;win_recover_point&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  11943.            &lt;span class=&quot;n&quot;&gt;dead_air&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;
  11944.            &lt;span class=&quot;n&quot;&gt;instantaneous_wins&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
  11945.        &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  11946.            &lt;span class=&quot;n&quot;&gt;dead_air&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;False&lt;/span&gt;
  11947.            &lt;span class=&quot;n&quot;&gt;instantaneous_wins&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;math&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ceil&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;outbound_bids&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tri&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;win_rate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  11948.        &lt;span class=&quot;c1&quot;&gt;# Queue up our current bids to be outbound for the next tick.
  11949. &lt;/span&gt;        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;outbound_bids&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bids_submitted&lt;/span&gt;
  11950.  
  11951.        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;total_wins&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;instantaneous_wins&lt;/span&gt;
  11952.        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;total_bids&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bids_submitted&lt;/span&gt;
  11953.  
  11954.        &lt;span class=&quot;c1&quot;&gt;# Credit adjustments
  11955. &lt;/span&gt;        &lt;span class=&quot;c1&quot;&gt;#
  11956. &lt;/span&gt;        &lt;span class=&quot;c1&quot;&gt;# Remove credits per bid. I don&apos;t _think_ order of this matters in the
  11957. &lt;/span&gt;        &lt;span class=&quot;c1&quot;&gt;# context of simulation, whether we increment or decrement first.
  11958. &lt;/span&gt;        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;credits&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bids_submitted&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;decriment_per_bid&lt;/span&gt;
  11959.        &lt;span class=&quot;c1&quot;&gt;# Account for wins. We done good and deserve some credit.
  11960. &lt;/span&gt;        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;credits&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;instantaneous_wins&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;incriment_per_win&lt;/span&gt;
  11961.        &lt;span class=&quot;c1&quot;&gt;# Every fixed period we get an automatic credit bump.
  11962. &lt;/span&gt;        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current_time&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;periodic_credit_bump&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  11963.            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;credits&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;periodic_credit_bump_amount&lt;/span&gt;
  11964.  
  11965.        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;credits&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;credit_cap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  11966.            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;credits&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;credit_cap&lt;/span&gt;
  11967.  
  11968.        &lt;span class=&quot;k&quot;&gt;assert&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;credits&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;credit_cap&lt;/span&gt;
  11969.  
  11970.        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current_time&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time_step&lt;/span&gt;
  11971.  
  11972.        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current_time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;credits&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;free_bids&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bids_submitted&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;outbound_bids&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;instantaneous_wins&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;instantaneous_bids&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time_of_day_adjustment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;total_bids&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;total_wins&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dead_air&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  11973.  
  11974. &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;__name__&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;__main__&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  11975.    &lt;span class=&quot;n&quot;&gt;parser&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;argparse&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ArgumentParser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;description&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Simulate the &quot;bidding credits&quot; control system&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  11976.    &lt;span class=&quot;n&quot;&gt;parser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_argument&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;--time_step&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  11977.                        &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;integer, unit in seconds: time simulation steps forward per tick&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  11978.    &lt;span class=&quot;n&quot;&gt;parser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_argument&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;--time_end_lower_limit&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;86400&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  11979.                        &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;integer, unit in seconds: time simulation will not cease before; default 1 day&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  11980.    &lt;span class=&quot;n&quot;&gt;parser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_argument&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;--win_kill_point&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;86400&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;//&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  11981.                        &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;integer, unit in seconds: point in time when all win notifications will fail; default 1/2 day&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  11982.    &lt;span class=&quot;n&quot;&gt;parser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_argument&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;--win_recover_point&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;86400&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;//&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  11983.                        &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;integer, unit in seconds: point in time when all win notifications will recover; default 1/2 day + 10 seconds&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  11984.    &lt;span class=&quot;n&quot;&gt;parser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_argument&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;--seed&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dest&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;seed&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  11985.                        &lt;span class=&quot;n&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;I went to the woods because I wished to live deliberately, to front only the
  11986.                                   essential facts of life, and see if I could
  11987.                                   not learn what it had to teach, and not, when
  11988.                                   I came to die, discover that I had not lived.
  11989.                                   I did not wish to live what was not life,
  11990.                                   living is so dear; nor did I wish to practice
  11991.                                   resignation, unless it was quite
  11992.                                   necessary.&quot;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  11993.                        &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;hashable: seed for the simulation random number generator&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  11994.    &lt;span class=&quot;n&quot;&gt;parser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_argument&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;--initial_credits&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  11995.                        &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;integer: total number of credits available to the system on startup&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  11996.    &lt;span class=&quot;n&quot;&gt;parser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_argument&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;--credit_cap&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  11997.                        &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;integer: maximum number of credits the system can accumulate&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  11998.    &lt;span class=&quot;n&quot;&gt;parser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_argument&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;--bid_request_rate_per_second&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  11999.                        &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;integer, units in bid/sec: total number of requests per second in the simulation&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  12000.    &lt;span class=&quot;n&quot;&gt;parser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_argument&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;--win_rate&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;float&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12001.                        &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;float, [0,1.0]: The percentage of time a bid wins.&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  12002.    &lt;span class=&quot;n&quot;&gt;parser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_argument&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;--bid_rate&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;float&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12003.                        &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;float, [0,1.0]: The percentage of time we make a bid.&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  12004.    &lt;span class=&quot;n&quot;&gt;parser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_argument&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;--decriment_per_bid&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12005.                        &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;integer: total credits a single bid consumes&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  12006.    &lt;span class=&quot;n&quot;&gt;parser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_argument&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;--incriment_per_win&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12007.                        &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;integer: total credits a single win provides&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  12008.    &lt;span class=&quot;n&quot;&gt;parser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_argument&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;--periodic_credit_bump&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12009.                        &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;integer, units in seconds: periodicity of automatic credit replenishment&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  12010.    &lt;span class=&quot;n&quot;&gt;parser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_argument&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;--periodic_credit_bump_amount&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12011.                        &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;integer: amount of credits per replenishment cycle&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  12012.  
  12013.    &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;parser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;parse_args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  12014.  
  12015.    &lt;span class=&quot;n&quot;&gt;w&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;World&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;seed&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12016.              &lt;span class=&quot;n&quot;&gt;initial_credits&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;initial_credits&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12017.              &lt;span class=&quot;n&quot;&gt;win_kill_point&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;win_kill_point&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12018.              &lt;span class=&quot;n&quot;&gt;win_recover_point&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;win_recover_point&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12019.              &lt;span class=&quot;n&quot;&gt;credit_cap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;credit_cap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12020.              &lt;span class=&quot;n&quot;&gt;time_step&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time_step&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12021.              &lt;span class=&quot;n&quot;&gt;time_end_lower_limit&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time_end_lower_limit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12022.              &lt;span class=&quot;n&quot;&gt;bid_request_rate_per_second&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bid_request_rate_per_second&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12023.              &lt;span class=&quot;n&quot;&gt;win_rate&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;win_rate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12024.              &lt;span class=&quot;n&quot;&gt;bid_rate&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bid_rate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12025.              &lt;span class=&quot;n&quot;&gt;decriment_per_bid&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;decriment_per_bid&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12026.              &lt;span class=&quot;n&quot;&gt;incriment_per_win&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;incriment_per_win&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12027.              &lt;span class=&quot;n&quot;&gt;periodic_credit_bump&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;periodic_credit_bump&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12028.              &lt;span class=&quot;n&quot;&gt;periodic_credit_bump_amount&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;periodic_credit_bump_amount&lt;/span&gt;
  12029.          &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  12030.  
  12031.    &lt;span class=&quot;n&quot;&gt;sep&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\t&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;&lt;/span&gt;
  12032.    &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sep&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;# &apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;current_time&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;credits&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;free_bids&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;bids_submitted&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;outbound_bids&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;instantaneous_wins&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;instantaneous_bids&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;time_of_day_adjustment&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;total_bids&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;total_wins&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;dead_air&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
  12033.    &lt;span class=&quot;n&quot;&gt;step&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;step&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  12034.    &lt;span class=&quot;k&quot;&gt;while&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;step&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  12035.        &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sep&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;step&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
  12036.        &lt;span class=&quot;n&quot;&gt;step&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;step&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  12037.  
  12038. &lt;p&gt;and an even smaller bit of gnuplot:&lt;/p&gt;
  12039.  
  12040. &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;#!/usr/bin/env gnuplot
  12041. #
  12042. # Plotting data about the credit model (see credit_model.dat)
  12043.  
  12044. reset
  12045.  
  12046. # png
  12047. set terminal png size 2560,1440 enhanced font &apos;Verdana,10&apos;
  12048. set output &apos;credit_model.png&apos;
  12049.  
  12050. set xlabel &apos;Time (seconds)&apos;
  12051. set tics scale 0.75
  12052.  
  12053. plot &apos;credit_model.dat&apos; using 1:2 smooth acsplines with lines title &apos;Credits Available&apos;, \
  12054.     &apos;credit_model.dat&apos; using 1:4 smooth acsplines with lines title &apos;Bids Submitted&apos;, \
  12055.     &apos;credit_model.dat&apos; using 1:5 smooth acsplines with lines title &apos;Outbound Bids&apos;, \
  12056.     &apos;credit_model.dat&apos; using 1:6 smooth acsplines with lines title &apos;Instantaneous Wins&apos;
  12057. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  12058.  
  12059. &lt;p&gt;let the team understand how the system would behave in a no-risk manner. You’ll
  12060. notice that the Python code is simulating more than the brief sketch above:
  12061. waning traffic through the day, random fluctuation around rates. Here’s
  12062. a similar periodic crash scenario graphed:&lt;/p&gt;
  12063.  
  12064. &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;blt&amp;gt; python credit_model.py --time_end_lower_limit=1800
  12065. --bid_request_rate_per_second=1000 --initial_credits=1000 --time_step=1
  12066. --credit_cap=1000 --periodic_credit_bump=60
  12067. --periodic_credit_bump_amount=1000 --decriment_per_bid=1
  12068. --incriment_per_win=1 --bid_rate=1.0 --win_rate=0.1  &amp;gt; credit_model.dat &amp;amp;&amp;amp;
  12069. ./credit_model.gnu
  12070. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  12071.  
  12072. &lt;p&gt;&lt;img src=&quot;/images/post_images/bidding_credits/0000-boom_and_bust.png&quot; alt=&quot;Boom and Bust Graph&quot; /&gt;&lt;/p&gt;
  12073.  
  12074. &lt;p&gt;Crummy, right?&lt;/p&gt;
  12075.  
  12076. &lt;p&gt;Here’s another scenario with W = 10, effectively canceling out all the losing
  12077. bids to get to a win.&lt;/p&gt;
  12078.  
  12079. &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;blt&amp;gt; python credit_model.py --time_end_lower_limit=1800
  12080. --bid_request_rate_per_second=1000 --initial_credits=1000 --time_step=1
  12081. --credit_cap=1000 --periodic_credit_bump=60
  12082. --periodic_credit_bump_amount=1000 --decriment_per_bid=1
  12083. --incriment_per_win=10 --bid_rate=1.0 --win_rate=0.1  &amp;gt; credit_model.dat &amp;amp;&amp;amp;
  12084. ./credit_model.gnu
  12085. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  12086.  
  12087. &lt;p&gt;&lt;img src=&quot;/images/post_images/bidding_credits/0001-stable.png&quot; alt=&quot;Stable Graph&quot; /&gt;&lt;/p&gt;
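
&lt;p&gt;A back-of-envelope check makes the contrast between the two runs above concrete. The sketch below is not part of credit_model.py. It assumes, as the flag names suggest, that every submitted bid costs decriment_per_bid credits and every win pays back incriment_per_win credits, and it ignores the periodic bump, the credit cap, and random fluctuation. The function name is made up for illustration.&lt;/p&gt;

```python
def expected_credit_change_per_bid(win_rate, decriment_per_bid, incriment_per_win):
    """Average credits gained (positive) or lost (negative) per submitted bid.

    Assumption: each bid costs decriment_per_bid credits up front and a
    fraction win_rate of bids later returns incriment_per_win credits.
    """
    return win_rate * incriment_per_win - decriment_per_bid


# First scenario (W = 1): the pool drains on nearly every bid.
boom_and_bust = expected_credit_change_per_bid(0.1, 1, 1)

# Second scenario (W = 10): wins refund, on average, what the bids cost.
stable = expected_credit_change_per_bid(0.1, 1, 10)

print("W = 1 drift per bid:", boom_and_bust)
print("W = 10 drift per bid:", stable)
```

&lt;p&gt;Under these assumptions, a pool of 1000 credits facing 1000 bid requests per second with W = 1 would empty in a little over a second and then sit idle until the next 60-second bump, which is consistent with the boom-and-bust pattern. With W = 10 the average drift is zero, so the pool hovers instead of draining.&lt;/p&gt;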
  12088.  
  12089. &lt;p&gt;Configured well, the system is stable in normal operation. Good! What about if
  12090. there’s a win decoupling event? The careful reader will have noticed that the
  12091. Python script can simulate a decoupling. Here’s the stable system
  12092. with a decouple thrown in:&lt;/p&gt;
  12093.  
  12094. &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;blt&amp;gt; python credit_model.py --time_end_lower_limit=1800
  12095. --bid_request_rate_per_second=1000 --initial_credits=1000 --time_step=1
  12096. --credit_cap=1000 --periodic_credit_bump=60 --periodic_credit_bump_amount=1000
  12097. --decriment_per_bid=1 --incriment_per_win=10 --bid_rate=1.0 --win_rate=0.1
  12098. --win_kill_point=900 --win_recover_point=1200  &amp;gt; credit_model.dat &amp;amp;&amp;amp;
  12099. ./credit_model.gnu
  12100. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  12101.  
  12102. &lt;p&gt;&lt;img src=&quot;/images/post_images/bidding_credits/0002-crash_recover.png&quot; alt=&quot;Recover Graph&quot; /&gt;&lt;/p&gt;
  12103.  
  12104. &lt;p&gt;You’ll note it has the properties we desire: a clamping of bidding and a rapid
  12105. jump back to full capacity once wins start entering the system again. At no
  12106. point do bids permanently drop to zero (the periodic credit bump keeps us
  12107. issuing a moderate number of bid submissions), but the fall-off is drastic enough to
  12108. put the loss in the realm of pennies.&lt;/p&gt;
  12109.  
  12110. &lt;p&gt;And it turns out, it works well in practice, too:&lt;/p&gt;
  12111.  
  12112. &lt;p&gt;&lt;img src=&quot;/images/post_images/bidding_credits/0003-works_in_practice.jpg&quot; alt=&quot;Real Recover Graph&quot; /&gt;&lt;/p&gt;
  12113. </description>
  12114.    </item>
  12115.    
  12116.    
  12117.    
  12118.    <item>
  12119.      <title>Spark on EMR</title>
  12120.      <link>https://tech.nextroll.com/blog/spark/2016/01/25/spark-on-emr.html</link>
  12121.      <pubDate>Mon, 25 Jan 2016 00:00:00 -0800</pubDate>
  12122.      <author></author>
  12123.      <guid isPermaLink="false">https://tech.nextroll.com/blog/spark/2016/01/25/spark-on-emr</guid>
  12124.      <description>&lt;h4 id=&quot;high-level-problem&quot;&gt;High Level Problem&lt;/h4&gt;
  12125.  
  12126. &lt;p&gt;Our Dynamic Ads product is designed to show a user ads with specific, recommended products based on their browsing history. Each advertiser gives us a list of all of its products beforehand so that we can generate a “product catalog” for the advertiser. Sometimes, however, the product catalog that the advertiser has given us may not accurately reflect all of the products that exist on their website, and our recommendations become less accurate. Thus, we wanted to run a daily data processing job that would look at all products viewed on a given day and determine if we had a matching record in the advertiser’s product catalog. Then, we could aggregate that data to calculate a “match rate” for each advertiser, allowing us to easily target and troubleshoot troubled product catalogs.&lt;/p&gt;
  12127.  
  12128. &lt;p&gt;In order to accomplish this task, we needed to perform the following actions:&lt;/p&gt;
  12129.  
  12130. &lt;ul&gt;
  12131.  &lt;li&gt;Query &lt;a href=&quot;http://tech.adroll.com/blog/adroll/2015/05/04/aws-summit-keynote.html&quot;&gt;AdRoll’s raw logs&lt;/a&gt; via &lt;a href=&quot;https://prestodb.io/&quot;&gt;Presto&lt;/a&gt; for initial data set (8 million rows).&lt;/li&gt;
  12132.  &lt;li&gt;For each row in the data set, query DynamoDB.&lt;/li&gt;
  12133.  &lt;li&gt;Add each new row to a table in a PostgreSQL database (Amazon RDS).&lt;/li&gt;
  12134.  &lt;li&gt;Summarize (reduce) the data set to get an aggregated view.&lt;/li&gt;
  12135.  &lt;li&gt;Add each row to another aggregated table in the PostgreSQL database.&lt;/li&gt;
  12136. &lt;/ul&gt;
  12137.  
  12138. &lt;h4 id=&quot;why-pyspark&quot;&gt;Why PySpark?&lt;/h4&gt;
  12139.  
  12140. &lt;p&gt;Given that we were faced with a large-scale data processing task, it became clear that we would need to use a cluster computing solution. As a new engineer who is most familiar with Python, I was initially drawn to Spark (as opposed to Hadoop) because I would be able to use &lt;a href=&quot;https://spark.apache.org/docs/0.9.0/python-programming-guide.html&quot;&gt;PySpark&lt;/a&gt;, a Python wrapper around Spark.  While there are other Python libraries to perform mapreduce tasks (like &lt;a href=&quot;https://pythonhosted.org/mrjob/&quot;&gt;mrjob&lt;/a&gt;), PySpark was the most attractive of the Python options due to its intuitive, concise syntax and its easy maintainability. Moreover, the performance gains from using Spark were also a major contributing factor.&lt;/p&gt;
  12141.  
  12142. &lt;p&gt;While the PySpark code needed to build the necessary RDDs and write to the database ended up being rather trivial, I ran into numerous operational roadblocks around deploying and running PySpark on EMR.  Many of these issues stemmed from the fact that despite the new EMR 4.1.0 AMI supporting Spark and PySpark out of the box, it was difficult to find documentation on configuring Spark to optimize resource allocation and deploying batch applications. Among the various issues I encountered and ultimately solved were: installing Python dependencies, setting Python 2.7 as the default, configuring executor and driver memory, and running the spark-submit step.  In this post, I present a guide based on my experience that I hope serves to smooth the development process for future users.&lt;/p&gt;
  12143.  
  12144. &lt;h4 id=&quot;spark&quot;&gt;Spark&lt;/h4&gt;
  12145.  
  12146. &lt;ul&gt;
  12147.  &lt;li&gt;Transformations vs. Actions
  12148.    &lt;ul&gt;
  12149.      &lt;li&gt;Spark RDDs support two types of operations: &lt;a href=&quot;http://spark.apache.org/docs/latest/programming-guide.html&quot;&gt;transformations and actions&lt;/a&gt;. As the documentation clearly explains, transformations (like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;map&lt;/code&gt;) create a new dataset from an existing one, but they are lazy, meaning that they do not compute their results right away. In other words, transformations simply remember which transformations were applied to the base data. Actions (like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;foreachPartition&lt;/code&gt;), on the other hand, return a value to the driver after running a computation.  The computations associated with a transformation are not actually computed until an action is called.&lt;/li&gt;
  12150.      &lt;li&gt;An understanding of this difference is hugely important in actually leveraging Spark’s performance gains. For example, prior to understanding this crucial difference, I was actually making the 8 million DynamoDB calls twice, instead of just once.  With this understanding, I was able to decrease my job runtime by more than 50%.&lt;/li&gt;
  12151.      &lt;li&gt;Below, I have shown the difference between the code before and after this realization. Depending on the size of your data set and the memory available, you can either call &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.persist()&lt;/code&gt; on the RDD that you will reuse (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rdd1&lt;/code&gt; in our case) or write the data to disk. Due to the size of our data set and the fact that the intermediary RDD was precisely what I wanted to store in the database, I chose to write &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rdd1&lt;/code&gt; to the database, and then query the database to initiate a new RDD.&lt;/li&gt;
  12152.    &lt;/ul&gt;
  12153.  &lt;/li&gt;
  12154. &lt;/ul&gt;
  12155.  
  12156. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;    &lt;span class=&quot;c1&quot;&gt;##### INEFFICIENT CODE #####
  12157. &lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;base_rdd&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;parallelize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;presto_data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  12158.  
  12159.    &lt;span class=&quot;c1&quot;&gt;# We do not actually query DynamoDB until foreachPartition is called,
  12160. &lt;/span&gt;    &lt;span class=&quot;c1&quot;&gt;# as map is a transformation and foreachPartition is an action.
  12161. &lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;rdd1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;base_rdd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;query_dynamo_db&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  12162.    &lt;span class=&quot;n&quot;&gt;rdd1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;foreachPartition&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_to_db&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  12163.  
  12164.    &lt;span class=&quot;c1&quot;&gt;# At this point, because query_dynamo_db was within a map function
  12165. &lt;/span&gt;    &lt;span class=&quot;c1&quot;&gt;# (a transformation, which is remembered by rdd1), we actually query
  12166. &lt;/span&gt;    &lt;span class=&quot;c1&quot;&gt;# DynamoDB all over again when foreachPartition is called.
  12167. &lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;rdd2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rdd1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;lambda&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])).&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reduceByKey&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;lambda&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  12168.    &lt;span class=&quot;n&quot;&gt;rdd2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;foreachPartition&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_to_db&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  12169.    &lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  12170.  
  12171. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;    &lt;span class=&quot;c1&quot;&gt;##### IMPROVED CODE #####
  12172. &lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;base_rdd&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;parallelize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;presto_data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  12173.    &lt;span class=&quot;n&quot;&gt;rdd1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;base_rdd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;query_dynamo_db&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  12174.    &lt;span class=&quot;n&quot;&gt;rdd1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;foreachPartition&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_to_db&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  12175.  
  12176.    &lt;span class=&quot;c1&quot;&gt;# Here we build the second RDD based on the data which was added
  12177. &lt;/span&gt;    &lt;span class=&quot;c1&quot;&gt;# to the database in the first foreachPartition action, instead of
  12178. &lt;/span&gt;    &lt;span class=&quot;c1&quot;&gt;# based upon rdd1. Thus, we do not repeat the DynamoDB calls.
  12179. &lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;db_rows&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;get_rows_from_db&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  12180.    &lt;span class=&quot;n&quot;&gt;new_rdd&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;parallelize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;db_rows&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  12181.    &lt;span class=&quot;n&quot;&gt;rdd2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_rdd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;lambda&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])).&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reduceByKey&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;lambda&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  12182.    &lt;span class=&quot;n&quot;&gt;rdd2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;foreachPartition&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_to_db&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  12183.    &lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
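
&lt;p&gt;The recomputation problem, and the effect of the .persist() alternative mentioned above, can be illustrated without a cluster. The following is a toy, pure-Python analogy and not Spark’s API: LazyDataset, expensive_lookup, and count_lookups are hypothetical names invented for this sketch, with expensive_lookup standing in for the DynamoDB query.&lt;/p&gt;

```python
class LazyDataset:
    """Toy stand-in for an RDD: map() is lazy, foreach() triggers the work."""

    def __init__(self, data=None, parent=None, fn=None):
        self._data = data
        self._parent = parent
        self._fn = fn
        self._persisted = False
        self._cache = None

    def map(self, fn):
        # A transformation only records the function; nothing runs yet.
        return LazyDataset(parent=self, fn=fn)

    def persist(self):
        # Ask for the computed rows to be kept after the first action.
        self._persisted = True
        return self

    def _compute(self):
        if self._cache is not None:
            return self._cache
        if self._parent is None:
            rows = list(self._data)
        else:
            rows = [self._fn(row) for row in self._parent._compute()]
        if self._persisted:
            self._cache = rows
        return rows

    def foreach(self, fn):
        # An action: walks the recorded lineage and runs every transformation.
        for row in self._compute():
            fn(row)


def count_lookups(persist):
    """Run two actions on the same mapped dataset and count the lookups."""
    calls = {"n": 0}

    def expensive_lookup(row):
        calls["n"] += 1
        return (row % 2, 1)

    rdd1 = LazyDataset(data=range(5)).map(expensive_lookup)
    if persist:
        rdd1.persist()
    rdd1.foreach(lambda row: None)                   # first action
    rdd1.map(lambda kv: kv).foreach(lambda row: None)  # second action
    return calls["n"]


print("lookups without persist:", count_lookups(False))
print("lookups with persist:", count_lookups(True))
```

&lt;p&gt;In real PySpark the corresponding change would be calling .persist() on rdd1 before the first action, provided the intermediate data fits in the storage available to the executors. Otherwise, writing it out and reading it back, as done above, avoids the repeated lookups.&lt;/p&gt;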
  12184.  
  12185. &lt;ul&gt;
  12186.  &lt;li&gt;Partitions
  12187.    &lt;ul&gt;
  12188.      &lt;li&gt;An important parameter when considering how to parallelize your data is the number of partitions to use to split up your dataset.&lt;/li&gt;
  12189.      &lt;li&gt;Spark runs one task per partition, so it’s important to partition your data appropriately.&lt;/li&gt;
  12190.      &lt;li&gt;Spark’s documentation recommends 2-4 partitions for each CPU in the cluster.&lt;/li&gt;
  12191.      &lt;li&gt;RDDs created from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;textFile&lt;/code&gt; or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;hadoopFile&lt;/code&gt; will use the mapreduce API to determine the number of partitions and thus will have a reasonable predetermined number of partitions.&lt;/li&gt;
  12192.      &lt;li&gt;On the other hand, an RDD built by parallelizing the data set returned from my Presto query gets a number of partitions determined by the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;spark.default.parallelism&lt;/code&gt; setting. By default, this setting sets the number of partitions to the number of executors (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;spark.executor.instances&lt;/code&gt;, 20 in our case) or 2, whichever is greater. Thus, to make better use of the cluster, we manually set the number of partitions to 80 (20 executors * 4 partitions each).&lt;/li&gt;
  12193.    &lt;/ul&gt;
  12194.  &lt;/li&gt;
  12195. &lt;/ul&gt;
  12196.  
  12197. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;  &lt;span class=&quot;n&quot;&gt;base_rdd&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;parallelize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;presto_data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;80&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  12198.  &lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  12199.  
  12200. &lt;ul&gt;
  12201.  &lt;li&gt;Importing from Self-created Modules
  12202.    &lt;ul&gt;
  12203.      &lt;li&gt;Unlike in plain Python, where one can import from any other Python file within a given repository, in PySpark, self-created libraries need to be zipped up and then added to the global SparkContext object.&lt;/li&gt;
  12204.      &lt;li&gt;First the SparkContext object must be created, and then the zip file is added to the object via the &lt;a href=&quot;https://spark.apache.org/docs/0.7.2/api/pyspark/pyspark.context.SparkContext-class.html#addPyFile&quot;&gt;addPyFile&lt;/a&gt; method.&lt;/li&gt;
  12205.    &lt;/ul&gt;
  12206.  &lt;/li&gt;
  12207. &lt;/ul&gt;
  12208.  
  12209. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;  &lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;pyspark&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SparkContext&lt;/span&gt;
  12210.  &lt;span class=&quot;n&quot;&gt;sc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SparkContext&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  12211.  &lt;span class=&quot;n&quot;&gt;sc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;addPyFile&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;/home/hadoop/spark_app.zip&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  12212.  &lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
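
&lt;p&gt;Building the zip file itself can be done with Python’s standard zipfile module. This is a minimal sketch, assuming the code to ship lives in a single package directory. The function name zip_package and the directory layout are hypothetical; only the idea of passing a .zip path to addPyFile comes from the example above.&lt;/p&gt;

```python
import os
import zipfile


def zip_package(package_dir, zip_path):
    """Zip the .py files under package_dir, keeping the package folder name
    as the top-level entry inside the archive."""
    package_dir = os.path.abspath(package_dir)
    parent_dir = os.path.dirname(package_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(package_dir):
            for name in sorted(files):
                if name.endswith(".py"):
                    full_path = os.path.join(root, name)
                    # Store paths such as "my_package/module.py".
                    archive.write(full_path, os.path.relpath(full_path, parent_dir))
    return zip_path


# Hypothetical usage on the driver, before calling sc.addPyFile:
# zip_package("spark_app", "/home/hadoop/spark_app.zip")
```

&lt;p&gt;Keeping the package folder at the root of the archive matters because Python resolves imports from a zip file relative to the archive root.&lt;/p&gt;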
  12213.  
  12214. &lt;ul&gt;
  12215.  &lt;li&gt;Using SQLAlchemy
  12216.    &lt;ul&gt;
  12217.      &lt;li&gt;Because of the distributed nature of Spark, a SQLAlchemy &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Session&lt;/code&gt; object cannot be shared across multiple executors or partitions.&lt;/li&gt;
  12218.      &lt;li&gt;For this reason, a new &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Session&lt;/code&gt; object must be created per partition, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sqlalchemy.create_engine&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sqlalchemy.orm.sessionmaker&lt;/code&gt; must be imported within each partition. All of this is done within a function called by &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;foreachPartition&lt;/code&gt;.&lt;/li&gt;
  12219.    &lt;/ul&gt;
  12220.  &lt;/li&gt;
  12221. &lt;/ul&gt;
  12222.  
  12223. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;add_to_db&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;iterator&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  12224.    &lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;sqlalchemy&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;create_engine&lt;/span&gt;
  12225.    &lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;sqlalchemy.orm&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sessionmaker&lt;/span&gt;
  12226.    &lt;span class=&quot;n&quot;&gt;engine&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;create_engine&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DATABASE_URL&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  12227.    &lt;span class=&quot;n&quot;&gt;Session&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sessionmaker&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bind&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;engine&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  12228.    &lt;span class=&quot;n&quot;&gt;session&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Session&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  12229.    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;iterator&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  12230.        &lt;span class=&quot;n&quot;&gt;session&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  12231.    &lt;span class=&quot;n&quot;&gt;session&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;commit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  12232.  
  12233.  &lt;span class=&quot;n&quot;&gt;my_rdd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;foreachPartition&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_to_db&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  12234.  &lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  12235.  
  12236. &lt;h4 id=&quot;emr&quot;&gt;EMR&lt;/h4&gt;
  12237. &lt;ul&gt;
  12238.  &lt;li&gt;Packaging &amp;amp; Deploying Code
  12239.    &lt;ul&gt;
  12240.      &lt;li&gt;The PySpark code was organized as follows:
  12241.        &lt;ul&gt;
  12242.          &lt;li&gt;Use Presto to query raw logs on S3 each day for the count (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;view_count&lt;/code&gt;) of product views per product_id.&lt;/li&gt;
  12243.          &lt;li&gt;Convert the results of this query into an RDD.&lt;/li&gt;
  12244.          &lt;li&gt;Using Spark, map over this RDD to query DynamoDB to see if each row in the RDD has a matching record in DynamoDB. The result of this mapping function is a new RDD (named “Product RDD” in the diagram below), which now has a boolean field &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;matched&lt;/code&gt; to indicate whether or not a matching record was found.&lt;/li&gt;
  12245.          &lt;li&gt;Write the contents of this new RDD to the Postgres database, an AWS RDS instance.&lt;/li&gt;
  12246.          &lt;li&gt;Reduce this data to the advertiser level, giving a summary of the number of matched views vs. total views by advertiser by day.&lt;/li&gt;
  12247.        &lt;/ul&gt;
  12248.      &lt;/li&gt;
  12249.    &lt;/ul&gt;
  12250.  
  12251.    &lt;p&gt;&lt;img src=&quot;/images/post_images/pyspark-script-overview-image.png&quot; alt=&quot;PySpark Script Overview&quot; title=&quot;PySpark Script Overview&quot; /&gt;&lt;/p&gt;
  12252.  
  12253.    &lt;ul&gt;
  12254.      &lt;li&gt;As discussed above, in order to access self-created Python modules from within my main PySpark script, the modules had to be zipped up. My solution to this was to create a main module called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;spark_app&lt;/code&gt;, within which I had sub-modules &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;model&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;utils&lt;/code&gt;. I then created a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;deploy.py&lt;/code&gt; script that serves to zip up the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;spark_app&lt;/code&gt; directory and push the zipped file to a bucket on S3.&lt;/li&gt;
  12255.      &lt;li&gt;Similarly, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;deploy.py&lt;/code&gt; script also pushes &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;process_data.py&lt;/code&gt; (the main PySpark script) and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;setup.sh&lt;/code&gt; (the bootstrap action script) to the same bucket on S3.&lt;/li&gt;
  12256.      &lt;li&gt;Bootstrap actions run before your steps, so they are the place to set up your cluster. For our purposes, that involved installing the necessary Python dependencies and copying &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;spark_app.zip&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;process_data.py&lt;/code&gt; to the right directory.&lt;/li&gt;
  12257.      &lt;li&gt;Below is our &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;setup.sh&lt;/code&gt; script:&lt;/li&gt;
  12258.    &lt;/ul&gt;
  12259.  &lt;/li&gt;
  12260. &lt;/ul&gt;
  12261.  
  12262. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;  &lt;span class=&quot;c&quot;&gt;#!/bin/bash&lt;/span&gt;
  12263.  
  12264.  &lt;span class=&quot;c&quot;&gt;# Install our dependencies - replace these libraries with your needs!&lt;/span&gt;
  12265.  &lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;yum &lt;span class=&quot;nb&quot;&gt;install&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-y&lt;/span&gt; gcc python-setuptools python-devel postgresql-devel
  12266.  &lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;python2.7 &lt;span class=&quot;nt&quot;&gt;-m&lt;/span&gt; pip &lt;span class=&quot;nb&quot;&gt;install &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;SQLAlchemy&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;1.0.8
  12267.  &lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;python2.7 &lt;span class=&quot;nt&quot;&gt;-m&lt;/span&gt; pip &lt;span class=&quot;nb&quot;&gt;install &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;boto&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;2.38.0
  12268.  &lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;python2.7 &lt;span class=&quot;nt&quot;&gt;-m&lt;/span&gt; pip &lt;span class=&quot;nb&quot;&gt;install &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;funcsigs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;0.4
  12269.  &lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;python2.7 &lt;span class=&quot;nt&quot;&gt;-m&lt;/span&gt; pip &lt;span class=&quot;nb&quot;&gt;install &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;pbr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;1.8.1
  12270.  &lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;python2.7 &lt;span class=&quot;nt&quot;&gt;-m&lt;/span&gt; pip &lt;span class=&quot;nb&quot;&gt;install &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;psycopg2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;2.6.1
  12271.  &lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;python2.7 &lt;span class=&quot;nt&quot;&gt;-m&lt;/span&gt; pip &lt;span class=&quot;nb&quot;&gt;install &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;six&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;1.10.0
  12272.  &lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;python2.7 &lt;span class=&quot;nt&quot;&gt;-m&lt;/span&gt; pip &lt;span class=&quot;nb&quot;&gt;install &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;wsgiref&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;0.1.2
  12273.  &lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;python2.7 &lt;span class=&quot;nt&quot;&gt;-m&lt;/span&gt; pip &lt;span class=&quot;nb&quot;&gt;install &lt;/span&gt;requests[security]
  12274.  
  12275.  &lt;span class=&quot;c&quot;&gt;# Download code from S3 and set up cluster&lt;/span&gt;
  12276.  aws s3 &lt;span class=&quot;nb&quot;&gt;cp &lt;/span&gt;s3://your-bucket/spark_app.zip /home/hadoop/spark_app.zip
  12277.  aws s3 &lt;span class=&quot;nb&quot;&gt;cp &lt;/span&gt;s3://your-bucket/process_data.py /home/hadoop/process_data.py
  12278.  &lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
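&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;deploy.py&lt;/code&gt; script itself is not shown in this post. As a rough sketch of its zipping step only, the following assumes that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;model&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;utils&lt;/code&gt; are single files inside &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;spark_app&lt;/code&gt;; the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;zip_package&lt;/code&gt; helper is hypothetical, and the upload to S3 is left out.&lt;/p&gt;

```python
import os
import shutil
import tempfile
import zipfile


def zip_package(package_dir, output_dir):
    """Zip package_dir so that the package folder is the top-level entry (hypothetical helper)."""
    package_dir = os.path.abspath(package_dir)
    parent = os.path.dirname(package_dir)
    name = os.path.basename(package_dir)
    base = os.path.join(output_dir, name)
    # root_dir/base_dir keep the paths inside the archive relative to the
    # parent folder, e.g. spark_app/utils.py, so "import spark_app" resolves
    # once the zip is on the Python path.
    return shutil.make_archive(base, "zip", root_dir=parent, base_dir=name)


# Build a throwaway spark_app package to demonstrate the layout.
work_dir = tempfile.mkdtemp()
pkg_dir = os.path.join(work_dir, "spark_app")
os.makedirs(pkg_dir)
for filename in ("__init__.py", "model.py", "utils.py"):
    with open(os.path.join(pkg_dir, filename), "w") as f:
        f.write("")

out_dir = tempfile.mkdtemp()
zip_path = zip_package(pkg_dir, out_dir)
with zipfile.ZipFile(zip_path) as zf:
    names = sorted(n for n in zf.namelist() if n.endswith(".py"))
print(names)
```

&lt;p&gt;Keeping &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;spark_app/&lt;/code&gt; as the top-level entry of the archive is what lets the modules be imported by package name after the zip is added with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;addPyFile&lt;/code&gt;.&lt;/p&gt;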
  12279.  
  12280. &lt;ul&gt;
  12281.  &lt;li&gt;Running the Job
  12282.    &lt;ul&gt;
  12283.      &lt;li&gt;Ultimately, I chose to run this Spark application using a scheduled Lambda function on AWS (discussed below), but for the purposes of this blog post, I have included both the Lambda function which utilizes &lt;a href=&quot;http://boto3.readthedocs.org/en/latest/reference/services/emr.html#EMR.Client.run_job_flow&quot;&gt;boto3&lt;/a&gt; and the equivalent cluster launch using the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;aws cli&lt;/code&gt;.&lt;/li&gt;
  12284.      &lt;li&gt;AWS Lambda Function:&lt;/li&gt;
  12285.    &lt;/ul&gt;
  12286.  &lt;/li&gt;
  12287. &lt;/ul&gt;
  12288.  
  12289. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;  &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;boto3&lt;/span&gt;
  12290.  
  12291.  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;lambda_handler&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;json_input&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  12292.      &lt;span class=&quot;n&quot;&gt;client&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;boto3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;client&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;emr&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;region_name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;us-west-2&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  12293.  
  12294.      &lt;span class=&quot;n&quot;&gt;client&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;run_job_flow&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  12295.          &lt;span class=&quot;n&quot;&gt;Name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;YourApp&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12296.          &lt;span class=&quot;n&quot;&gt;ReleaseLabel&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;emr-4.1.0&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12297.          &lt;span class=&quot;n&quot;&gt;Instances&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  12298.              &lt;span class=&quot;s&quot;&gt;&apos;MasterInstanceType&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;m3.xlarge&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12299.              &lt;span class=&quot;s&quot;&gt;&apos;SlaveInstanceType&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;m3.xlarge&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12300.              &lt;span class=&quot;s&quot;&gt;&apos;InstanceCount&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;21&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12301.              &lt;span class=&quot;s&quot;&gt;&apos;Ec2KeyName&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;ops&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12302.              &lt;span class=&quot;s&quot;&gt;&apos;KeepJobFlowAliveWhenNoSteps&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;False&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12303.              &lt;span class=&quot;s&quot;&gt;&apos;TerminationProtected&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;False&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12304.              &lt;span class=&quot;s&quot;&gt;&apos;Ec2SubnetId&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;subnet-XXXX&apos;
  12305.          },
  12306.          Steps=[
  12307.              {
  12308.                  &apos;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Name&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;: &apos;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;YourStep&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;,
  12309.                  &apos;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ActionOnFailure&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;: &apos;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;TERMINATE_CLUSTER&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;,
  12310.                  &apos;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;HadoopJarStep&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;: {
  12311.                      &apos;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Jar&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;: &apos;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;command&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;runner&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;jar&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;,
  12312.                      &apos;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;: [
  12313.                          &apos;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;spark&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;submit&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;,
  12314.                          &apos;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;--&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;driver&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;memory&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;,&apos;&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;,
  12315.                          &apos;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;--&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;executor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;memory&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;,&apos;&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;,
  12316.                          &apos;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;--&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;executor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cores&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;,&apos;&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;,
  12317.                          &apos;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;--&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;num&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;executors&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;,&apos;&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;,
  12318.                          &apos;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;home&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;hadoop&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;process_data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;py&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;
  12319.                      ]
  12320.                  }
  12321.              },
  12322.          ],
  12323.          BootstrapActions=[
  12324.              {
  12325.                  &apos;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Name&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;: &apos;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cluster_setup&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;,
  12326.                  &apos;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ScriptBootstrapAction&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;: {
  12327.                      &apos;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Path&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;: &apos;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;//&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;your&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bucket&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;subfolder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;setup&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sh&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;,
  12328.                      &apos;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;: []
  12329.                  }
  12330.              }
  12331.          ],
  12332.          Applications=[
  12333.              {
  12334.                  &apos;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Name&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;: &apos;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Spark&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;
  12335.              },
  12336.          ],
  12337.          Configurations=[
  12338.              {
  12339.                  &quot;Classification&quot;: &quot;spark-env&quot;,
  12340.                  &quot;Properties&quot;: {
  12341.  
  12342.                  },
  12343.                  &quot;Configurations&quot;: [
  12344.                      {
  12345.                          &quot;Classification&quot;: &quot;export&quot;,
  12346.                          &quot;Properties&quot;: {
  12347.                              &quot;PYSPARK_PYTHON&quot;: &quot;/usr/bin/python2.7&quot;,
  12348.                              &quot;PYSPARK_DRIVER_PYTHON&quot;: &quot;/usr/bin/python2.7&quot;
  12349.                          },
  12350.                          &quot;Configurations&quot;: [
  12351.  
  12352.                          ]
  12353.                      }
  12354.                  ]
  12355.              },
  12356.              {
  12357.                  &quot;Classification&quot;: &quot;spark-defaults&quot;,
  12358.                  &quot;Properties&quot;: {
  12359.                      &quot;spark.akka.frameSize&quot;: &quot;2047&quot;
  12360.                  }
  12361.              }
  12362.          ],
  12363.          VisibleToAllUsers=True,
  12364.          JobFlowRole=&apos;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;EMR_EC2_DefaultRole&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;,
  12365.          ServiceRole=&apos;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;EMR_DefaultRole&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;
  12366.      )
  12367.  &lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  12368.  
  12369. &lt;ul&gt;
  12370.  &lt;li&gt;AWS CLI Equivalent:&lt;/li&gt;
  12371. &lt;/ul&gt;
  12372.  
  12373. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;  aws emr create-cluster &lt;span class=&quot;nt&quot;&gt;--name&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;YourApp&quot;&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;--release-label&lt;/span&gt; emr-4.1.0 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  12374.  &lt;span class=&quot;nt&quot;&gt;--use-default-roles&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  12375.  &lt;span class=&quot;nt&quot;&gt;--ec2-attributes&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;KeyName&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;YourKey,SubnetId&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;YourSubnet &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  12376.  &lt;span class=&quot;nt&quot;&gt;--applications&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;Spark &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  12377.  &lt;span class=&quot;nt&quot;&gt;--configurations&lt;/span&gt; https://bucket.s3.amazonaws.com/key/pyspark_py27.json &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  12378.  &lt;span class=&quot;nt&quot;&gt;--region&lt;/span&gt; us-west-2 &lt;span class=&quot;nt&quot;&gt;--instance-count&lt;/span&gt; 21 &lt;span class=&quot;nt&quot;&gt;--instance-type&lt;/span&gt; m3.xlarge &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  12379.  &lt;span class=&quot;nt&quot;&gt;--bootstrap-actions&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Path&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;s3://your-bucket/path/to/script&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  12380.  &lt;span class=&quot;nt&quot;&gt;--steps&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;Spark,Name&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;StepName&quot;&lt;/span&gt;,ActionOnFailure&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;TERMINATE_CLUSTER,&lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  12381. &lt;span class=&quot;nv&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=[&lt;/span&gt;&lt;span class=&quot;nt&quot;&gt;--driver-memory&lt;/span&gt;,10G,&lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  12382. &lt;span class=&quot;nt&quot;&gt;--executor-memory&lt;/span&gt;,4G,&lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  12383. &lt;span class=&quot;nt&quot;&gt;--executor-cores&lt;/span&gt;,4,&lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  12384. &lt;span class=&quot;nt&quot;&gt;--num-executors&lt;/span&gt;,20,&lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  12385. /home/hadoop/process_data.py]
  12386.  &lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
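&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--configurations&lt;/code&gt; URL in the command above points to a JSON file. One possible way to produce that file, sketched here with the same values as the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Configurations&lt;/code&gt; argument of the boto3 call, is to dump the list with the standard &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;json&lt;/code&gt; module. Writing to a temporary directory is an assumption made for the example; uploading the file to S3 is not shown.&lt;/p&gt;

```python
import json
import os
import tempfile

# Same structure as the Configurations argument in the boto3 example.
configurations = [
    {
        "Classification": "spark-env",
        "Properties": {},
        "Configurations": [
            {
                "Classification": "export",
                "Properties": {
                    "PYSPARK_PYTHON": "/usr/bin/python2.7",
                    "PYSPARK_DRIVER_PYTHON": "/usr/bin/python2.7",
                },
                "Configurations": [],
            }
        ],
    },
    {
        "Classification": "spark-defaults",
        "Properties": {"spark.akka.frameSize": "2047"},
    },
]

# Write the file locally; the file name matches the one used in the CLI example.
out_path = os.path.join(tempfile.mkdtemp(), "pyspark_py27.json")
with open(out_path, "w") as f:
    json.dump(configurations, f, indent=2)

# Read it back to confirm the file round-trips to the same structure.
with open(out_path) as f:
    reloaded = json.load(f)
print(reloaded == configurations)
```

&lt;p&gt;Generating the file from the same Python structure used in the Lambda function is one way to keep the two launch paths from drifting apart.&lt;/p&gt;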
  12387.  
  12388. &lt;ul&gt;
  12389.  &lt;li&gt;Notes on Parameters:
  12390.    &lt;ul&gt;
  12391.      &lt;li&gt;Configurations
  12392.        &lt;ul&gt;
  12393.          &lt;li&gt;Because we were also accessing data from Presto within the main PySpark script, all of our machines had to use Python 2.7 as the default (a Hive/Presto requirement). This involved configuring the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;PYSPARK_PYTHON&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;PYSPARK_DRIVER_PYTHON&lt;/code&gt; environment variables.&lt;/li&gt;
  12394.          &lt;li&gt;Additionally, we needed to add a special configuration to increase the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;spark.akka.frameSize&lt;/code&gt; parameter. The maximum frame size is 2047 MB, so that is what we set it to. This may be unnecessary for some applications, but if you receive an error about frameSize, you will know to add a configuration for this parameter.&lt;/li&gt;
  12395.          &lt;li&gt;Note: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--configurations&lt;/code&gt; parameter in the AWS CLI example simply provides a URL to a JSON file stored on S3. That file should contain the JSON blob from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Configurations&lt;/code&gt; in the boto3 example above.&lt;/li&gt;
  12396.        &lt;/ul&gt;
  12397.      &lt;/li&gt;
  12398.      &lt;li&gt;Steps
  12399.        &lt;ul&gt;
  12400.          &lt;li&gt;Within the Spark step, you can pass in Spark parameters to configure the job to meet your needs. In our case, we needed to increase both the driver and executor memory and to specify the number of cores to use on each executor. We also found that we had to explicitly tell Spark to use all 20 executors we had provisioned.&lt;/li&gt;
  12401.          &lt;li&gt;The final argument in the list is the path to the main PySpark script (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;process_data.py&lt;/code&gt;) that runs the actual data processing job.&lt;/li&gt;
  12402.        &lt;/ul&gt;
  12403.      &lt;/li&gt;
  12404.      &lt;li&gt;Applications
  12405.        &lt;ul&gt;
  12406.          &lt;li&gt;You must explicitly specify that you intend to run a Spark application.&lt;/li&gt;
  12407.        &lt;/ul&gt;
  12408.      &lt;/li&gt;
  12409.    &lt;/ul&gt;
  12410.  &lt;/li&gt;
  12411.  &lt;li&gt;Scheduling the Job
  12412.    &lt;ul&gt;
  12413.      &lt;li&gt;With the &lt;a href=&quot;https://aws.amazon.com/blogs/aws/aws-lambda-update-python-vpc-increased-function-duration-scheduling-and-more/&quot;&gt;recent announcement&lt;/a&gt; of Amazon’s newly available ‘Scheduled Event’ feature within Lambda, we were able to very easily set up a recurring cluster launch on a cron schedule.&lt;/li&gt;
  12414.      &lt;li&gt;As shown above, the code to launch the cluster was extremely simple, leveraging the power of boto3’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run_job_flow&lt;/code&gt; method.&lt;/li&gt;
  12415.      &lt;li&gt;The only wrinkle here was properly configuring the role for the Lambda function, so that it would have the right permissions to launch an EMR cluster. The role’s policy had to allow both the “elasticmapreduce:RunJobFlow” and “iam:PassRole” actions, since launching the cluster passes the EMR service and instance roles along with the request.&lt;/li&gt;
  12416.      &lt;li&gt;Below is the policy for the role:&lt;/li&gt;
  12417.    &lt;/ul&gt;
  12418.  &lt;/li&gt;
  12419. &lt;/ul&gt;
  12420.  
  12421. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;  &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
  12422.    &lt;span class=&quot;s2&quot;&gt;&quot;Version&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;2012-10-17&quot;&lt;/span&gt;,
  12423.    &lt;span class=&quot;s2&quot;&gt;&quot;Statement&quot;&lt;/span&gt;: &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;
  12424.        &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
  12425.            &lt;span class=&quot;s2&quot;&gt;&quot;Effect&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;Allow&quot;&lt;/span&gt;,
  12426.            &lt;span class=&quot;s2&quot;&gt;&quot;Action&quot;&lt;/span&gt;: &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;
  12427.                &lt;span class=&quot;s2&quot;&gt;&quot;logs:CreateLogGroup&quot;&lt;/span&gt;,
  12428.                &lt;span class=&quot;s2&quot;&gt;&quot;logs:CreateLogStream&quot;&lt;/span&gt;,
  12429.                &lt;span class=&quot;s2&quot;&gt;&quot;logs:PutLogEvents&quot;&lt;/span&gt;
  12430.            &lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;,
  12431.            &lt;span class=&quot;s2&quot;&gt;&quot;Resource&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;arn:aws:logs:*:*:*&quot;&lt;/span&gt;
  12432.        &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;,
  12433.        &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
  12434.            &lt;span class=&quot;s2&quot;&gt;&quot;Effect&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;Allow&quot;&lt;/span&gt;,
  12435.            &lt;span class=&quot;s2&quot;&gt;&quot;Action&quot;&lt;/span&gt;: &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;
  12436.                &lt;span class=&quot;s2&quot;&gt;&quot;elasticmapreduce:RunJobFlow&quot;&lt;/span&gt;,
  12437.                &lt;span class=&quot;s2&quot;&gt;&quot;iam:PassRole&quot;&lt;/span&gt;
  12438.            &lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;,
  12439.            &lt;span class=&quot;s2&quot;&gt;&quot;Resource&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;*&quot;&lt;/span&gt;
  12440.        &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
  12441.    &lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;
  12442.   &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
  12443.  &lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  12444.  
  12445. &lt;h4 id=&quot;closing-thoughts&quot;&gt;Closing Thoughts&lt;/h4&gt;
  12446. &lt;p&gt;While there were several hurdles to overcome in order to get this PySpark application running smoothly on EMR, we are now extremely happy with the successful and smooth operation of the daily job. I hope this guide has been helpful for future PySpark and EMR users.&lt;/p&gt;
  12447.  
  12448. &lt;p&gt;Finally, after taking a one-day Spark course, I have realized that for performance reasons, it likely makes sense to use Spark’s DataFrames instead of RDDs for some parts of this job. Stay tuned for a future post on transitioning to DataFrames and (hopefully) the resulting performance gains!&lt;/p&gt;
  12449.  
  12450. </description>
  12451.    </item>
  12452.    
  12453.    
  12454.    
  12455.    <item>
  12456.      <title>gulp-react-docs: From propTypes to Markdown in 3 seconds</title>
  12457.      <link>https://tech.nextroll.com/blog/frontend/2015/12/21/gulp-react-docs.html</link>
  12458.      <pubDate>Mon, 21 Dec 2015 00:00:00 -0800</pubDate>
  12459.      <author></author>
  12460.      <guid isPermaLink="false">https://tech.nextroll.com/blog/frontend/2015/12/21/gulp-react-docs</guid>
  12461.      <description>&lt;p&gt;In a &lt;a href=&quot;http://tech.adroll.com/blog/frontend/2015/11/12/rollup-react-and-npm-at-adroll.html&quot;&gt;blog post a couple weeks ago&lt;/a&gt; we talked about the developer tools we built for Rollup, our UI component library. One of those tools is responsible for &lt;a href=&quot;http://tech.adroll.com/blog/frontend/2015/11/12/rollup-react-and-npm-at-adroll.html#automatic-documentation-generation&quot;&gt;automatically generating documentation&lt;/a&gt; for React components and we just released it as an open-source &lt;a href=&quot;http://gulpjs.com/&quot;&gt;gulp&lt;/a&gt; plugin called &lt;a href=&quot;https://www.npmjs.com/package/gulp-react-docs&quot;&gt;gulp-react-docs&lt;/a&gt;.&lt;/p&gt;
  12462.  
  12463. &lt;p&gt;When we started working on Rollup, we knew that we needed to make it easy for application developers to figure out what &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;props&lt;/code&gt; a component expects. To us that meant providing documentation so that application developers wouldn’t need to look for the entry &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.jsx&lt;/code&gt; file and then examine the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;propTypes&lt;/code&gt;. We first looked for packages that we could run our &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.jsx&lt;/code&gt; files through that would then output Markdown based on the component comments and props, but couldn’t find any that did just that. So we decided to build our own documentation generator based on &lt;a href=&quot;https://www.npmjs.com/package/react-docgen&quot;&gt;react-docgen&lt;/a&gt;.&lt;/p&gt;
  12464.  
  12465. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;react-docgen&lt;/code&gt; parses React component files into an &lt;a href=&quot;https://en.wikipedia.org/wiki/Abstract_syntax_tree&quot;&gt;AST&lt;/a&gt; in JSON format. We then take the output and pipe it through &lt;a href=&quot;https://github.com/AdRoll/gulp-react-docs/blob/master/src/react-docgen-md.js&quot;&gt;some clever Handlebars templates&lt;/a&gt; to produce a Markdown document for each React component. This process is encapsulated in our &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gulp-react-docs&lt;/code&gt; plugin. We decided to implement our documentation generator as a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gulp&lt;/code&gt; plugin to make it easy to generate documentation for multiple components at a time.&lt;/p&gt;
  12466.  
  12467. &lt;p&gt;So, using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gulp-react-docs&lt;/code&gt; we do something like this in our &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gulpfile.js&lt;/code&gt;:&lt;/p&gt;
  12468.  
  12469. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-js&quot; data-lang=&quot;js&quot;&gt;&lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;gulpReactDocs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;require&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;gulp-react-docs&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  12470.  
  12471. &lt;span class=&quot;nx&quot;&gt;gulp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;task&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;docs&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  12472.    &lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;docsDest&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;docs&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  12473.  
  12474.    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;gulp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;./components/**/*.jsx&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  12475.        &lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;pipe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;gulpReactDocs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;
  12476.            &lt;span class=&quot;c1&quot;&gt;// the `path` option is used for backlinking the&lt;/span&gt;
  12477.            &lt;span class=&quot;c1&quot;&gt;// output documentation to the source code / file&lt;/span&gt;
  12478.            &lt;span class=&quot;na&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;docsDest&lt;/span&gt;
  12479.        &lt;span class=&quot;p&quot;&gt;}))&lt;/span&gt;
  12480.        &lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;pipe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;gulp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;dest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;docsDest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
  12481. &lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  12482.  
  12483. &lt;p&gt;To take components with inline documentation that looks like this:&lt;/p&gt;
  12484.  
  12485. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-js&quot; data-lang=&quot;js&quot;&gt;&lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;React&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;react&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  12486.  
  12487. &lt;span class=&quot;kd&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;DataTable&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;React&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;createClass&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;
  12488.  
  12489.    &lt;span class=&quot;na&quot;&gt;propTypes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  12490.        &lt;span class=&quot;cm&quot;&gt;/**
  12491.         * The columns you want the data table to have. Each column can
  12492.         * have the following attributes:
  12493.         * - `key` **(required)**: column identifier
  12494.         * - `label` **(required)**: Display text for the column. Should
  12495.         *    already be translated when passed to the DataTable.
  12496.         * - `accessor` **(required)**: Function that returns the
  12497.         *   relevant value from a given data item. Later passed to
  12498.         *   the column `render` function.
  12499.         * - `render`: Function that takes the output of
  12500.         *   the `accessor` and returns what should be rendered for a
  12501.         *   given data item in that column. Should return either a
  12502.         *   formatted value or can also be html. Columns without
  12503.         *   `render` functions will not be displayed but can be used
  12504.         *   for filtering (see the `filters` prop for more information).
  12505.         * - `textAlignment`: Column is center-aligned by default. Use
  12506.         *   `DataTable.TEXT_ALIGN_LEFT` or `DataTable.TEXT_ALIGN_RIGHT`
  12507.         *   to override the center alignment.
  12508.         * - `widthMultiplier`: Number to multiply the width of the column
  12509.         *   relative to other columns. By default, columns are of equal
  12510.         *   width.
  12511.         * - `adminOnly`: Whether or not this is an admin-only or
  12512.         *   permission-gated column. `adminOnly` columns will only be
  12513.         *   shown if the table&apos;s `isAdmin` prop is `true`.
  12514.         */&lt;/span&gt;
  12515.        &lt;span class=&quot;na&quot;&gt;columns&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;React&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;PropTypes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;arrayOf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  12516.            &lt;span class=&quot;nx&quot;&gt;React&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;PropTypes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;shape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;
  12517.                &lt;span class=&quot;na&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;React&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;PropTypes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;isRequired&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12518.                &lt;span class=&quot;na&quot;&gt;label&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;React&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;PropTypes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;isRequired&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12519.                &lt;span class=&quot;na&quot;&gt;accessor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;React&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;PropTypes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;isRequired&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12520.                &lt;span class=&quot;na&quot;&gt;render&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;React&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;PropTypes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12521.                &lt;span class=&quot;na&quot;&gt;textAlignment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;React&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;PropTypes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;oneOf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;
  12522.                    &lt;span class=&quot;nx&quot;&gt;TEXT_ALIGN_LEFT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12523.                    &lt;span class=&quot;nx&quot;&gt;TEXT_ALIGN_RIGHT&lt;/span&gt;
  12524.                &lt;span class=&quot;p&quot;&gt;]),&lt;/span&gt;
  12525.                &lt;span class=&quot;na&quot;&gt;widthMultiplier&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;React&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;PropTypes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;number&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  12526.                &lt;span class=&quot;na&quot;&gt;adminOnly&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;React&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;PropTypes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;bool&lt;/span&gt;
  12527.            &lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  12528.        &lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  12529.        &lt;span class=&quot;c1&quot;&gt;// ... more documented props&lt;/span&gt;
  12530.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  12531. &lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;
  12532.  
  12533. &lt;span class=&quot;k&quot;&gt;export&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;default&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;DataTable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  12534.  
  12535. &lt;p&gt;And output a Markdown file that looks like this:&lt;/p&gt;
  12536.  
  12537. &lt;center&gt;
  12538. &lt;img alt=&quot;Rollup Data Table prop documentation&quot; src=&quot;/images/post_images/gulp_react_docs_example_output.png&quot; /&gt;
  12539. &lt;/center&gt;
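&lt;p&gt;To make the JSON-to-Markdown step concrete, here is a minimal sketch in plain JavaScript. It is not the plugin’s implementation (which uses Handlebars templates); the function name &lt;code&gt;propsToMarkdown&lt;/code&gt;, the example data, and the input shape (a &lt;code&gt;props&lt;/code&gt; map whose entries carry &lt;code&gt;type.name&lt;/code&gt;, &lt;code&gt;required&lt;/code&gt; and &lt;code&gt;description&lt;/code&gt;) are assumptions made for illustration.&lt;/p&gt;

```javascript
// Hypothetical sketch: render a docgen-style JSON description of one
// component as Markdown. The input shape is assumed, not taken from the plugin.
function propsToMarkdown(componentName, docs) {
    const lines = ['# ' + componentName, ''];
    if (docs.description) {
        lines.push(docs.description, '');
    }
    lines.push('## Props', '');
    // Sort prop names so the output is stable between runs.
    for (const name of Object.keys(docs.props || {}).sort()) {
        const prop = docs.props[name];
        const typeName = prop.type ? prop.type.name : 'unknown';
        const required = prop.required ? ' **(required)**' : '';
        lines.push('### `' + name + '`' + required, '');
        lines.push('Type: `' + typeName + '`', '');
        if (prop.description) {
            lines.push(prop.description, '');
        }
    }
    return lines.join('\n');
}

// Made-up example input, loosely modelled on the DataTable component above.
const exampleDocs = {
    description: 'A data table component.',
    props: {
        columns: {
            type: { name: 'arrayOf' },
            required: false,
            description: 'The columns you want the data table to have.'
        },
        data: { type: { name: 'array' }, required: true, description: '' }
    }
};

console.log(propsToMarkdown('DataTable', exampleDocs));
```

&lt;p&gt;A template engine such as Handlebars does the same job declaratively, which is easier to maintain once the output layout grows beyond a few headings.&lt;/p&gt;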
  12540.  
  12541. &lt;p&gt;For more information on usage see the &lt;a href=&quot;https://www.npmjs.com/package/gulp-react-docs&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gulp-react-docs&lt;/code&gt; plugin page on npm&lt;/a&gt; or our &lt;a href=&quot;https://github.com/AdRoll/gulp-react-docs/tree/master/example&quot;&gt;example gulpfile on GitHub&lt;/a&gt;. Thanks for reading and hope you find this plugin useful!&lt;/p&gt;
  12542.  
  12543. </description>
  12544.    </item>
  12545.    
  12546.    
  12547.    
  12548.    <item>
  12549.      <title>Data Science Event Processing</title>
  12550.      <link>https://tech.nextroll.com/blog/data-science/2015/12/08/data-science_event_processing.html</link>
  12551.      <pubDate>Tue, 08 Dec 2015 00:00:00 -0800</pubDate>
  12552.      <author></author>
  12553.      <guid isPermaLink="false">https://tech.nextroll.com/blog/data-science/2015/12/08/data-science_event_processing</guid>
  12554.      <description>&lt;p&gt;AdRoll uses large-scale machine learning to bid intelligently in internet advertising auctions.
  12555. In this post, we explore some of the engineering behind the data pipelines that feed our learning algorithms,
  12556. in particular our real-time system that constructs features from event streams.&lt;/p&gt;
  12557.  
  12558. &lt;hr /&gt;
  12559.  
  12560. &lt;h1 id=&quot;overview&quot;&gt;Overview&lt;/h1&gt;
  12561.  
  12562. &lt;p&gt;If you have been served a banner ad on the internet in the past few years, there is a good chance
  12563. that ad was served through a process known as real-time bidding (RTB). As its name suggests, in RTB-based
  12564. ad serving, there is an auction held for the ad space on a web page in real-time every
  12565. time a page is loaded. AdRoll bids in these auctions on behalf of our clients; ad exchanges
  12566. send us a bid request whenever a page is loading, and we respond to these requests with a bid price. There are
  12567. many engaging engineering and data science problems that arise in the context of participating in RTB
  12568. auctions. On AdRoll’s data science team, we mainly concern ourselves with the problem of determining the
  12569. value of impressions intelligently in real-time.&lt;/p&gt;
  12570.  
  12571. &lt;p&gt;At a high level, the solution to this problem is essentially a large-scale machine learning product, BidIQ.
  12572. We learn the probability that particular impressions will click and convert based on historical features
  12573. available at bid time, use our models to forecast this probability for incoming
  12574. bid requests, and then bid as some function of these probabilities and other variables. If you’re interested in our
  12575. modeling and learning algorithms, check out Matt Wilson’s post on &lt;a href=&quot;http://tech.adroll.com/blog/data-science/2015/08/25/factorization-machines.html&quot; title=&quot;Factorization Machines&quot;&gt;Factorization Machines&lt;/a&gt;.&lt;/p&gt;
  12576.  
  12577. &lt;p&gt;There are many features that we feed into our learning algorithm. One major source of features is the bid
  12578. request. These include features such as geo-location, ad network, inventory source, the customer that we would
  12579. serve this impression to, time, among many others. Another source of very valuable information is features related
  12580. to the cookie that this impression would be served to, such as past behavior of that user. Enter our event-processing system, internally dubbed chocolate chip (cookies!), which computes such features and makes them available at
  12581. bid time.&lt;/p&gt;
  12582.  
  12583. &lt;hr /&gt;
  12584.  
  12585. &lt;h1 id=&quot;chocolate-chip&quot;&gt;Chocolate Chip&lt;/h1&gt;
  12586.  
  12587. &lt;p&gt;One of the challenges of using data on individual cookies (profiles) in real-time is that, unless the profile
  12588. is stored in the cookie, it is not available in a bid request. This necessitates the use of a key-value store
  12589. mapping cookies to profiles of features. There are costs and benefits to both approaches. Storing data in the
  12590. cookie guarantees truly real-time data, whereas a key-value store allows greater flexibility with offline testing
  12591. and data munging. We ultimately decided on the latter approach. For our backend key-value store,
  12592. we use AWS DynamoDB, which we have used with great success for similar use cases in the past.&lt;/p&gt;
  12593.  
  12594. &lt;p&gt;At its core, chocolate chip is a system that takes events, processes them, and then writes profiles to
  12595. DynamoDB. Although originally designed and built by the data science team, it has found other uses throughout
  12596. AdRoll. Our use case on the data science team is to build predictive features from events in data streams and write
  12597. these profiles of features in as close to real-time as possible. These profiles are then fetched by our bidders
  12598. and fed into our machine learning model to produce more accurate click and conversion predictions.
  12599. In order to learn on these data, the profile at the time of the bid is logged. Chocolate chip was built from
  12600. the ground up in &lt;a href=&quot;http://dlang.org&quot; title=&quot;D Programming Language&quot;&gt;D&lt;/a&gt;, including our own high-performance DynamoDB client.&lt;/p&gt;
  12601.  
  12602. &lt;p&gt;The high-level concept of building a profile is simple: read the current value of the profile in DynamoDB,
  12603. update this profile with information on the current event, and write this profile back. This, however, is
  12604. complicated due to the fact that in order to achieve the scale and performance needed at AdRoll, the system is
  12605. highly parallelised and supports several methods by which to distribute the computation. Additionally, we
  12606. run computations in mini-batches to reduce the number of HTTP requests and increase throughput.&lt;/p&gt;
  12607.  
  12608. &lt;hr /&gt;
  12609.  
  12610. &lt;h1 id=&quot;architecture&quot;&gt;Architecture&lt;/h1&gt;
  12611.  
  12612. &lt;p&gt;Chocolate chip ingests multiple streams of data, each stream consisting of a single event type. Each event
  12613. is in the form of an AdRoll log line, a unified log line format over all our event types. The event is then
  12614. handed off to one of several worker threads based on the cookie hash and event type. One of the main
  12615. considerations when dealing with concurrent reads and writes is always going to be avoiding data races. In
  12616. particular, we never want more than one concurrent read-build-write cycle per cookie to DynamoDB, otherwise the
  12617. events processed by one of the cycles will be overwritten by the other. By having each worker thread only deal
  12618. with a partition of the cookie space, we eliminate this race condition. This race is exacerbated for us because
  12619. cookies tend to be active at certain times, and therefore have events close together in the event stream. If we
  12620. did not break up by cookies, it is likely multiple workers could deal with events from the same cookie
  12621. concurrently. Although sharding by cookie does scale over several nodes, it is not our primary form of sharding
  12622. as the data streams themselves are not broken up by cookie and hence would have to be replicated.&lt;/p&gt;
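&lt;p&gt;The partitioning idea can be sketched in a few lines of JavaScript. This is an illustration only (the real system is written in D and also takes the event type into account when routing); the hash function, the event shape and the names &lt;code&gt;workerFor&lt;/code&gt; and &lt;code&gt;partition&lt;/code&gt; are assumptions.&lt;/p&gt;

```javascript
// FNV-1a-style 32-bit hash over the characters of a string.
// Any stable hash would do; this one is chosen only because it is short.
function fnv1a(text) {
    let hash = 0x811c9dc5;
    for (const ch of text) {
        hash ^= ch.codePointAt(0);
        hash = Math.imul(hash, 0x01000193) >>> 0;
    }
    return hash;
}

// Every event for a given cookie maps to the same worker index, so no two
// workers ever run a read-build-write cycle for the same cookie.
function workerFor(cookie, workerCount) {
    return fnv1a(cookie) % workerCount;
}

// Split a list of events into one queue per worker.
function partition(events, workerCount) {
    const queues = Array.from({ length: workerCount }, () => []);
    for (const event of events) {
        queues[workerFor(event.cookie, workerCount)].push(event);
    }
    return queues;
}

// Hypothetical events: both events for cookie "abc" land in the same queue.
const queues = partition(
    [
        { cookie: 'abc', type: 'impression' },
        { cookie: 'xyz', type: 'click' },
        { cookie: 'abc', type: 'click' }
    ],
    4
);
console.log(queues.map((queue) => queue.length));
```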
  12623.  
  12624. &lt;p&gt;When scaling out chocolate chip, the primary mechanism is to shard by event type. This exploits DynamoDB’s
  12625. hash-and-range key type, with the hash key being the cookie hash and the range key being the event type. Using
  12626. this property, we can break out separate nodes to deal with distinct event streams and build profiles for
  12627. different event types on separate subrange keys. Our input event streams do not guarantee
  12628. event ordering, so any features we build from events must be commutative. With the hash-and-range key method,
  12629. we’re also restricted to associative events. This basically means we can only build features which, even
  12630. if partially constructed from separate event streams, can be reconstructed to their true value. While this adds
  12631. some complexity at bid time, we have yet to find a feature we’d like to build that is commutative but not
  12632. associative, so sharding out the building of features by event has not impacted what features we construct.&lt;/p&gt;
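&lt;p&gt;As a rough illustration of why such features survive sharding, the sketch below builds a partial profile per event stream and merges the partials afterwards. The feature set (an event count and a last-seen timestamp) and the function names are assumptions chosen for brevity; the point is only that the merged result does not depend on event order or on how events were split across streams.&lt;/p&gt;

```javascript
// Build a partial profile from the events of one stream.
function buildPartial(events) {
    const partial = { count: 0, lastSeen: 0 };
    for (const event of events) {
        partial.count += 1;
        partial.lastSeen = Math.max(partial.lastSeen, event.timestamp);
    }
    return partial;
}

// Merge partial profiles at read time. Addition and max are both commutative
// and associative, so any order and any grouping give the same result.
function mergePartials(partials) {
    return partials.reduce(
        (merged, partial) => ({
            count: merged.count + partial.count,
            lastSeen: Math.max(merged.lastSeen, partial.lastSeen)
        }),
        { count: 0, lastSeen: 0 }
    );
}

// Hypothetical streams for one cookie, one per event type.
const clickPartial = buildPartial([{ timestamp: 10 }, { timestamp: 30 }]);
const impressionPartial = buildPartial([{ timestamp: 20 }]);
console.log(mergePartials([clickPartial, impressionPartial]));
```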
  12633.  
  12634. &lt;p&gt;Once a worker thread reads an event from its queue, it stores this event until it has events accumulated from
  12635. several distinct cookies. DynamoDB permits batch read and write requests, the latter having a batch size of 25, so
  12636. this is the size we use. Once 25 distinct cookies have been registered (perhaps with multiple events per cookie),
  12637. a batch read request is made to the master region, the profiles returned are updated or generated if nonexistent,
  12638. and then written back and replicated to slave regions when there is available capacity.&lt;/p&gt;
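&lt;p&gt;The accumulate-then-flush behaviour described above might look roughly like this. It is a sketch under stated assumptions: the class name &lt;code&gt;CookieBatcher&lt;/code&gt;, the event shape and the &lt;code&gt;flush&lt;/code&gt; callback are hypothetical, and the callback stands in for the batch read, profile update and write-back against DynamoDB.&lt;/p&gt;

```javascript
const BATCH_SIZE = 25; // distinct cookies per batch, as in the post

class CookieBatcher {
    // `flush` receives a Map from cookie to the list of its pending events.
    constructor(flush, batchSize = BATCH_SIZE) {
        this.flush = flush;
        this.batchSize = batchSize;
        this.pending = new Map();
    }

    add(event) {
        if (!this.pending.has(event.cookie)) {
            this.pending.set(event.cookie, []);
        }
        this.pending.get(event.cookie).push(event);
        // The limit is on distinct cookies, not on events: one cookie may
        // contribute several events to the same batch.
        if (this.pending.size >= this.batchSize) {
            const batch = this.pending;
            this.pending = new Map();
            this.flush(batch);
        }
    }
}

// Hypothetical usage: the callback is where one batch read, the profile
// updates, and one batch write would happen.
const batcher = new CookieBatcher((batch) => {
    console.log('flushing profiles for', batch.size, 'cookies');
});
for (let i = 0; i !== 60; i += 1) {
    batcher.add({ cookie: 'cookie-' + (i % 30), type: 'impression' });
}
```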
  12639.  
  12640. &lt;hr /&gt;
  12641.  
  12642. &lt;h1 id=&quot;example-feature-exponential-moving-average&quot;&gt;Example feature: Exponential Moving Average&lt;/h1&gt;
  12643.  
  12644. &lt;p&gt;One can imagine that for some event types, recent events are more valuable for learning than past ones.
  12645. A mechanism we use to capture such information is the exponential moving average. To understand
  12646. why this type of feature is used, let us first examine the simple moving average. Say we care about the number
  12647. of ads we have shown to a particular cookie in the past hour (such a feature could be phrased as a simple moving
  12648. average). In order to calculate this feature, we would have to store the timestamps of all the impressions shown,
  12649. and count those in the past hour at bid time. The generation of the feature can lazily eliminate events over an
  12650. hour old, but the key here is that some information about all the events must be stored.&lt;/p&gt;
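&lt;p&gt;A minimal sketch of that windowed count makes the storage cost visible. The class name &lt;code&gt;WindowCounter&lt;/code&gt; and the use of timestamps in seconds are assumptions; what matters is that the stored array grows with the number of events inside the window.&lt;/p&gt;

```javascript
const HOUR_SECONDS = 3600;

class WindowCounter {
    constructor(windowSeconds = HOUR_SECONDS) {
        this.windowSeconds = windowSeconds;
        this.timestamps = []; // one entry per event: memory grows with traffic
    }

    record(timestamp) {
        this.timestamps.push(timestamp);
    }

    // Count events inside the window ending at `now`, lazily dropping
    // anything that has aged out.
    countAt(now) {
        const cutoff = now - this.windowSeconds;
        this.timestamps = this.timestamps.filter((t) => t > cutoff);
        return this.timestamps.filter((t) => now >= t).length;
    }
}

// Hypothetical impressions at 0 s, 1000 s and 4000 s.
const impressions = new WindowCounter();
[0, 1000, 4000].forEach((t) => impressions.record(t));
console.log(impressions.countAt(4000));
```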
  12651.  
  12652. &lt;p&gt;In the low-latency environment under which this system operates, the memory costs of this feature are prohibitive.
  12653. At bid time, we must fetch the profile from DynamoDB over a TCP connection and the latency of fetching the
  12654. feature is proportional to the number of TCP packets needed to communicate the profile. If we store many
  12655. features like this, the number of packets needed and, therefore, latency would be unacceptable.&lt;/p&gt;
  12656.  
  12657. &lt;p&gt;This is where the exponential moving average (EMA) comes in. EMAs can store a notion of number of events that
  12658. occurred recently in a fixed amount of memory.&lt;/p&gt;
  12659.  
  12660. &lt;p&gt;The formal definition of an EMA is:&lt;/p&gt;
  12661.  
  12662. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  12663. %&lt;![CDATA[
  12664. \begin{equation}
  12665. \hat{X}_{n,\alpha}\, = (1-\alpha) \sum_{i=1}^n \alpha^{n-i}x_i
  12666. \end{equation}
  12667. %]]&gt;
  12668. &lt;/script&gt;
  12669.  
  12670. &lt;p&gt;With a little math, we can show that this is equivalent to the following recursive definition:&lt;/p&gt;
  12671.  
  12672. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  12673. %&lt;![CDATA[
  12674. \begin{equation}
  12675. \hat{X}_{0,\alpha}\, = 0
  12676. \end{equation}
  12677. %]]&gt;
  12678. &lt;/script&gt;
  12679.  
  12680. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  12681. %&lt;![CDATA[
  12682. \begin{equation}
  12683. \hat{X}_{n+1,\alpha}\, = (1-\alpha)x_{n+1} + \alpha \hat{X}_{n,\alpha}
  12684. \end{equation}
  12685. %]]&gt;
  12686. &lt;/script&gt;
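
&lt;p&gt;The intermediate step, sketched here using only the symbols from the formal definition, is to split the newest term off the sum and factor one power of the decay out of what remains:&lt;/p&gt;

&lt;script type=&quot;math/tex; mode=display&quot;&gt;
%&lt;![CDATA[
\begin{equation}
\hat{X}_{n+1,\alpha}\, = (1-\alpha) \sum_{i=1}^{n+1} \alpha^{n+1-i}x_i = (1-\alpha)x_{n+1} + \alpha (1-\alpha) \sum_{i=1}^{n} \alpha^{n-i}x_i
\end{equation}
%]]&gt;
&lt;/script&gt;

&lt;p&gt;The remaining sum, together with its leading factor, is the formal definition evaluated at n, which yields the recursive form.&lt;/p&gt;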
  12687.  
  12688. &lt;p&gt;And there we have it. All we need to calculate the current value of the EMA is the last value, its timestamp,
  12689. and the current timestamp. An observant reader will also note that we can add in events out of order to the EMA,
  12690. as we just need to apply an exponential decay factor to the event. We can build secondary features for free
  12691. by using the EMA feature; we have to store the timestamp of the last event, so we can extract this and build other
  12692. features out of it.&lt;/p&gt;
  12693.  
  12694. &lt;p&gt;In our case, applying this to impression count, we can think of each passing second as an observation, which
  12695. takes the value 1 if an impression occurred in that second, or 0 if not. Then we can say that:&lt;/p&gt;
  12696.  
  12697. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  12698. %&lt;![CDATA[
  12699. \begin{equation}
  12700. \hat{X}_{T_{n+1},\alpha}\, = (1-\alpha) + \alpha^{T_{n+1} - T_n} \hat{X}_{T_n,\alpha}
  12701. \end{equation}
  12702. %]]&gt;
  12703. &lt;/script&gt;
  12704.  
  12705. &lt;p&gt;where
  12706. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  12707. %&lt;![CDATA[
  12708. \begin{equation}
  12709. T_i
  12710. \end{equation}
  12711. %]]&gt;
  12712. &lt;/script&gt;&lt;/p&gt;
  12713.  
  12714. &lt;p&gt;refers to the timestamp of the ith event.&lt;/p&gt;
  12715.  
  12716. &lt;p&gt;The EMA described is essentially just a counter that decays over time. While we cannot guarantee the count of
  12717. events in the past hour, we get a feature which takes a high value when there are a large number of recent events,
  12718. and a low value when there are not.&lt;/p&gt;
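&lt;p&gt;Put together, the decaying counter needs only two stored numbers per feature. The following is a minimal sketch rather than the production code (which is written in D); the class name &lt;code&gt;DecayingCounter&lt;/code&gt;, the method names and the use of whole-second timestamps are assumptions. It applies the timestamp form of the update for in-order events and decays late events by their age before adding them.&lt;/p&gt;

```javascript
class DecayingCounter {
    constructor(alpha) {
        this.alpha = alpha;        // per-second decay factor, between 0 and 1
        this.value = 0;            // EMA as of `lastTimestamp`
        this.lastTimestamp = null; // timestamp of the newest event seen
    }

    // Record one event at `timestamp` (in seconds).
    add(timestamp) {
        const weight = 1 - this.alpha;
        if (this.lastTimestamp === null) {
            this.value = weight;
            this.lastTimestamp = timestamp;
        } else if (timestamp >= this.lastTimestamp) {
            // In-order event: decay the stored value, then add the new one.
            const elapsed = timestamp - this.lastTimestamp;
            this.value = weight + Math.pow(this.alpha, elapsed) * this.value;
            this.lastTimestamp = timestamp;
        } else {
            // Late event: decay its own contribution up to the stored timestamp.
            const age = this.lastTimestamp - timestamp;
            this.value += weight * Math.pow(this.alpha, age);
        }
    }

    // Read the feature as of `now` without recording an event.
    valueAt(now) {
        if (this.lastTimestamp === null) {
            return 0;
        }
        const elapsed = Math.max(0, now - this.lastTimestamp);
        return Math.pow(this.alpha, elapsed) * this.value;
    }
}

// The same three events, in order and out of order, give the same value.
const inOrder = new DecayingCounter(0.5);
[0, 1, 3].forEach((t) => inOrder.add(t));
const outOfOrder = new DecayingCounter(0.5);
[0, 3, 1].forEach((t) => outOfOrder.add(t));
console.log(inOrder.valueAt(4), outOfOrder.valueAt(4));
```

&lt;p&gt;In practice the decay factor would be much closer to 1 than the 0.5 used here, so that the counter fades over minutes or hours rather than seconds.&lt;/p&gt;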
  12719.  
  12720. &lt;hr /&gt;
  12721.  
  12722. &lt;h1 id=&quot;performance-considerations-in-d&quot;&gt;Performance considerations in D&lt;/h1&gt;
  12723.  
  12724. &lt;p&gt;As mentioned earlier, this system was built from the ground up in the &lt;a href=&quot;http://dlang.org&quot; title=&quot;D Programming Language&quot;&gt;D programming language&lt;/a&gt;. Compared with plugging
  12725. a &lt;a href=&quot;http://storm.apache.org&quot; title=&quot;Apache Storm&quot;&gt;Storm bolt&lt;/a&gt; into our existing data structure, this allows us to share code with our existing machine learning
  12726. infrastructure and to draw on our already-developed in-house D expertise. An added benefit is our ability to
  12727. optimize all the areas of our stack, rather than just the feature generation layer.&lt;/p&gt;
  12728.  
  12729. &lt;p&gt;With systems such as these, memory management is also going to be a major factor in performance. We found when
  12730. we profiled our code that the primary source of contention on a single machine was the global
  12731. lock that occurs around everything in D’s GC allocator. By restructuring our code to mostly use preallocated
  12732. buffers for computation, and by using a specialized JSON serializer and parser, we are able to achieve single
  12733. machine throughput of around 20,000 events per second on a 32 hyperthread instance. Another optimization we found
  12734. was that D’s GC seems to have poor heuristics for when to enter a GC cycle for this application. By disabling
  12735. automatic GC collections and running GC.collect() and GC.minimize() according to our own heuristics, we are able
  12736. to achieve 2-3x higher throughput. The runtime in D has been getting much love in recent releases, and we hope
  12737. that such measures will not be necessary in the future.&lt;/p&gt;
  12738.  
  12739. &lt;p&gt;Ultimately, we didn’t really have to do anything fancy or micro-optimize to get solid performance out of the
  12740. system. A significant advantage of this solid performance is that we have not had to build chocolate chip as a
  12741. truly distributed system, and as such, we’ve had very good uptime and low maintenance costs. As a side note, for
  12742. performance reasons, we also use unique memory management techniques in our prediction server for real-time ad
  12743. pricing. That is a truly real-time system with millisecond SLAs and cannot afford any GC cycles.&lt;/p&gt;
  12744.  
  12745. &lt;hr /&gt;
  12746.  
  12747. &lt;p&gt;Chocolate chip has added tremendous predictive power and accuracy to our learning algorithms and has
  12748. drastically improved how intelligently we bid on exchanges. In the recent release of an EMA feature as outlined
  12749. above, we found cost-per-click to drop by 2% with no effect on reach.&lt;/p&gt;
  12750.  
  12751. </description>
  12752.    </item>
  12753.    
  12754.    
  12755.    
  12756.    <item>
  12757.      <title>Rollup: What we have learned from sharing UI code at AdRoll</title>
  12758.      <link>https://tech.nextroll.com/blog/frontend/2015/11/19/rollup-major-learnings.html</link>
  12759.      <pubDate>Thu, 19 Nov 2015 00:00:00 -0800</pubDate>
  12760.      <author></author>
  12761.      <guid isPermaLink="false">https://tech.nextroll.com/blog/frontend/2015/11/19/rollup-major-learnings</guid>
  12762.      <description>&lt;p&gt;&lt;em&gt;This is the third post in a series of three blog posts about Rollup, AdRoll’s UI component library. This post covers what we learned from building a UI component library. For details on why we built the UI component library see the &lt;a href=&quot;/blog/frontend/2015/11/05/rollup-shared-ui-components.html&quot;&gt;first post&lt;/a&gt; in the series and for how we built it see the &lt;a href=&quot;/blog/frontend/2015/11/12/rollup-react-and-npm-at-adroll.html&quot;&gt;second post&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
  12763.  
  12764. &lt;p&gt;At the end of Rollup component and developer tool development, we reflected on what we did and realized that we had learned four important lessons from the work:&lt;/p&gt;
  12765.  
  12766. &lt;ul&gt;
  12767.  &lt;li&gt;&lt;a href=&quot;#making-reusable-components-for-app-developers-is-hard&quot;&gt;Making reusable components for app developers is hard&lt;/a&gt;&lt;/li&gt;
  12768.  &lt;li&gt;&lt;a href=&quot;#making-reusable-components-for-contributors-is-hard&quot;&gt;Making reusable components for contributors is hard&lt;/a&gt;&lt;/li&gt;
  12769.  &lt;li&gt;&lt;a href=&quot;#interfaces-are-hard&quot;&gt;Interfaces are hard&lt;/a&gt;&lt;/li&gt;
  12770.  &lt;li&gt;&lt;a href=&quot;#making-everything-look-the-same-is-hard&quot;&gt;Making everything look the same is hard&lt;/a&gt;&lt;/li&gt;
  12771. &lt;/ul&gt;
  12772.  
  12773. &lt;p&gt;Let’s dig into each one separately.&lt;/p&gt;
  12774.  
  12775. &lt;h2 id=&quot;making-reusable-components-for-app-developers-is-hard&quot;&gt;Making reusable components for app developers is hard&lt;/h2&gt;
  12776.  
  12777. &lt;p&gt;When we first started working on Rollup, we wanted to make sure that individual components would be easy to use for any developer on any team. For app developers, we wanted them to be able to use Rollup components even if:&lt;/p&gt;
  12778.  
  12779. &lt;ul&gt;
  12780.  &lt;li&gt;they’re new to JavaScript&lt;/li&gt;
  12781.  &lt;li&gt;they don’t use React in their application&lt;/li&gt;
  12782.  &lt;li&gt;they don’t have fancy build tools, e.g. &lt;a href=&quot;http://gulpjs.com/&quot;&gt;Gulp&lt;/a&gt; or &lt;a href=&quot;http://browserify.org/&quot;&gt;Browserify&lt;/a&gt;&lt;/li&gt;
  12783.  &lt;li&gt;they want to use the same version of a component forever&lt;/li&gt;
  12784.  &lt;li&gt;we never thought about their use case&lt;/li&gt;
  12785. &lt;/ul&gt;
  12786.  
  12787. &lt;p&gt;To address the first three points, each published component has a CDN build. The CDN build, among other things, contains transpiled code. The transpiled JS means that other teams do not need to have a fancy build tool to use a component in their project and that app developers do not need to use &lt;a href=&quot;https://facebook.github.io/react/docs/jsx-in-depth.html&quot;&gt;JSX&lt;/a&gt; in order to add the components to their application.&lt;/p&gt;
  12788.  
  12789. &lt;p&gt;Not only does this mean that components can be used in projects not built in &lt;a href=&quot;https://facebook.github.io/react/&quot;&gt;React&lt;/a&gt;, but it also makes them easier to use for those who are new to JavaScript. The CDN build accomplishes this by allowing those developers to use whichever framework they feel most comfortable with (e.g. jQuery or plain JavaScript).&lt;/p&gt;
  12790.  
  12791. &lt;p&gt;To allow app developers to use the same version of a component forever, components are published to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;npm&lt;/code&gt; according to &lt;a href=&quot;http://semver.org/&quot;&gt;SemVer&lt;/a&gt; and the CDN assets are pushed to S3 under a versioned URL. Other applications using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;npm&lt;/code&gt; can leverage &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;npm&lt;/code&gt;’s dependency management to use the same version forever, while app developers loading components via the CDN can always load the same version through the versioned URL.&lt;/p&gt;
  12792.  
  12793. &lt;p&gt;We also want app developers to be able to use the components even if we had not thought about their use case. We can accomplish this in two ways. The first way is to provide well defined and general interfaces for the components (see &lt;a href=&quot;#interfaces-are-hard&quot;&gt;interfaces are hard&lt;/a&gt; for more details).&lt;/p&gt;
  12794.  
  12795. &lt;p&gt;The second way is through open communication channels. Even if interfaces are well-defined and general, we want to be able to customize and iterate on them quickly. To accomplish this, we have created a &lt;a href=&quot;https://slack.com/&quot;&gt;Slack&lt;/a&gt; channel, #frontend_helpdesk, dedicated to frontend and Rollup help. We track all bugs and feature requests received there using GitHub issues.&lt;/p&gt;
  12796.  
  12797. &lt;p&gt;By keeping the lines of communication open and judiciously tracking feature requests and bug fixes, we aim to quickly address new use cases. The issue tracking allows us to track and implement the needed fixes ourselves, and serves as a place where we can foster further conversation on how to support a new use case.&lt;/p&gt;
  12798.  
  12799. &lt;h2 id=&quot;making-reusable-components-for-contributors-is-hard&quot;&gt;Making reusable components for contributors is hard&lt;/h2&gt;
  12800.  
  12801. &lt;p&gt;The developer tools we put in place surrounding the components aim to make it easier to contribute to the component library. The goal of the tooling was to enable Rollup contributors to:&lt;/p&gt;
  12802.  
  12803. &lt;ul&gt;
  12804.  &lt;li&gt;&lt;a href=&quot;#focus-on-the-important-things-when-developing-a-component&quot;&gt;focus on the important things when developing a component&lt;/a&gt;&lt;/li&gt;
  12805.  &lt;li&gt;&lt;a href=&quot;#focus-on-the-important-things-when-reviewing-a-component&quot;&gt;focus on the important things when reviewing a component&lt;/a&gt;&lt;/li&gt;
  12806. &lt;/ul&gt;
  12807.  
  12808. &lt;h4 id=&quot;focus-on-the-important-things-when-developing-a-component&quot;&gt;Focus on the important things when developing a component&lt;/h4&gt;
  12809.  
  12810. &lt;p&gt;To develop and publish a new component, contributors should not have to worry about boilerplate like build configuration and file structure. To that end, we built a Rollup component generator using &lt;a href=&quot;http://yeoman.io/generators/&quot;&gt;Yeoman&lt;/a&gt;. The generator takes the name of the component and then creates all files needed for a &lt;a href=&quot;https://github.com/AdRoll/react-component-skeleton/tree/9a6f1e7df1965b342066ca029cc8cce0a3f41295&quot;&gt;blank component&lt;/a&gt;. After running the generator, contributors can just hit the ground running.&lt;/p&gt;
  12811.  
  12812. &lt;p&gt;Each component has an &lt;a href=&quot;https://github.com/AdRoll/react-component-skeleton/tree/9a6f1e7df1965b342066ca029cc8cce0a3f41295/examples&quot;&gt;examples/&lt;/a&gt; directory and a Gulp task for watching the changes made to the component and reloading the example in the browser. This makes it easy for contributors to test the functionality that they are working on in an isolated development environment and removes the hassle of having to set up a test application to interact with the component.&lt;/p&gt;
  12813.  
  12814. &lt;h4 id=&quot;focus-on-the-important-things-when-reviewing-a-component&quot;&gt;Focus on the important things when reviewing a component&lt;/h4&gt;
  12815.  
  12816. &lt;p&gt;We also want reviewers of pull requests in the Rollup repo to be able to focus on the important things when looking over someone else’s work. To that end, we have set up a global linter and &lt;a href=&quot;https://facebook.github.io/jest/&quot;&gt;Jest&lt;/a&gt; tests. The linting and test suite are automatically run on every single PR by &lt;a href=&quot;https://jenkins-ci.org/&quot;&gt;Jenkins&lt;/a&gt;. The reviewer no longer needs to remember to do these checks, since they are done for them.&lt;/p&gt;
  12817.  
  12818. &lt;p&gt;Instead, the reviewer can focus on the code quality and changes to component interfaces. In addition, thanks to each component’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;examples/&lt;/code&gt; directory, reviewers can also interact with the component and verify functionality, just like the component’s author(s) can.&lt;/p&gt;
  12819.  
  12820. &lt;h2 id=&quot;interfaces-are-hard&quot;&gt;Interfaces are hard&lt;/h2&gt;
  12821.  
  12822. &lt;p&gt;When starting work on Rollup and trying to use these components in other applications, we quickly realized that it mattered how easy it was to integrate a component. Enter the interface. Here are some of the best practices we learned while building them:&lt;/p&gt;
  12823.  
  12824. &lt;ul&gt;
  12825.  &lt;li&gt;Limit the number of files that need to be included to get a component to work. One JS and one SCSS file is ideal.&lt;/li&gt;
  12826.  &lt;li&gt;Make sure component and &lt;a href=&quot;https://facebook.github.io/react/docs/reusable-components.html&quot;&gt;React prop&lt;/a&gt; names make sense. For example, a click handler should have the word &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;click&lt;/code&gt; in it somewhere.&lt;/li&gt;
  12827.  &lt;li&gt;Write &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;prop&lt;/code&gt; definitions that allow developers to achieve the most types of interactions with the fewest props.&lt;/li&gt;
  12828. &lt;/ul&gt;
  12829.  
  12830. &lt;p&gt;The first point aims to limit the pain for app developers integrating components into their applications.&lt;/p&gt;
  12831.  
  12832. &lt;p&gt;The second and third points together form the essence of what we learned when designing component interfaces. The idea behind these two points is to encourage contributors to limit the number of &lt;a href=&quot;https://facebook.github.io/react/docs/reusable-components.html&quot;&gt;React props&lt;/a&gt; each component needs to be minimally functional, while defining each prop in a way that also supports more complicated use cases.&lt;/p&gt;
  12833.  
  12834. &lt;p&gt;To help explain what we mean by that, let’s take a look at the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;columns&lt;/code&gt; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;prop&lt;/code&gt; for our data table component:&lt;/p&gt;
  12835.  
  12836. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-js&quot; data-lang=&quot;js&quot;&gt;&lt;span class=&quot;nx&quot;&gt;columns&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;
  12837.    &lt;span class=&quot;na&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;String&lt;/span&gt;
  12838.    &lt;span class=&quot;na&quot;&gt;label&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;String&lt;/span&gt;
  12839.    &lt;span class=&quot;na&quot;&gt;accessor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;Function&lt;/span&gt;
  12840.    &lt;span class=&quot;na&quot;&gt;render&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;Function&lt;/span&gt;
  12841.    &lt;span class=&quot;na&quot;&gt;textAlignment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;TEXT_ALIGN_LEFT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;TEXT_ALIGN_RIGHT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  12842.    &lt;span class=&quot;na&quot;&gt;widthMultiplier&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;Number&lt;/span&gt;
  12843.    &lt;span class=&quot;na&quot;&gt;adminOnly&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;Boolean&lt;/span&gt;
  12844. &lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;...]&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  12845.  
  12846. &lt;p&gt;Let’s take a look at the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;accessor&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;render&lt;/code&gt; attributes that each column can have. In addition to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;columns&lt;/code&gt; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;prop&lt;/code&gt;, the table also expects an array of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;data&lt;/code&gt;. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;accessor&lt;/code&gt; in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;columns&lt;/code&gt; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;prop&lt;/code&gt; is a function that, when given a data item, returns a value that is then fed to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;render&lt;/code&gt; function for that same column.&lt;/p&gt;
  12847.  
  12848. &lt;p&gt;Together, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;accessor&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;render&lt;/code&gt; support the simple use case of displaying information for a given data item without the data table component having to know the internal organization of each item.&lt;/p&gt;
  12849.  
  12850. &lt;p&gt;However, because the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;accessor&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;render&lt;/code&gt; attributes are functions, they also support more complicated use cases. For example, these two attributes on a given column can be used to render line charts that summarize data for a particular data item. To do that, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;accessor&lt;/code&gt; could return an array of data points, while the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;render&lt;/code&gt; can return anything as long as &lt;a href=&quot;https://facebook.github.io/react/&quot;&gt;React&lt;/a&gt; can render it. This allows the app developer using the component to return JSX or a React element for the line chart. For example:&lt;/p&gt;
  12851.  
  12852. &lt;center&gt;
  12853. &lt;img alt=&quot;Rollup Data Table complicated use case&quot; src=&quot;/images/post_images/rollup_data_table_use_case.png&quot; /&gt;
  12854. &lt;/center&gt;
  12855.  
  12856. &lt;p&gt;In the above example, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;accessor&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;render&lt;/code&gt; functions are general enough to support straightforward uses of the data table and more complicated ones as well.&lt;/p&gt;
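
&lt;p&gt;The way an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;accessor&lt;/code&gt; feeds a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;render&lt;/code&gt; can be modeled in a few lines. The sketch below is a plain Python illustration of the idea, not the React data table itself: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;render_row&lt;/code&gt; helper, the sample columns, and the sample item are all hypothetical, and the text chart stands in for a real line chart.&lt;/p&gt;

```python
def render_row(columns, item):
    """Return one rendered cell per column for a single data item."""
    cells = []
    for column in columns:
        value = column["accessor"](item)       # pull the raw value out of the item
        cells.append(column["render"](value))  # turn it into something displayable
    return cells


# Hypothetical columns: a plain text cell and a crude text "chart".
columns = [
    {"key": "name", "label": "Name",
     "accessor": lambda item: item["name"],
     "render": str},
    {"key": "trend", "label": "Trend",
     "accessor": lambda item: item["clicks"],
     "render": lambda points: " ".join("#" * p for p in points)},
]

item = {"name": "Campaign A", "clicks": [1, 3, 2]}
print(render_row(columns, item))
```

&lt;p&gt;The table code never inspects the item directly, so the same loop serves both the simple column and the chart-like one.&lt;/p&gt;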
  12857.  
  12858. &lt;h2 id=&quot;making-everything-look-the-same-is-hard&quot;&gt;Making everything look the same is hard&lt;/h2&gt;
  12859.  
  12860. &lt;p&gt;Making things look the same across products and components developed in isolation is not an easy feat. For example, the blue used in the date picker needs to be the same blue that’s used in our data table. As another example, the top navbar needs to be pixel perfect across products, so that the end user feels like they are navigating within the same application, even if that’s not how the product is implemented.&lt;/p&gt;
  12861.  
  12862. &lt;p&gt;To solve this problem, we came up with a Rollup component that maintains styles shared by applications and components. This component is called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ar-style-base&lt;/code&gt;, and all common styles are built into it:&lt;/p&gt;
  12863.  
  12864. &lt;ul&gt;
  12865.  &lt;li&gt;Common colors are built in as &lt;a href=&quot;http://sass-lang.com/&quot;&gt;Sass&lt;/a&gt; variables&lt;/li&gt;
  12866.  &lt;li&gt;Common icons and typography are also defined for components and different products&lt;/li&gt;
  12867.  &lt;li&gt;We also include bootstrap customizations in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ar-style-base&lt;/code&gt; since many of our UI components use &lt;a href=&quot;http://react-bootstrap.github.io/&quot;&gt;React Bootstrap&lt;/a&gt;&lt;/li&gt;
  12868. &lt;/ul&gt;
  12869.  
  12870. &lt;p&gt;Even though the base styles can be overridden, they provide a good starting point for new components and products that need to share colors, typography, and iconography.&lt;/p&gt;
  12871.  
  12872. &lt;h2 id=&quot;so-whats-next&quot;&gt;So what’s next?&lt;/h2&gt;
  12873.  
  12874. &lt;p&gt;We continue to encourage the adoption of our components at AdRoll and share the things we have learned. Not only will sharing what we’ve learned internally help grow our contributor pool, but it will also help grow the frontend expertise of the company as a whole.&lt;/p&gt;
  12875.  
  12876. &lt;p&gt;As we mentioned at the end of the &lt;a href=&quot;/blog/frontend/2015/11/05/rollup-shared-ui-components.html&quot;&gt;first blog post in this series&lt;/a&gt;, we decided to build our own components since we were not happy with open-source solutions. So, the next big thing we have in mind for Rollup is to open source its components one at a time.&lt;/p&gt;
  12877.  
  12878. &lt;p&gt;Thanks for reading this series and we hope you have enjoyed what we have to share about Rollup!&lt;/p&gt;
  12879. </description>
  12880.    </item>
  12881.    
  12882.    
  12883.    
  12884.    <item>
  12885.      <title>
  12886. Build a simple distributed system using AWS Lambda, Python, and DynamoDB
  12887. </title>
  12888.      <link>https://tech.nextroll.com/blog/dev/2015/11/16/count-things-with-aws-lambda-python-and-dynamodb.html</link>
  12889.      <pubDate>Mon, 16 Nov 2015 00:00:00 -0800</pubDate>
  12890.      <author></author>
  12891.      <guid isPermaLink="false">https://tech.nextroll.com/blog/dev/2015/11/16/count-things-with-aws-lambda-python-and-dynamodb</guid>
  12892.      <description>&lt;style type=&quot;text/css&quot;&gt;/*&lt;![CDATA[*/
  12893.  
  12894.  .highlight code { background-color: #272822; }
  12895.  .highlight code.sh .nb { color: #E6DB74; }
  12896.  .highlight code.sh .nv { color: #4DFFFF; }
  12897.  .highlight code.sh .s2 { color: #FDA533; }
  12898.  .highlight code.sh .s1 { color: #FFA52F; }
  12899.  .highlight code.sh  .s { color: #FBE72E; }
  12900.  .highlight code.sh  .k { color: #61FF70; }
  12901.  .highlight code.sh  .c { color: #FED633; }
  12902.  .highlight code.sh  .o { color: #5CEA26; }
  12903.  
  12904.  .highlight code.json .nt { color: #30F926; }
  12905.  .highlight code.json .mi { color: #00FDFD; }
  12906.  .highlight code.json .s2 { color: #FDA533; }
  12907.  .highlight code.json .kc { color: #FFFC49; }
  12908.  .highlight code.json  .p { color: #AE81FF; }
  12909.  
  12910.  .highlight code.python { background-color: #315050; }
  12911.  .highlight code.python  .c { color: #FED633; }
  12912.  .highlight code.python  .n { color: #CBBA77; }
  12913.  .highlight code.python  .o { color: #CBBA77; }
  12914.  .highlight code.python  .p { color: #CBBA77; }
  12915.  .highlight code.python  .s { color: #FD9226; }
  12916.  .highlight code.python .nn { color: #CBBA77; }
  12917.  .highlight code.python .kn { color: #4DFFFF; }
  12918.  .highlight code.python  .k { color: #4DFFFF; }
  12919.  .highlight code.python .ow { color: #2DFFFE; }
  12920.  .highlight code.python .nb { color: #29F89D; }
  12921.  .highlight code.python .nf { color: #29F89D; }
  12922.  .highlight code.python .bp { color: #FEA07E; }
  12923.  .highlight code.python .mi { color: #CBBA77; }
  12924.  
  12925.  table.cost-comparison thead { background-color: #BDE4F3; }
  12926.  table.cost-comparison tbody { background-color: #D7F2FB; }
  12927.  table.cost-comparison tbody tr:nth-child(even) { background-color: #D0F0F0; }
  12928.  table.cost-comparison td, th { padding-right: 0.5em; padding-left: 0.5em; }
  12929.  table.cost-comparison td ~ td, th ~ th { padding-left: 1em; }
  12930.  
  12931.  table.cost-summary td, th { padding-right: 0.5em; padding-left: 0.5em; }
  12932.  table.cost-summary td ~ td, th ~ th { padding-left: 1em; }
  12933.  table.cost-summary td:nth-child(1) { background-color: #BDE4F3; font-weight: bold; }
  12934.  table.cost-summary td:nth-child(1)::after { content: &quot;:&quot;; }
  12935.  table.cost-summary td:nth-child(2) { background-color: #D7F2FB; }
  12936.  
  12937.  table.math-legend td, th { padding-right: 0.5em; padding-left: 0.5em; }
  12938.  table.math-legend td ~ td, th ~ th { padding-left: 1em; }
  12939.  table.math-legend td:nth-child(1) { background-color: #BDE4F3; font-weight: bold; }
  12940.  table.math-legend td:nth-child(1)::after { content: &quot;:&quot;; }
  12941.  table.math-legend td:nth-child(2) { background-color: #D7F2FB; }
  12942.  table.math-legend { margin-top: -1em; margin-left: auto; margin-right: auto; margin-bottom: 1em; }
  12943.  
  12944. /*]]&gt;*/&lt;/style&gt;
  12945.  
  12946. &lt;p&gt;We have implemented a number of systems in support of our &lt;a href=&quot;http://www.erlang.org/&quot;&gt;Erlang&lt;/a&gt;-based
  12947.  real-time bidding platform.  One of these is a &lt;a href=&quot;http://www.celeryproject.org/&quot;&gt;Celery&lt;/a&gt; task system which runs
  12948.  code implemented in Python on a set of worker instances running on &lt;a href=&quot;https://aws.amazon.com/ec2/&quot;&gt;Amazon
  12949.  EC2&lt;/a&gt;.&lt;/p&gt;
  12950.  
  12951. &lt;p&gt;With the &lt;a href=&quot;https://aws.amazon.com/blogs/aws/aws-lambda-update-python-vpc-increased-function-duration-scheduling-and-more/&quot;&gt;recent announcement&lt;/a&gt; of built-in support for Python
  12952.  in &lt;a href=&quot;http://aws.amazon.com/lambda/&quot;&gt;AWS Lambda&lt;/a&gt; functions (and upcoming access to VPC resources from
  12953.  Lambda), we’ve started considering increased use of Lambda for a number of applications.&lt;/p&gt;
  12954.  
  12955. &lt;p&gt;In this post, we’ll present a complete example of a data aggregation system using
  12956.  Python-based Lambda functions, S3 events, and DynamoDB triggers; and configured using
  12957.  the AWS command-line tools (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;awscli&lt;/code&gt;) wherever possible.  &lt;em&gt;(Note: Some of these steps
  12958.  are handled automatically when using the AWS console.)&lt;/em&gt;&lt;/p&gt;
  12959.  
  12960. &lt;p&gt;Here’s how our completed system will look:&lt;/p&gt;
  12961.  
  12962. &lt;p&gt;&lt;img src=&quot;/images/post_images/lambda-counter-aggregator-overview.png&quot; alt=&quot;Lambda-based counter aggregation system overview, image made with draw.io&quot; /&gt;&lt;/p&gt;
  12963.  
  12964. &lt;h2 id=&quot;why&quot;&gt;Why?&lt;/h2&gt;
  12965.  
  12966. &lt;p&gt;Each of our EC2 instances participating in RTB maintains a set of counter values: these
  12967.  represent the health of an important aspect of our platform.  Each instance reports
  12968.  these values using Kinesis, and also periodically uploads its set of counters to a
  12969.  per-instance key on S3 as part of a failsafe system.  When we aggregate these counter
  12970.  values, we have greater insight into whether our system as a whole is healthy (and we
  12971.  can take action automatically when it isn’t).  This post will focus on a system which
  12972.  can process the data we upload to S3.&lt;/p&gt;
  12973.  
  12974. &lt;p&gt;Here’s what we actually want to do with our per-instance counters:&lt;/p&gt;
  12975.  
  12976. &lt;ol&gt;
  12977.  &lt;li&gt;
  12978.    &lt;p&gt;Have an up-to-date global view of total summed counter values across all instances.&lt;/p&gt;
  12979.  &lt;/li&gt;
  12980.  &lt;li&gt;
  12981.    &lt;p&gt;Take action whenever the summed counter values change.&lt;/p&gt;
  12982.  &lt;/li&gt;
  12983. &lt;/ol&gt;
  12984.  
  12985. &lt;h3 id=&quot;current-solution&quot;&gt;Current solution&lt;/h3&gt;
  12986.  
  12987. &lt;p&gt;One of our Celery tasks &lt;em&gt;(A)&lt;/em&gt; implements counter aggregation, with the output consumed
  12988.  by a periodic downstream task &lt;em&gt;(B)&lt;/em&gt; implementing our business logic.  Every few minutes,
  12989.  task &lt;em&gt;A&lt;/em&gt; does the following:&lt;/p&gt;
  12990.  
  12991. &lt;ol&gt;
  12992.  &lt;li&gt;
  12993.    &lt;p&gt;Scans an S3 bucket prefix for keys having a certain naming convention.  &lt;em&gt;(Each of
  12994.  these keys represents counter data uploaded by a single EC2 instance, and each key name
  12995.  includes the current date and the ID of the instance which uploaded it.  An instance
  12996.  usually writes to the same key.)&lt;/em&gt;&lt;/p&gt;
  12997.  &lt;/li&gt;
  12998.  &lt;li&gt;
  12999.    &lt;p&gt;Reads the contents of all these keys, which have a format like the following:&lt;/p&gt;
  13000.  
  13001.    &lt;pre&gt;
  13002. COUNTER1 VALUE1
  13003. COUNTER2 VALUE2
  13004. COUNTER3 VALUE3&lt;/pre&gt;
  13005.  &lt;/li&gt;
  13006.  &lt;li&gt;
  13007.    &lt;p&gt;Sums all the values for each counter.  &lt;em&gt;(The same counter can appear in multiple
  13008.  files.)&lt;/em&gt;&lt;/p&gt;
  13009.  &lt;/li&gt;
  13010.  &lt;li&gt;
  13011.    &lt;p&gt;Writes the summed counter values to a single S3 key for use in task &lt;em&gt;B&lt;/em&gt;, which
  13012.  can then take action on the aggregate counter values.&lt;/p&gt;
  13013.  &lt;/li&gt;
  13014. &lt;/ol&gt;
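
&lt;p&gt;The parse-and-sum part of task &lt;em&gt;A&lt;/em&gt; (steps 2 and 3) might look roughly like the sketch below. It is an illustration only, not our production code: it assumes counter values are integers, works on file contents that have already been read from S3, and the function names and sample data are hypothetical.&lt;/p&gt;

```python
from collections import Counter


def parse_counter_file(text):
    """Parse 'NAME VALUE' lines into a Counter; blank lines are skipped."""
    counters = Counter()
    for line in text.splitlines():
        if not line.strip():
            continue
        name, value = line.split()
        counters[name] += int(value)  # assumption: values are integers
    return counters


def aggregate(file_contents):
    """Sum every counter across all per-instance files."""
    total = Counter()
    for text in file_contents:
        total.update(parse_counter_file(text))  # update() adds counts together
    return total


# Hypothetical contents of two per-instance keys.
instance_a = "COUNTER1 10\nCOUNTER2 5\n"
instance_b = "COUNTER1 7\nCOUNTER3 1\n"
totals = aggregate([instance_a, instance_b])
print(dict(totals))
```

&lt;p&gt;Every run re-reads and re-sums every file, whether or not it has changed.&lt;/p&gt;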
  13015.  
  13016. &lt;h3 id=&quot;drawbacks-of-current-solution&quot;&gt;Drawbacks of current solution&lt;/h3&gt;
  13017.  
  13018. &lt;p&gt;We’re essentially polling S3 to detect file changes, and we’re also reprocessing data
  13019.  which has not necessarily changed.  This is time-consuming and ties up resources which
  13020.  could have been allocated elsewhere.&lt;/p&gt;
  13021.  
  13022. &lt;p&gt;It also would be nice to react more quickly to updated counter values.  With our current
  13023.  implementation, there can be significant lag between a counter data upload and actually
  13024.  observing its contribution to the global aggregate value.&lt;/p&gt;
  13025.  
  13026. &lt;h2 id=&quot;enter-lambda&quot;&gt;Enter Lambda&lt;/h2&gt;
  13027.  
  13028. &lt;p&gt;Lambda promises scalable, serverless, event-based code execution with granular billing.
  13029.  This is a compelling value proposition for the following reasons:&lt;/p&gt;
  13030.  
  13031. &lt;ul&gt;
  13032.  &lt;li&gt;
  13033.    &lt;p&gt;Standalone code bundles are the unit of deployment;&lt;/p&gt;
  13034.  &lt;/li&gt;
  13035.  &lt;li&gt;
  13036.    &lt;p&gt;We can run our code while only paying for the time spent actually executing it instead
  13037. of spending money on idle CPU time;&lt;/p&gt;
  13038.  &lt;/li&gt;
  13039.  &lt;li&gt;
  13040.    &lt;p&gt;We can reduce the peak load experienced by our task system (by migrating certain
  13041. frequent and spiky tasks to Lambda), allowing us to reduce the amount of resources
  13042. allocated to it while increasing utilization of the remaining resources;&lt;/p&gt;
  13043.  &lt;/li&gt;
  13044.  &lt;li&gt;
  13045.    &lt;p&gt;Our counter system will perform its work in a more timely manner by responding to
  13046. events instead of scanning S3, and it will avoid reprocessing data which has not
  13047. changed.&lt;/p&gt;
  13048.  &lt;/li&gt;
  13049. &lt;/ul&gt;
  13050.  
  13051. &lt;p&gt;Using Lambda is now much easier thanks to the recently-announced built-in support for
  13052.  Python.  &lt;em&gt;(Note: it has always been possible to run Python-based code, but now it’s
  13053.  easier to get started.)&lt;/em&gt;&lt;/p&gt;
  13054.  
  13055. &lt;h2 id=&quot;system-design&quot;&gt;System design&lt;/h2&gt;
  13056.  
  13057. &lt;p&gt;We want a system which can react to S3 events, process and aggregate data associated
  13058.  with those events, and eventually convey some processed data back to S3 after executing
  13059.  some business logic.&lt;/p&gt;
  13060.  
  13061. &lt;h3 id=&quot;cost-comparison&quot;&gt;Cost comparison&lt;/h3&gt;
  13062.  
  13063. &lt;p&gt;We can do all of this using Lambda, but should we?  Let’s do a rough cost estimate
  13064.  assuming we have 100 instances each uploading to S3 a file containing 100 counters every
  13065.  60 seconds.&lt;/p&gt;
  13066.  
  13067. &lt;h4 id=&quot;original-system&quot;&gt;Original system&lt;/h4&gt;
  13068.  
  13069. &lt;p&gt;Suppose (1) we created our task system only to support &lt;a href=&quot;#current-solution&quot;&gt;our two tasks &lt;em&gt;A&lt;/em&gt; and
  13070.  &lt;em&gt;B&lt;/em&gt;&lt;/a&gt;, (2) that we scan S3 every five minutes, (3) that task &lt;em&gt;A&lt;/em&gt;
  13071.  completes in 60 seconds, and (4) that we deploy four times.&lt;/p&gt;
  13072.  
  13073. &lt;p&gt;We might then have the following costs:&lt;/p&gt;
  13074.  
  13075. &lt;table class=&quot;cost-comparison&quot;&gt;
  13076.  &lt;thead&gt;
  13077.    &lt;tr&gt;
  13078.      &lt;th style=&quot;text-align: right&quot;&gt;Description&lt;/th&gt;
  13079.      &lt;th style=&quot;text-align: left&quot;&gt;Cost&lt;/th&gt;
  13080.    &lt;/tr&gt;
  13081.  &lt;/thead&gt;
  13082.  &lt;tbody&gt;
  13083.    &lt;tr&gt;
  13084.      &lt;td style=&quot;text-align: right&quot;&gt;task scheduler (t2.micro)&lt;/td&gt;
  13085.      &lt;td style=&quot;text-align: left&quot;&gt;$0.013 / hour&lt;/td&gt;
  13086.    &lt;/tr&gt;
  13087.    &lt;tr&gt;
  13088.      &lt;td style=&quot;text-align: right&quot;&gt;task worker (m3.medium)&lt;/td&gt;
  13089.      &lt;td style=&quot;text-align: left&quot;&gt;$0.067 / hour&lt;/td&gt;
  13090.    &lt;/tr&gt;
  13091.    &lt;tr&gt;
  13092.      &lt;td style=&quot;text-align: right&quot;&gt;S3 usage&lt;/td&gt;
  13093.      &lt;td style=&quot;text-align: left&quot;&gt;negligible&lt;/td&gt;
  13094.    &lt;/tr&gt;
  13095.    &lt;tr&gt;
  13096.      &lt;td style=&quot;text-align: right&quot;&gt;design/configure/test/deploy task infrastructure&lt;/td&gt;
  13097.      &lt;td style=&quot;text-align: left&quot;&gt;16 hours&lt;/td&gt;
  13098.    &lt;/tr&gt;
  13099.    &lt;tr&gt;
  13100.      &lt;td style=&quot;text-align: right&quot;&gt;design/test this task&lt;/td&gt;
  13101.      &lt;td style=&quot;text-align: left&quot;&gt;4 hours&lt;/td&gt;
  13102.    &lt;/tr&gt;
  13103.    &lt;tr&gt;
  13104.      &lt;td style=&quot;text-align: right&quot;&gt;deploy this task&lt;/td&gt;
  13105.      &lt;td style=&quot;text-align: left&quot;&gt;0.5 hours / deploy&lt;/td&gt;
  13106.    &lt;/tr&gt;
  13107.    &lt;tr&gt;
  13108.      &lt;td style=&quot;text-align: right&quot;&gt;infrastructure maintenance&lt;/td&gt;
  13109.      &lt;td style=&quot;text-align: left&quot;&gt;1 hour / month&lt;/td&gt;
  13110.    &lt;/tr&gt;
  13111.  &lt;/tbody&gt;
  13112. &lt;/table&gt;
  13113.  
  13114. &lt;table class=&quot;cost-summary&quot;&gt;
  13115.  &lt;thead&gt;
  13116.    &lt;tr&gt;
  13117.      &lt;th style=&quot;text-align: right&quot;&gt; &lt;/th&gt;
  13118.      &lt;th style=&quot;text-align: left&quot;&gt; &lt;/th&gt;
  13119.    &lt;/tr&gt;
  13120.  &lt;/thead&gt;
  13121.  &lt;tbody&gt;
  13122.    &lt;tr&gt;
  13123.      &lt;td style=&quot;text-align: right&quot;&gt;Initial developer time&lt;/td&gt;
  13124.      &lt;td style=&quot;text-align: left&quot;&gt;22 hours&lt;/td&gt;
  13125.    &lt;/tr&gt;
  13126.    &lt;tr&gt;
  13127.      &lt;td style=&quot;text-align: right&quot;&gt;Ongoing developer time&lt;/td&gt;
  13128.      &lt;td style=&quot;text-align: left&quot;&gt;12 hours / year&lt;/td&gt;
  13129.    &lt;/tr&gt;
  13130.    &lt;tr&gt;
  13131.      &lt;td style=&quot;text-align: right&quot;&gt;Recurring costs&lt;/td&gt;
  13132.      &lt;td style=&quot;text-align: left&quot;&gt;$701 / year&lt;/td&gt;
  13133.    &lt;/tr&gt;
  13134.    &lt;tr&gt;
  13135.      &lt;td style=&quot;text-align: right&quot;&gt;Reaction time&lt;/td&gt;
  13136.      &lt;td style=&quot;text-align: left&quot;&gt;1 to 6 minutes&lt;/td&gt;
  13137.    &lt;/tr&gt;
  13138.  &lt;/tbody&gt;
  13139. &lt;/table&gt;
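
&lt;p&gt;The summary figures follow from the itemized costs. The sketch below reproduces the arithmetic under a few assumptions that are ours for illustration: both instances run continuously for a 365-day year, the four deploys are counted as part of the initial developer time, and reaction time is bounded by the scan interval plus the task duration.&lt;/p&gt;

```python
HOURS_PER_YEAR = 24 * 365  # assumption: non-leap year, instances always on

# Hourly instance prices from the cost table above.
scheduler_per_hour = 0.013  # t2.micro
worker_per_hour = 0.067     # m3.medium
recurring_per_year = (scheduler_per_hour + worker_per_hour) * HOURS_PER_YEAR

# One-off developer hours: infrastructure, task design/test, four deploys.
initial_dev_hours = 16 + 4 + 4 * 0.5

# Maintenance is one hour per month.
ongoing_dev_hours_per_year = 1 * 12

# Task A takes 1 minute and starts every 5 minutes.
best_case_minutes = 1       # upload lands just before a scan starts
worst_case_minutes = 5 + 1  # upload lands just after a scan starts

print(round(recurring_per_year), initial_dev_hours, ongoing_dev_hours_per_year)
print(best_case_minutes, worst_case_minutes)
```

&lt;p&gt;Rounded, these come to $701 per year, 22 hours, 12 hours per year, and a reaction time of 1 to 6 minutes.&lt;/p&gt;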
  13140.  
  13141. &lt;h4 id=&quot;lambda-based-system&quot;&gt;Lambda-based system&lt;/h4&gt;
  13142.  
  13143. &lt;p&gt;We’ll assume the following for our Lambda-based implementation:&lt;/p&gt;
  13144.  
  13145. &lt;ul&gt;
  13146.  &lt;li&gt;
  13147.    &lt;p&gt;100 files are uploaded to S3 every 60 seconds, each containing 100 counters.  We want
  13148. all data to be processed within 50 seconds.&lt;/p&gt;
  13149.  &lt;/li&gt;
  13150.  &lt;li&gt;
  13151.    &lt;p&gt;We’ll write counter values to a DynamoDB table.  To support 10,000 writes over 50
  13152. seconds, we’ll need 200 units of provisioned write capacity.  We assume our library
  13153. (&lt;a href=&quot;https://github.com/boto/boto3&quot;&gt;boto3&lt;/a&gt;) will handle retries and spread out our writes when we exceed our
  13154. provisioned write capacity and burst limits.&lt;/p&gt;
  13155.  &lt;/li&gt;
  13156.  &lt;li&gt;
  13157.    &lt;p&gt;To support the implementation described later in this post, we’ll have a second table
  13158. which is updated as a result of each change made to the first.  This second table
  13159. therefore will also need 200 units of provisioned write capacity.  It will be updated
  13160. by a second Lambda function (which we assume has the same characteristics as the
  13161. first) triggered by events on a DynamoDB stream.&lt;/p&gt;
  13162.  &lt;/li&gt;
  13163.  &lt;li&gt;
  13164.    &lt;p&gt;We’ll need to create some packaging / testing / deployment helper scripts to deploy
  13165. our code.&lt;/p&gt;
  13166.  &lt;/li&gt;
  13167. &lt;/ul&gt;
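&lt;p&gt;The capacity figures above come from simple arithmetic.  A minimal sketch in Python follows; the function name is ours, and it assumes every counter update consumes exactly one unit of write capacity (true for items up to 1 KB):&lt;/p&gt;

```python
import math

def required_write_capacity(files_per_batch, counters_per_file, deadline_seconds):
    """Writes per second needed to finish one batch within the deadline.

    Assumes one counter update costs one write capacity unit.
    """
    total_writes = files_per_batch * counters_per_file
    return math.ceil(total_writes / deadline_seconds)

# 100 files x 100 counters, processed within 50 seconds
print(required_write_capacity(100, 100, 50))  # 200
```

&lt;p&gt;The second table mirrors every change made to the first, so the same figure applies to it.&lt;/p&gt;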
  13168.  
  13169. &lt;table class=&quot;cost-comparison&quot;&gt;
  13170.  &lt;thead&gt;
  13171.    &lt;tr&gt;
  13172.      &lt;th style=&quot;text-align: right&quot;&gt;Description&lt;/th&gt;
  13173.      &lt;th style=&quot;text-align: left&quot;&gt;Cost&lt;/th&gt;
  13174.    &lt;/tr&gt;
  13175.  &lt;/thead&gt;
  13176.  &lt;tbody&gt;
  13177.    &lt;tr&gt;
  13178.      &lt;td style=&quot;text-align: right&quot;&gt;(table 1) provision 200 DynamoDB writes/sec&lt;/td&gt;
  13179.      &lt;td style=&quot;text-align: left&quot;&gt;$0.13 / hour&lt;/td&gt;
  13180.    &lt;/tr&gt;
  13181.    &lt;tr&gt;
  13182.      &lt;td style=&quot;text-align: right&quot;&gt;(table 2) provision 200 DynamoDB writes/sec&lt;/td&gt;
  13183.      &lt;td style=&quot;text-align: left&quot;&gt;$0.13 / hour&lt;/td&gt;
  13184.    &lt;/tr&gt;
  13185.    &lt;tr&gt;
  13186.      &lt;td style=&quot;text-align: right&quot;&gt;(table 1 stream) 200 GetRecords/sec (worst case)&lt;/td&gt;
  13187.      &lt;td style=&quot;text-align: left&quot;&gt;negligible&lt;/td&gt;
  13188.    &lt;/tr&gt;
  13189.    &lt;tr&gt;
  13190.      &lt;td style=&quot;text-align: right&quot;&gt;Lambda invocation cost&lt;/td&gt;
  13191.      &lt;td style=&quot;text-align: left&quot;&gt;negligible&lt;/td&gt;
  13192.    &lt;/tr&gt;
  13193.    &lt;tr&gt;
  13194.      &lt;td style=&quot;text-align: right&quot;&gt;Lambda duration cost @ 1536 MB, 2 * 100 * 50000 ms (worst case)&lt;/td&gt;
  13195.      &lt;td style=&quot;text-align: left&quot;&gt;$0.25 / hour&lt;/td&gt;
  13196.    &lt;/tr&gt;
  13197.    &lt;tr&gt;
  13198.      &lt;td style=&quot;text-align: right&quot;&gt;design/test Lambda deployment scripts&lt;/td&gt;
  13199.      &lt;td style=&quot;text-align: left&quot;&gt;4 hours&lt;/td&gt;
  13200.    &lt;/tr&gt;
  13201.    &lt;tr&gt;
  13202.      &lt;td style=&quot;text-align: right&quot;&gt;design/test needed Lambda functions&lt;/td&gt;
  13203.      &lt;td style=&quot;text-align: left&quot;&gt;4 hours&lt;/td&gt;
  13204.    &lt;/tr&gt;
  13205.    &lt;tr&gt;
  13206.      &lt;td style=&quot;text-align: right&quot;&gt;deploy needed Lambda functions&lt;/td&gt;
  13207.      &lt;td style=&quot;text-align: left&quot;&gt;5 minutes / deploy&lt;/td&gt;
  13208.    &lt;/tr&gt;
  13209.    &lt;tr&gt;
  13210.      &lt;td style=&quot;text-align: right&quot;&gt;infrastructure maintenance&lt;/td&gt;
  13211.      &lt;td style=&quot;text-align: left&quot;&gt;$0&lt;/td&gt;
  13212.    &lt;/tr&gt;
  13213.  &lt;/tbody&gt;
  13214. &lt;/table&gt;
  13215.  
  13216. &lt;table class=&quot;cost-summary&quot;&gt;
  13217.  &lt;thead&gt;
  13218.    &lt;tr&gt;
  13219.      &lt;th style=&quot;text-align: right&quot;&gt; &lt;/th&gt;
  13220.      &lt;th style=&quot;text-align: left&quot;&gt; &lt;/th&gt;
  13221.    &lt;/tr&gt;
  13222.  &lt;/thead&gt;
  13223.  &lt;tbody&gt;
  13224.    &lt;tr&gt;
  13225.      &lt;td style=&quot;text-align: right&quot;&gt;Initial developer time&lt;/td&gt;
  13226.      &lt;td style=&quot;text-align: left&quot;&gt;8 hours&lt;/td&gt;
  13227.    &lt;/tr&gt;
  13228.    &lt;tr&gt;
  13229.      &lt;td style=&quot;text-align: right&quot;&gt;Ongoing developer time&lt;/td&gt;
  13230.      &lt;td style=&quot;text-align: left&quot;&gt;20 minutes / year&lt;/td&gt;
  13231.    &lt;/tr&gt;
  13232.    &lt;tr&gt;
  13233.      &lt;td style=&quot;text-align: right&quot;&gt;Recurring costs (DynamoDB)&lt;/td&gt;
  13234.      &lt;td style=&quot;text-align: left&quot;&gt;$2279 / year&lt;/td&gt;
  13235.    &lt;/tr&gt;
  13236.    &lt;tr&gt;
  13237.      &lt;td style=&quot;text-align: right&quot;&gt;Recurring costs (Lambda)&lt;/td&gt;
  13238.      &lt;td style=&quot;text-align: left&quot;&gt;$2192 / year&lt;/td&gt;
  13239.    &lt;/tr&gt;
  13240.    &lt;tr&gt;
  13241.      &lt;td style=&quot;text-align: right&quot;&gt;Reaction time&lt;/td&gt;
  13242.      &lt;td style=&quot;text-align: left&quot;&gt;0.1 to 51 seconds&lt;/td&gt;
  13243.    &lt;/tr&gt;
  13244.  &lt;/tbody&gt;
  13245. &lt;/table&gt;
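&lt;p&gt;The yearly totals in the summary can be reproduced from the hourly rates listed in the comparison table.  This is only a sketch: it takes those hourly rates as given, and the 8766 hours per year (365.25 days) is our assumption, chosen because it matches the rounded totals shown.&lt;/p&gt;

```python
HOURS_PER_YEAR = 8766  # assumption: 365.25 days x 24 hours

def annual_cost(hourly_rate, count=1):
    """Yearly cost in whole dollars for `count` resources billed hourly."""
    return round(hourly_rate * count * HOURS_PER_YEAR)

dynamodb_per_year = annual_cost(0.13, count=2)  # two provisioned tables
lambda_per_year = annual_cost(0.25)             # worst-case duration cost

print(dynamodb_per_year, lambda_per_year)
```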
  13246.  
  13247. &lt;h4 id=&quot;worth-it-possibly&quot;&gt;Worth it? Possibly.&lt;/h4&gt;
  13248.  
  13249. &lt;ul&gt;
  13250.  &lt;li&gt;
  13251.    &lt;p&gt;We’ve greatly improved the upper (7x) and lower (600x) bounds for hypothetical system
  13252. reaction time (length of time after a file upload before we can start observing its
  13253. effects).  With the Lambda-based system, we start updating our stored state in
  13254. DynamoDB as soon as our function begins executing.&lt;/p&gt;
  13255.  &lt;/li&gt;
  13256.  &lt;li&gt;
  13257.    &lt;p&gt;Less developer time is required to get a working system.  The original design requires
  13258. almost 3x initial developer time, and almost 40x ongoing time to maintain.  The Lambda
  13259. solution saves 3 developer-days of time in the first year.&lt;/p&gt;
  13260.  &lt;/li&gt;
  13261.  &lt;li&gt;
  13262.    &lt;p&gt;More money is required on an ongoing basis (about 6.4x), but we don’t have
  13263. to manage any infrastructure, which leaves more time for other projects.&lt;/p&gt;
  13264.  &lt;/li&gt;
  13265.  &lt;li&gt;
  13266.    &lt;p&gt;It’s easier to scale (up to a point):&lt;/p&gt;
  13267.  
  13268.    &lt;ul&gt;
  13269.      &lt;li&gt;
  13270.        &lt;p&gt;If we wanted to halve our maximum reaction time, we could double our provisioned
  13271. DynamoDB write capacity without changing anything else.  &lt;em&gt;(This would double our
  13272. DynamoDB costs and halve our worst-case Lambda costs, for a total recurring cost
  13273. increase of 26%.)&lt;/em&gt;&lt;/p&gt;
  13274.      &lt;/li&gt;
  13275.      &lt;li&gt;
  13276.        &lt;p&gt;If our RTB system were to suddenly double in size, we could simply double our
  13277. provisioned write capacity using the console without exploring alternate
  13278. designs or modifying our infrastructure!&lt;/p&gt;
  13279.      &lt;/li&gt;
  13280.    &lt;/ul&gt;
  13281.  &lt;/li&gt;
  13282. &lt;/ul&gt;
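&lt;p&gt;The ratios quoted in this list can be checked against the two summary tables.  The sketch below copies the table values into dictionaries (the names are ours) and recomputes each comparison, including the scaling scenario in which DynamoDB capacity is doubled:&lt;/p&gt;

```python
# Figures copied from the two cost-summary tables above.
ORIGINAL = {"initial_hours": 22, "ongoing_hours": 12, "recurring_cost": 701,
            "reaction_seconds": (60, 360)}
LAMBDA_BASED = {"initial_hours": 8, "ongoing_hours": 20 / 60,
                "recurring_cost": 2279 + 2192, "reaction_seconds": (0.1, 51)}

lower_bound_gain = ORIGINAL["reaction_seconds"][0] / LAMBDA_BASED["reaction_seconds"][0]
upper_bound_gain = ORIGINAL["reaction_seconds"][1] / LAMBDA_BASED["reaction_seconds"][1]
cost_ratio = LAMBDA_BASED["recurring_cost"] / ORIGINAL["recurring_cost"]

# First-year developer hours saved (initial plus ongoing).
hours_saved = (ORIGINAL["initial_hours"] + ORIGINAL["ongoing_hours"]
               - LAMBDA_BASED["initial_hours"] - LAMBDA_BASED["ongoing_hours"])

# Halving the maximum reaction time: DynamoDB cost doubles while the
# worst-case Lambda duration cost halves.
scaled_cost = 2 * 2279 + 2192 / 2
cost_increase = scaled_cost / LAMBDA_BASED["recurring_cost"] - 1

print(round(lower_bound_gain), round(upper_bound_gain, 1), round(cost_ratio, 1))
print(round(hours_saved, 1), "{:.0%}".format(cost_increase))
```

&lt;p&gt;Assuming an eight-hour working day, the hours saved correspond to roughly three developer-days.&lt;/p&gt;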
  13283.  
  13284. &lt;h4 id=&quot;theres-more-than-one-way-to-do-it&quot;&gt;There’s More Than One Way To Do It™&lt;/h4&gt;
  13285.  
  13286. &lt;p&gt;We could take a different approach which (1) doesn’t involve Lambda or DynamoDB, or (2)
  13287.  uses these services in a different way.  This cost estimate shows only that using Lambda
  13288.  for a system like the one described here is feasible.&lt;/p&gt;
  13289.  
  13290. &lt;h4 id=&quot;eventual-consistency&quot;&gt;Eventual consistency&lt;/h4&gt;
  13291.  
  13292. &lt;p&gt;Working with distributed systems requires balancing trade-offs.  It’s worth noting that
  13293.  the Lambda-based system we describe exhibits eventual consistency: the state we can
  13294.  observe at any particular moment in time may not yet include all updates we’ve made, but
  13295.  all updates will eventually be observable if we stop performing them.  This is due to
  13296.  the use of DynamoDB and S3 as well as the system’s overall design.&lt;/p&gt;
  13297.  
  13298. &lt;ul&gt;
  13299.  &lt;li&gt;
  13300.    &lt;p&gt;DynamoDB supports eventually consistent reads (the default) as well as
  13301. strongly consistent reads.  The consistency time scale is relatively small (usually
  13302. “within one second”) compared to other parts of our system.  Whenever we read an item,
  13303. we’ll observe a state which is the result of some sequence of atomic updates (but not
  13304. necessarily all of the most recent updates).  We can live with this trade-off.&lt;/p&gt;
  13305.  &lt;/li&gt;
  13306.  &lt;li&gt;
  13307.    &lt;p&gt;S3 supports eventual consistency for overwritten objects, and the typical consistency
  13308. time scale is again relatively small—on the order of seconds.  We shouldn’t need
  13309. to worry about it as long as we don’t receive out-of-order event notifications for the
  13310. same key: we’ll always read data which was valid at one point in time, and we can
  13311. check timestamps and versions if we’re concerned about stale events or reads.&lt;/p&gt;
  13312.  &lt;/li&gt;
  13313. &lt;/ul&gt;
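&lt;p&gt;As an illustration of the timestamp check mentioned above, here is a minimal sketch that ignores an event when a newer one for the same key has already been handled.  The helper names are hypothetical, and the in-memory dictionary is a stand-in: state held this way does not survive between Lambda invocations, so a real system would need to keep it somewhere durable such as a DynamoDB item.&lt;/p&gt;

```python
def make_stale_filter():
    """Return a function that accepts an event only if it is newer than the
    last accepted event for the same key."""
    latest_seen = {}  # key -> most recent event time accepted so far

    def is_fresh(key, event_time):
        previous = latest_seen.get(key)
        # ISO-8601 UTC timestamps in one fixed format sort lexicographically,
        # so plain string comparison orders them by time.
        if previous is not None and max(previous, event_time) == previous:
            return False  # same age as, or older than, what we already handled
        latest_seen[key] = event_time
        return True

    return is_fresh

is_fresh = make_stale_filter()
print(is_fresh("path/to/data/a", "2015-11-16T21:08:38.830Z"))  # True
print(is_fresh("path/to/data/a", "2015-11-16T21:08:37.000Z"))  # False
```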
  13314.  
  13315. &lt;p&gt;Our system as a whole wouldn’t be strongly consistent even if S3 and DynamoDB presented
  13316.  only strongly consistent interfaces, as we can observe our system’s output (contents of
  13317.  second DynamoDB table) before all current input (S3 file uploads) has been processed.
  13318.  We essentially have a view of an atomic update stream which periodically will be a few
  13319.  seconds out of date; this is fine for our purposes.&lt;/p&gt;
  13320.  
  13321. &lt;h2 id=&quot;implementation&quot;&gt;Implementation&lt;/h2&gt;
  13322.  
  13323. &lt;p&gt;Enough background—let’s implement our system!&lt;/p&gt;
  13324.  
  13325. &lt;p&gt;We need to react to S3 file uploads.  We can configure our S3 bucket to emit events
  13326.  which will cause a Lambda function to run.  Our first new component will therefore look
  13327.  like this:&lt;/p&gt;
  13328.  
  13329. &lt;p&gt;&lt;img src=&quot;/images/post_images/counter-upload-processor-1.png&quot; alt=&quot;initial counter upload processor&quot; /&gt;&lt;/p&gt;
  13330.  
  13331. &lt;h4 id=&quot;sns-topic-creation&quot;&gt;SNS topic creation&lt;/h4&gt;
  13332.  
  13333. &lt;p&gt;We’ll create an SNS topic to receive S3 events.  While Lambda directly supports S3
  13334.  events, using an SNS topic will allow us to more easily configure additional S3 event
  13335.  handlers in the future for events having the same prefix.  &lt;em&gt;(S3 doesn’t currently
  13336.  support event handlers dispatching on prefixes which overlap.)&lt;/em&gt;&lt;/p&gt;
  13337.  
  13338. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;nv&quot;&gt;REGION&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;us-west-2&quot;&lt;/span&gt;
  13339. &lt;span class=&quot;nv&quot;&gt;BUCKET&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;your-bucket-name&quot;&lt;/span&gt;
  13340. &lt;span class=&quot;nv&quot;&gt;TOPIC_NAME&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;your-file-upload-topic&quot;&lt;/span&gt;
  13341. &lt;span class=&quot;nv&quot;&gt;SOURCE_ARN&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;arn:aws:s3:::&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$BUCKET&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
  13342.  
  13343. &lt;span class=&quot;c&quot;&gt;# creating a topic gives us its ARN.  we&apos;ll need that later along with&lt;/span&gt;
  13344. &lt;span class=&quot;c&quot;&gt;# our account number:&lt;/span&gt;
  13345. &lt;span class=&quot;nv&quot;&gt;TOPIC_ARN&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;aws sns create-topic &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13346.              &lt;span class=&quot;nt&quot;&gt;--region&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$REGION&lt;/span&gt;   &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13347.              &lt;span class=&quot;nt&quot;&gt;--name&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$TOPIC_NAME&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13348.              &lt;span class=&quot;nt&quot;&gt;--output&lt;/span&gt; text      &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13349.              &lt;span class=&quot;nt&quot;&gt;--query&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;TopicArn&apos;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;
  13350. &lt;span class=&quot;nv&quot;&gt;ACCOUNT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$TOPIC_ARN&lt;/span&gt; | &lt;span class=&quot;nb&quot;&gt;awk&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-F&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;:&apos;&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;{print $5}&apos;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  13351.  
  13352. &lt;p&gt;We also need to configure the SNS topic’s policy to allow our S3 bucket to publish
  13353.  events to it:&lt;/p&gt;
  13354.  
  13355. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;nv&quot;&gt;POLICY&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;python &lt;span class=&quot;nt&quot;&gt;-c&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;import json; print(json.dumps(
  13356. {&apos;Version&apos;: &apos;2008-10-17&apos;,
  13357.       &apos;Id&apos;: &apos;upload-events-policy&apos;,
  13358. &apos;Statement&apos;: [
  13359.  
  13360.  {&apos;Resource&apos;: &apos;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$TOPIC_ARN&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&apos;,
  13361.     &apos;Effect&apos;: &apos;Allow&apos;,
  13362.        &apos;Sid&apos;: &apos;allow_s3_publish&apos;,
  13363.     &apos;Action&apos;: &apos;SNS:Publish&apos;,
  13364.  &apos;Condition&apos;: {&apos;ArnEquals&apos;: {&apos;aws:sourceArn&apos;: &apos;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$SOURCE_ARN&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&apos;}},
  13365.  &apos;Principal&apos;: &apos;*&apos;},
  13366.  
  13367.  {&apos;Resource&apos;: &apos;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$TOPIC_ARN&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&apos;,
  13368.     &apos;Effect&apos;: &apos;Allow&apos;,
  13369.        &apos;Sid&apos;: &apos;owner_sns_permissions&apos;,
  13370.     &apos;Action&apos;: [&apos;SNS:Subscribe&apos;, &apos;SNS:ListSubscriptionsByTopic&apos;,
  13371.                &apos;SNS:DeleteTopic&apos;, &apos;SNS:GetTopicAttributes&apos;,
  13372.                &apos;SNS:Publish&apos;, &apos;SNS:RemovePermission&apos;,
  13373.                &apos;SNS:AddPermission&apos;, &apos;SNS:Receive&apos;,
  13374.                &apos;SNS:SetTopicAttributes&apos;],
  13375.  &apos;Condition&apos;: {&apos;StringEquals&apos;: {&apos;AWS:SourceOwner&apos;: &apos;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$ACCOUNT&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&apos;}},
  13376.  &apos;Principal&apos;: {&apos;AWS&apos;: &apos;*&apos;}}
  13377. ]}))&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;
  13378.  
  13379. aws sns set-topic-attributes &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13380.  &lt;span class=&quot;nt&quot;&gt;--region&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$REGION&lt;/span&gt;           &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13381.  &lt;span class=&quot;nt&quot;&gt;--topic-arn&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$TOPIC_ARN&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;   &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13382.  &lt;span class=&quot;nt&quot;&gt;--attribute-name&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;Policy&quot;&lt;/span&gt;  &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13383.  &lt;span class=&quot;nt&quot;&gt;--attribute-value&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$POLICY&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
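&lt;p&gt;If the inline quoting above becomes hard to maintain, the same document can be built in a standalone Python script.  The sketch below covers only the first statement (the one that lets the bucket publish); the function name is ours and the ARNs are placeholders.&lt;/p&gt;

```python
import json

def s3_publish_policy(topic_arn, bucket_arn):
    """Build an SNS topic policy letting one S3 bucket publish to the topic."""
    return {
        "Version": "2008-10-17",
        "Id": "upload-events-policy",
        "Statement": [{
            "Sid": "allow_s3_publish",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "SNS:Publish",
            "Resource": topic_arn,
            # only events originating from this bucket may be published
            "Condition": {"ArnEquals": {"aws:sourceArn": bucket_arn}},
        }],
    }

# placeholder ARNs for illustration only
policy = s3_publish_policy(
    "arn:aws:sns:us-west-2:123456789012:your-file-upload-topic",
    "arn:aws:s3:::your-bucket-name",
)
print(json.dumps(policy, indent=2))
```

&lt;p&gt;The printed JSON could then be passed as the attribute value in the command above.&lt;/p&gt;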
  13384.  
  13385. &lt;h4 id=&quot;s3-bucket-configuration&quot;&gt;S3 bucket configuration&lt;/h4&gt;
  13386.  
  13387. &lt;p&gt;We need to configure our S3 bucket to publish events to the topic.  If any event
  13388.  settings already exist for the bucket, we should instead do this in the S3 console:&lt;/p&gt;
  13389.  
  13390. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;c&quot;&gt;# we&apos;re interested in events occurring under this prefix:&lt;/span&gt;
  13391. &lt;span class=&quot;nv&quot;&gt;PREFIX&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;path/to/data/&quot;&lt;/span&gt;
  13392.  
  13393. &lt;span class=&quot;c&quot;&gt;# we don&apos;t want to accidentally overwrite an existing event&lt;/span&gt;
  13394. &lt;span class=&quot;c&quot;&gt;# configuration:&lt;/span&gt;
  13395. &lt;span class=&quot;nv&quot;&gt;EXISTING&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;aws s3api get-bucket-notification-configuration &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13396.             &lt;span class=&quot;nt&quot;&gt;--bucket&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$BUCKET&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;
  13397.  
  13398. &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-z&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$EXISTING&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;then&lt;/span&gt;
  13399.  &lt;span class=&quot;c&quot;&gt;# there was no existing configuration.  install our new one:&lt;/span&gt;
  13400.  &lt;span class=&quot;nv&quot;&gt;EVENT_CONFIG&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;python &lt;span class=&quot;nt&quot;&gt;-c&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;import json; print(json.dumps(
  13401.   {&apos;TopicConfigurations&apos;: [
  13402.     {      &apos;Id&apos;: &apos;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$TOPIC_NAME&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt; events&apos;,
  13403.      &apos;TopicArn&apos;: &apos;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$TOPIC_ARN&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&apos;,
  13404.        &apos;Events&apos;: [&apos;s3:ObjectCreated:*&apos;],
  13405.        &apos;Filter&apos;: {
  13406.          &apos;Key&apos;: { &apos;FilterRules&apos;: [
  13407.                    { &apos;Name&apos;: &apos;prefix&apos;, &apos;Value&apos;: &apos;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$PREFIX&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&apos; }
  13408.                    ]}
  13409.        }
  13410.     }]}))&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;
  13411.  
  13412.  aws s3api put-bucket-notification-configuration &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13413.    &lt;span class=&quot;nt&quot;&gt;--bucket&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$BUCKET&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13414.    &lt;span class=&quot;nt&quot;&gt;--notification-configuration&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$EVENT_CONFIG&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
  13415. &lt;span class=&quot;k&quot;&gt;else
  13416.  &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;bucket already has an event configuration.&quot;&lt;/span&gt;
  13417.  &lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;use the S3 console to configure events.&quot;&lt;/span&gt;
  13418. &lt;span class=&quot;k&quot;&gt;fi&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
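&lt;p&gt;Because the bucket publishes to an SNS topic, a subscribed Lambda function receives each S3 notification wrapped in an SNS envelope.  The sketch below shows one way a handler might unwrap it.  It assumes Python 3, the function name is ours, and the sample event is hand-built with only the fields the code reads.&lt;/p&gt;

```python
import json
from urllib.parse import unquote_plus

def uploaded_objects(event):
    """Yield (bucket, key) pairs from an SNS-delivered S3 notification."""
    for sns_record in event.get("Records", []):
        # SNS delivers the S3 notification as a JSON string in Sns.Message
        s3_event = json.loads(sns_record["Sns"]["Message"])
        # messages without a Records list are skipped
        for s3_record in s3_event.get("Records", []):
            bucket = s3_record["s3"]["bucket"]["name"]
            # object keys arrive URL-encoded in S3 notifications
            key = unquote_plus(s3_record["s3"]["object"]["key"])
            yield bucket, key

# hand-built sample event containing only the fields read above
sample = {"Records": [{"Sns": {"Message": json.dumps({"Records": [
    {"s3": {"bucket": {"name": "your-bucket-name"},
            "object": {"key": "path/to/data/file+1.json"}}}]})}}]}
print(list(uploaded_objects(sample)))
```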
  13419.  
  13420. &lt;h4 id=&quot;lambda-function-creation&quot;&gt;Lambda function creation&lt;/h4&gt;
  13421.  
  13422. &lt;p&gt;We need to choose a name for our first Lambda function, create a role to use when
  13423.  executing, and enable AWS Lambda to assume that role so it can actually execute our
  13424.  function:&lt;/p&gt;
  13425.  
  13426. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;nv&quot;&gt;FUNCTION1_NAME&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;counter-upload-processor&quot;&lt;/span&gt;
  13427.  
  13428. &lt;span class=&quot;nv&quot;&gt;ASSUMEROLE_POLICY&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;python &lt;span class=&quot;nt&quot;&gt;-c&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;import json; print(json.dumps(
  13429.  {  &apos;Version&apos;: &apos;2012-10-17&apos;,
  13430.   &apos;Statement&apos;: [{      &apos;Sid&apos;: &apos;&apos;,
  13431.                     &apos;Effect&apos;: &apos;Allow&apos;,
  13432.                  &apos;Principal&apos;: {&apos;Service&apos;: &apos;lambda.amazonaws.com&apos;},
  13433.                     &apos;Action&apos;: &apos;sts:AssumeRole&apos;
  13434.                 }] }))&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;
  13435.  
  13436. &lt;span class=&quot;nv&quot;&gt;ROLE1_NAME&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$FUNCTION1_NAME&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;-execute&quot;&lt;/span&gt;
  13437.  
  13438. &lt;span class=&quot;c&quot;&gt;# create the execution role and save its ARN for later:&lt;/span&gt;
  13439. &lt;span class=&quot;nv&quot;&gt;EXEC_ROLE1_ARN&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;aws iam create-role        &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13440.                  &lt;span class=&quot;nt&quot;&gt;--region&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$REGION&lt;/span&gt;          &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13441.                  &lt;span class=&quot;nt&quot;&gt;--role-name&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$ROLE1_NAME&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13442.                  &lt;span class=&quot;nt&quot;&gt;--assume-role-policy-document&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$ASSUMEROLE_POLICY&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13443.                  &lt;span class=&quot;nt&quot;&gt;--output&lt;/span&gt; text &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13444.                  &lt;span class=&quot;nt&quot;&gt;--query&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;Role.Arn&apos;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  13445.  
  13446. &lt;p&gt;Lambda functions normally write their logs to a CloudWatch log group whose name is
  13447.  derived from the function’s name.  Let’s create a log group for our function:&lt;/p&gt;
  13448.  
  13449. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;nv&quot;&gt;LOG_GROUP1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;/aws/lambda/&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$FUNCTION1_NAME&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
  13450.  
  13451. aws logs create-log-group &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13452.  &lt;span class=&quot;nt&quot;&gt;--region&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$REGION&lt;/span&gt;        &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13453.  &lt;span class=&quot;nt&quot;&gt;--log-group-name&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$LOG_GROUP1&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  13454.  
  13455. &lt;p&gt;To actually emit logs, we need to grant some permissions to the role used by our Lambda
  13456.  function:&lt;/p&gt;
  13457.  
  13458. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;nv&quot;&gt;CLOUDWATCH_POLICY1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;python &lt;span class=&quot;nt&quot;&gt;-c&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;import json; print(json.dumps(
  13459.  { &apos;Version&apos;: &apos;2012-10-17&apos;,
  13460.    &apos;Statement&apos;: [
  13461.    { &apos;Effect&apos;: &apos;Allow&apos;,
  13462.      &apos;Action&apos;: [&apos;logs:PutLogEvents&apos;,
  13463.                 &apos;logs:CreateLogStream&apos;],
  13464.    &apos;Resource&apos;: &apos;arn:aws:logs:&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$REGION&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$ACCOUNT&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;:log-group:&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$LOG_GROUP1&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;:*&apos; }
  13465.   ]}))&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;
  13466.  
  13467. aws iam put-role-policy    &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13468.  &lt;span class=&quot;nt&quot;&gt;--region&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$REGION&lt;/span&gt;         &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13469.  &lt;span class=&quot;nt&quot;&gt;--role-name&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$ROLE1_NAME&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13470.  &lt;span class=&quot;nt&quot;&gt;--policy-name&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;emit-cloudwatch-logs&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13471.  &lt;span class=&quot;nt&quot;&gt;--policy-document&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$CLOUDWATCH_POLICY1&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  13472.  
  13473. &lt;p&gt;We can now create a Python Lambda function.  This naming convention and zip file layout
  13474.  will allow it to be edited using the AWS Lambda console:&lt;/p&gt;
  13475.  
  13476. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;nb&quot;&gt;cat&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;lambda_function.py &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;EOF&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;
  13477. import logging
  13478.  
  13479. def lambda_handler(event, context):
  13480.    logging.getLogger().setLevel(logging.INFO)
  13481.    logging.info(&apos;got event: {}&apos;.format(event))
  13482.    logging.info(&apos;got context: {}&apos;.format(context))
  13483.    return False
  13484. &lt;/span&gt;&lt;span class=&quot;no&quot;&gt;EOF
  13485.  
  13486. &lt;/span&gt;zip &lt;span class=&quot;nt&quot;&gt;-j&lt;/span&gt; lambda-example-1.zip lambda_function.py
  13487.  
  13488. &lt;span class=&quot;c&quot;&gt;# we&apos;ll need the function ARN later:&lt;/span&gt;
  13489. &lt;span class=&quot;nv&quot;&gt;FUNCTION1_ARN&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;aws lambda create-function  &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13490.                  &lt;span class=&quot;nt&quot;&gt;--region&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$REGION&lt;/span&gt;          &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13491.                  &lt;span class=&quot;nt&quot;&gt;--runtime&lt;/span&gt; python2.7       &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13492.                  &lt;span class=&quot;nt&quot;&gt;--role&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$EXEC_ROLE1_ARN&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;  &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13493.                  &lt;span class=&quot;nt&quot;&gt;--description&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;counter file reverse mapper&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13494.                  &lt;span class=&quot;nt&quot;&gt;--timeout&lt;/span&gt; 10      &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13495.                  &lt;span class=&quot;nt&quot;&gt;--memory-size&lt;/span&gt; 128 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13496.                  &lt;span class=&quot;nt&quot;&gt;--handler&lt;/span&gt; lambda_function.lambda_handler &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13497.                  &lt;span class=&quot;nt&quot;&gt;--zip-file&lt;/span&gt; fileb://lambda-example-1.zip  &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13498.                  &lt;span class=&quot;nt&quot;&gt;--function-name&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$FUNCTION1_NAME&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13499.                  &lt;span class=&quot;nt&quot;&gt;--output&lt;/span&gt; text &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13500.                  &lt;span class=&quot;nt&quot;&gt;--query&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;FunctionArn&apos;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  13501.  
  13502. &lt;p&gt;Let’s manually test the function:&lt;/p&gt;
  13503.  
  13504. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;aws lambda invoke &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13505.  &lt;span class=&quot;nt&quot;&gt;--region&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$REGION&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13506.  &lt;span class=&quot;nt&quot;&gt;--function-name&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$FUNCTION1_ARN&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13507.  &lt;span class=&quot;nt&quot;&gt;--payload&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;{}&apos;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13508.  &lt;span class=&quot;nt&quot;&gt;--log-type&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;Tail&apos;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13509.  &lt;span class=&quot;nt&quot;&gt;--output&lt;/span&gt; text &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13510.  &lt;span class=&quot;nt&quot;&gt;--query&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;LogResult&apos;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13511.  - &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13512.  | &lt;span class=&quot;nb&quot;&gt;base64&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;--decode&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  13513.  
  13514. &lt;p&gt;The result should look like this:&lt;/p&gt;
  13515.  
  13516. &lt;pre&gt;
  13517. START RequestId: e7739dcd-7380-11e5-aa63-c740254a89b5 Version: $LATEST
  13518. [INFO]  2015-11-16T21:08:38.830Z    e7739dcd-7380-11e5-aa63-c740254a89b5    got event {}
  13519. [INFO]  2015-11-16T21:08:38.830Z    e7739dcd-7380-11e5-aa63-c740254a89b5    got context &amp;lt;__main__.LambdaContext object at 0x7f61bfd246d0&amp;gt;
  13520. END RequestId: e7739dcd-7380-11e5-aa63-c740254a89b5
  13521. REPORT RequestId: e7739dcd-7380-11e5-aa63-c740254a89b5  Duration: 0.41 ms   Billed Duration: 100 ms     Memory Size: 128 MB Max Memory Used: 13 MB
  13522. &lt;/pre&gt;
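&lt;p&gt;The base64 decoding step can also be done in Python. The sketch below assumes a response dictionary shaped like the one boto3’s Lambda invoke call returns when LogType is set to Tail, where the LogResult field carries the base64-encoded log tail. The helper name decode_log_tail and the sample response are made up for illustration.&lt;/p&gt;

```python
import base64


def decode_log_tail(invoke_response):
    """Return the decoded log tail from a Lambda invoke response dict."""
    # 'LogResult' holds the base64-encoded tail of the execution log
    # when the function was invoked with LogType='Tail'.
    encoded = invoke_response.get("LogResult", "")
    return base64.b64decode(encoded).decode("utf-8")


if __name__ == "__main__":
    # Hypothetical response fragment, hand-built for illustration only.
    sample = {
        "LogResult": base64.b64encode(b"START RequestId: example").decode("ascii")
    }
    print(decode_log_tail(sample))
```

&lt;p&gt;A missing LogResult field decodes to an empty string, so the helper can be called on responses from invocations that did not request the log tail.&lt;/p&gt;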
  13523.  
  13524. &lt;h4 id=&quot;putting-it-all-together&quot;&gt;Putting it all together&lt;/h4&gt;
  13525.  
  13526. &lt;p&gt;We now have a working Lambda function implemented in Python.  We can subscribe it to the
  13527.  SNS topic we created earlier:&lt;/p&gt;
  13528.  
  13529. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;aws sns subscribe   &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13530.  &lt;span class=&quot;nt&quot;&gt;--region&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$REGION&lt;/span&gt;  &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13531.  &lt;span class=&quot;nt&quot;&gt;--protocol&lt;/span&gt; lambda &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13532.  &lt;span class=&quot;nt&quot;&gt;--topic-arn&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$TOPIC_ARN&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13533.  &lt;span class=&quot;nt&quot;&gt;--notification-endpoint&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$FUNCTION1_ARN&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  13534.  
  13535. &lt;p&gt;Before this will actually work, however, we need to grant the SNS topic permission to
  13536.  invoke our Lambda function:&lt;/p&gt;
  13537.  
  13538. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;aws lambda add-permission       &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13539.  &lt;span class=&quot;nt&quot;&gt;--region&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$REGION&lt;/span&gt;              &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13540.  &lt;span class=&quot;nt&quot;&gt;--function-name&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$FUNCTION1_ARN&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13541.  &lt;span class=&quot;nt&quot;&gt;--statement-id&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$FUNCTION1_NAME&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;-invoke&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13542.  &lt;span class=&quot;nt&quot;&gt;--principal&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;sns.amazonaws.com&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13543.  &lt;span class=&quot;nt&quot;&gt;--action&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;lambda:InvokeFunction&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13544.  &lt;span class=&quot;nt&quot;&gt;--source-arn&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$TOPIC_ARN&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  13545.  
  13546. &lt;p&gt;As objects are uploaded to our S3 bucket, we should start seeing records of invocations
  13547.  and logs appearing in the CloudWatch log group we created earlier.  Let’s upload a test
  13548.  file:&lt;/p&gt;
  13549.  
  13550. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;COUNTER1 12345&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13551.    | aws s3 &lt;span class=&quot;nb&quot;&gt;cp&lt;/span&gt; - s3://&lt;span class=&quot;nv&quot;&gt;$BUCKET&lt;/span&gt;/&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;PREFIX&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;date&lt;/span&gt; +&lt;span class=&quot;s1&quot;&gt;&apos;%Y-%m-%d&apos;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;/values_i-deadbeef&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  13552.  
  13553. &lt;p&gt;We can view our CloudWatch logs from the command line.  We should see some output
  13554.  corresponding to the key we just uploaded:&lt;/p&gt;
  13555.  
  13556. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;k&quot;&gt;for &lt;/span&gt;STREAM_NAME &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;aws logs describe-log-streams &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13557.                         &lt;span class=&quot;nt&quot;&gt;--region&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$REGION&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13558.                         &lt;span class=&quot;nt&quot;&gt;--log-group-name&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$LOG_GROUP1&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13559.                         &lt;span class=&quot;nt&quot;&gt;--descending&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13560.                         &lt;span class=&quot;nt&quot;&gt;--order-by&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;LastEventTime&apos;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13561.                         &lt;span class=&quot;nt&quot;&gt;--output&lt;/span&gt; text &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13562.                         &lt;span class=&quot;nt&quot;&gt;--query&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;logStreams[].logStreamName&apos;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do
  13563.    &lt;/span&gt;aws logs get-log-events &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13564.        &lt;span class=&quot;nt&quot;&gt;--region&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$REGION&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13565.        &lt;span class=&quot;nt&quot;&gt;--log-group-name&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$LOG_GROUP1&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13566.        &lt;span class=&quot;nt&quot;&gt;--log-stream-name&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$STREAM_NAME&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13567.        &lt;span class=&quot;nt&quot;&gt;--start-from-head&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13568.        &lt;span class=&quot;nt&quot;&gt;--output&lt;/span&gt; text &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13569.        &lt;span class=&quot;nt&quot;&gt;--query&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;events[].message&apos;&lt;/span&gt;
  13570. &lt;span class=&quot;k&quot;&gt;done&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  13571.  
  13572. &lt;p&gt;If you see something like this, it worked:&lt;/p&gt;
  13573.  
  13574. &lt;pre&gt;
  13575. [INFO] 2015-11-16T23:44:15.876Z 1ad60ef6-3417-12c5-92b0-0f211eba5fc1 got event: {u&apos;Records&apos;: [{u&apos;EventVersion&apos;: u&apos;1.0&apos;, u&apos;EventSubscriptionArn&apos;: ... u&apos;EventSource&apos;: u&apos;aws:sns&apos;, u&apos;Sns&apos;: { ... u&apos;Message&apos;: u&apos;{&quot;Records&quot;:[{&quot;eventVersion&quot;:&quot;2.0&quot;,&quot;eventSource&quot;:&quot;aws:s3&quot;, ...
  13576. &lt;/pre&gt;
  13577.  
  13578. &lt;h3 id=&quot;data-persistence&quot;&gt;Data persistence&lt;/h3&gt;
  13579.  
  13580. &lt;p&gt;Our Lambda function is now hooked up to events emitted from S3!  However, it doesn’t yet
  13581.  do anything useful.  We want our system to react to events and to reprocess only the data
  13582.  that has changed.  We’ll achieve this by (1) storing some intermediate state in
  13583.  DynamoDB and (2) using a second Lambda function to process updates to this state.&lt;/p&gt;
  13584.  
  13585. &lt;p&gt;Our updated system will look like this:&lt;/p&gt;
  13586.  
  13587. &lt;p&gt;&lt;img src=&quot;/images/post_images/counter-upload-processor-2.png&quot; alt=&quot;updated counter upload processor&quot; /&gt;&lt;/p&gt;
  13588.  
  13589. &lt;h4 id=&quot;create-a-dynamodb-table&quot;&gt;Create a DynamoDB table&lt;/h4&gt;
  13590.  
  13591. &lt;p&gt;We want to store items which look like this in our intermediate state table:&lt;/p&gt;
  13592.  
  13593. &lt;pre&gt;
  13594.  
  13595.    {       &apos;Counter&apos;: { &apos;S&apos;: &apos;XXXXXXXXXXXXXXXXXXXXXX&apos; },
  13596.               &apos;Date&apos;: { &apos;S&apos;: &apos;YYYY-MM-DD&apos; },
  13597.     &apos;InstanceValues&apos;: { &apos;M&apos;: { &apos;i-deadbeef&apos;: { &apos;N&apos;: &apos;1234&apos; },
  13598.                                &apos;i-beefdead&apos;: { &apos;N&apos;: &apos;5678&apos; },
  13599.                                ...
  13600.                              }
  13601.                       }
  13602.    }
  13603.  
  13604. &lt;/pre&gt;
  13605.  
  13606. &lt;p&gt;Using a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;map&lt;/code&gt; type (“M”) and &lt;a href=&quot;http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Expressions.AccessingItemAttributes.html#DocumentPaths&quot;&gt;document paths&lt;/a&gt; to store
  13607.  per-instance counter values will allow concurrently executing Lambda function invocations to
  13608.  update the same DynamoDB item without interfering with each other.&lt;/p&gt;
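&lt;p&gt;To make the document-path idea concrete, here is a minimal sketch of the arguments such an update might pass to DynamoDB’s UpdateItem operation. The function name build_instance_update is hypothetical, and the sketch assumes the item and its InstanceValues map already exist, since a nested path cannot be set under a map that is missing.&lt;/p&gt;

```python
def build_instance_update(table, counter, date, instance_id, value):
    """Build keyword arguments for a DynamoDB UpdateItem call that sets
    one instance's value inside the InstanceValues map."""
    return {
        "TableName": table,
        "Key": {"Counter": {"S": counter}, "Date": {"S": date}},
        # The document path targets a single map entry, so writers for
        # different instances touch different paths of the same item.
        "UpdateExpression": "SET InstanceValues.#iid = :val",
        # Instance IDs contain '-', which is not valid in a bare
        # expression name, so the ID goes through a placeholder.
        "ExpressionAttributeNames": {"#iid": instance_id},
        # Assumption: the item and its InstanceValues map already exist.
        "ExpressionAttributeValues": {":val": {"N": str(value)}},
    }


if __name__ == "__main__":
    print(build_instance_update(
        "counter-upload-processor-state", "COUNTER1", "2015-11-16",
        "i-deadbeef", 12345))
```

&lt;p&gt;The resulting dictionary could be unpacked into the update_item method of a low-level boto3 DynamoDB client. Because each invocation writes only its own InstanceValues entry, two invocations handling different instances do not overwrite each other.&lt;/p&gt;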
  13609.  
  13610. &lt;p&gt;Our table schema will look like this:&lt;/p&gt;
  13611.  
  13612. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;nv&quot;&gt;TABLE&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$FUNCTION1_NAME&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;-state&quot;&lt;/span&gt;
  13613.  
  13614. aws dynamodb create-table        &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13615.  &lt;span class=&quot;nt&quot;&gt;--region&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$REGION&lt;/span&gt;               &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13616.  &lt;span class=&quot;nt&quot;&gt;--table-name&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$TABLE&lt;/span&gt;            &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13617.  &lt;span class=&quot;nt&quot;&gt;--provisioned-throughput&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ReadCapacityUnits&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;10,WriteCapacityUnits&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;10 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13618.  &lt;span class=&quot;nt&quot;&gt;--key-schema&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;AttributeName&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;Counter,KeyType&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;HASH                     &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13619.               &lt;span class=&quot;nv&quot;&gt;AttributeName&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;Date,KeyType&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;RANGE                       &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13620.  &lt;span class=&quot;nt&quot;&gt;--attribute-definitions&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;AttributeName&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;Counter,AttributeType&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;S       &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13621.                          &lt;span class=&quot;nv&quot;&gt;AttributeName&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;Date,AttributeType&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;S&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  13622.  
  13623. &lt;h4 id=&quot;s3-sns-event-handling&quot;&gt;S3-SNS event handling&lt;/h4&gt;
  13624.  
  13625. &lt;p&gt;We want to update our intermediate state whenever an event comes in.  Our SNS events
  13626.  will look like this:&lt;/p&gt;
  13627.  
  13628. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-json&quot; data-lang=&quot;json&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;Records&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13629. &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;EventVersion&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;1.0&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13630.  &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;EventSubscriptionArn&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;SUBSCRIPTION_ARN&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13631.  &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;EventSource&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;aws:sns&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13632.  &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;Sns&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13633.    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;SignatureVersion&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;1&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13634.    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;Timestamp&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;2015-11-16T22:08:36.726Z&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13635.    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;Signature&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;SIG_DATA&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13636.    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;SigningCertUrl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;SIG_CERT_URL&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13637.    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;MessageId&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;MESSAGE_ID&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13638.    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;Message&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;JSON_MESSAGE_DATA&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13639.    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;MessageAttributes&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13640.    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;Type&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Notification&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13641.    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;UnsubscribeUrl&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;UNSUBSCRIBE_URL&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13642.    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;TopicArn&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;$TOPIC_ARN&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13643.    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;Subject&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Amazon S3 Notification&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13644.  &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13645. &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13646. &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  13647.  
  13648. &lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;JSON_MESSAGE_DATA&lt;/code&gt; value in these events will be a JSON-encoded S3 event.  It will
  13649.  look like this when decoded:&lt;/p&gt;
  13650.  
  13651. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-json&quot; data-lang=&quot;json&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;Records&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13652. &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;eventVersion&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;2.0&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13653.  &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;eventTime&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;2015-11-16T22:08:36.682Z&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13654.  &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;requestParameters&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;sourceIPAddress&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;1.2.3.4&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13655.  &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;s3&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13656.    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;s3SchemaVersion&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;1.0&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13657.    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;configurationId&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;$TOPIC_NAME events&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13658.    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;object&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;versionId&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;NEW_OBJECT_VERSION_ID&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13659.                    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;eTag&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;NEW_OBJECT_ETAG&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13660.               &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;sequencer&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;SEQUENCER_VALUE&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13661.                     &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;key&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;${PREFIX}2015-11-16/values_i-deadbeef&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13662.                    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;size&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13663.    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;bucket&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;          &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;arn&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;$SOURCE_ARN&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13664.                        &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;name&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;$BUCKET&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13665.               &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;ownerIdentity&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;principalId&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;OWNING_PRINCIPAL_ID&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13666.  &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13667.  &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;responseElements&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;      &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;x-amz-id-2&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;AMZ_ID_1&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13668.                       &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;x-amz-request-id&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;AMZ_ID_2&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13669.         &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;awsRegion&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;$REGION&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13670.         &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;eventName&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;ObjectCreated:Put&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13671.      &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;userIdentity&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;principalId&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;CREATING_PRINCIPAL_ID&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13672.       &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;eventSource&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;aws:s3&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13673. &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13674. &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  13675.  
  13676. &lt;p&gt;&lt;em&gt;(Note: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;key&lt;/code&gt; value in the event is URL-encoded.)&lt;/em&gt;&lt;/p&gt;
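&lt;p&gt;As a standalone illustration of the two layers described above, the following sketch unwraps the SNS envelope, parses the embedded S3 event, and URL-decodes each object key. It assumes Python 3 (for urllib.parse), and the function name s3_keys_from_sns_event is hypothetical.&lt;/p&gt;

```python
import json
from urllib.parse import unquote_plus


def s3_keys_from_sns_event(event):
    """Yield (bucket, key) pairs for S3 records wrapped in an SNS event."""
    for sns_record in event.get("Records", []):
        if sns_record.get("EventSource") != "aws:sns":
            continue
        # The SNS "Message" field is itself a JSON string holding the S3 event.
        s3_event = json.loads(sns_record["Sns"]["Message"])
        for s3_record in s3_event.get("Records", []):
            if s3_record.get("eventSource") != "aws:s3":
                continue
            bucket = s3_record["s3"]["bucket"]["name"]
            # Object keys arrive URL-encoded, so decode before use.
            key = unquote_plus(s3_record["s3"]["object"]["key"])
            yield bucket, key


if __name__ == "__main__":
    # Hand-built sample event containing only the fields the sketch reads.
    inner = json.dumps({"Records": [{
        "eventSource": "aws:s3",
        "s3": {"bucket": {"name": "example-bucket"},
               "object": {"key": "2015-11-16/values_i-deadbeef"}},
    }]})
    sample = {"Records": [{"EventSource": "aws:sns", "Sns": {"Message": inner}}]}
    print(list(s3_keys_from_sns_event(sample)))
```

&lt;p&gt;Records from other event sources are skipped rather than treated as errors, so the same handler could be subscribed to a topic that also carries unrelated notifications.&lt;/p&gt;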
  13677.  
  13678. &lt;p&gt;Let’s update the code in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lambda_function.py&lt;/code&gt; to process these events:&lt;/p&gt;
  13679.  
  13680. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;logging&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;boto3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;botocore&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;exceptions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;json&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;urllib&lt;/span&gt;
  13681.  
  13682.  
  13683. &lt;span class=&quot;n&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;counter-upload-processor-state&quot;&lt;/span&gt;
  13684. &lt;span class=&quot;n&quot;&gt;FILENAME_PREFIX&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;values_&apos;&lt;/span&gt;
  13685.  
  13686. &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;lambda_handler&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  13687.  &lt;span class=&quot;n&quot;&gt;logging&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;getLogger&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;setLevel&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;logging&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;INFO&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  13688.  &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;record&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Records&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]:&lt;/span&gt;
  13689.    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;aws:sns&apos;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;EventSource&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;and&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Sns&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Message&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]:&lt;/span&gt;
  13690.      &lt;span class=&quot;n&quot;&gt;handle_sns_event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;json&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;loads&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Sns&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Message&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  13691.  &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;
  13692.  
  13693.  
  13694. &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;handle_sns_event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  13695.  &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;record&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Records&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]:&lt;/span&gt;
  13696.    &lt;span class=&quot;n&quot;&gt;logging&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;looking at {}&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  13697.    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;aws:s3&apos;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;eventSource&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; \
  13698.      &lt;span class=&quot;ow&quot;&gt;and&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;eventName&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;startswith&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;ObjectCreated:&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  13699.      &lt;span class=&quot;n&quot;&gt;region&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;awsRegion&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  13700.      &lt;span class=&quot;n&quot;&gt;bucket_name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;s3&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;bucket&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;name&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  13701.      &lt;span class=&quot;n&quot;&gt;key_name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;urllib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unquote_plus&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;s3&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;object&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;key&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
  13702.      &lt;span class=&quot;n&quot;&gt;key_vsn&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;s3&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;object&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;versionId&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  13703.      &lt;span class=&quot;n&quot;&gt;logging&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;new object: s3://{}/{} (v:{})&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bucket_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  13704.                                                          &lt;span class=&quot;n&quot;&gt;key_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  13705.                                                          &lt;span class=&quot;n&quot;&gt;key_vsn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  13706.      &lt;span class=&quot;n&quot;&gt;key&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;boto3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;resource&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;s3&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;region_name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;region&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; \
  13707.                 &lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Bucket&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bucket_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; \
  13708.                 &lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Object&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  13709.      &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;**&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;VersionId&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;key_vsn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;key_vsn&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{})&lt;/span&gt;
  13710.      &lt;span class=&quot;n&quot;&gt;process_key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;region&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  13711.  
  13712.  
  13713. &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;process_key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;region&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  13714.  &lt;span class=&quot;n&quot;&gt;filename&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;split&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;/&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  13715.  &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;filename&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;startswith&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FILENAME_PREFIX&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  13716.    &lt;span class=&quot;n&quot;&gt;date&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;split&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;/&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  13717.    &lt;span class=&quot;n&quot;&gt;instance_id&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;filename&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;split&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;_&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  13718.    &lt;span class=&quot;n&quot;&gt;logging&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;processing ({}, {})&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;date&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;instance_id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  13719.    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Body&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;splitlines&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
  13720.      &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;strip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;split&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()]&lt;/span&gt;
  13721.      &lt;span class=&quot;n&quot;&gt;update_instance_value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;region&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;date&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;instance_id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  13722.  
  13723.  
  13724. &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;update_instance_value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;region&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;date&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;instance_id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  13725.  &lt;span class=&quot;n&quot;&gt;logging&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;updating instance counter value: {} {} {}&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  13726.      &lt;span class=&quot;n&quot;&gt;instance_id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  13727.  &lt;span class=&quot;n&quot;&gt;tbl&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;boto3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;resource&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;dynamodb&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;region_name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;region&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; \
  13728.             &lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;TABLE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  13729.  &lt;span class=&quot;n&quot;&gt;key&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Counter&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  13730.            &lt;span class=&quot;s&quot;&gt;&apos;Date&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;date&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  13731.  &lt;span class=&quot;c1&quot;&gt;# updating a document path in an item currently fails if the ancestor
  13732. &lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# attributes don&apos;t exist, and multiple SET expressions can&apos;t
  13733. &lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# (currently) be used to update overlapping document paths (even with
  13734. &lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# `if_not_exists`), so we must first create the `InstanceValues` map
  13735. &lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# if needed.  we use a condition expression to avoid needlessly
  13736. &lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# triggering an update event on the stream we&apos;ll create for this
  13737. &lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# table.  in a real application, we might first query the table to
  13738. &lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# check if these updates are actually needed (reads are cheaper than
  13739. &lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# writes).
  13740. &lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;lax_update&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tbl&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  13741.             &lt;span class=&quot;n&quot;&gt;Key&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  13742.             &lt;span class=&quot;n&quot;&gt;UpdateExpression&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;SET #valuemap = :empty&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  13743.             &lt;span class=&quot;n&quot;&gt;ExpressionAttributeNames&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;#valuemap&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;InstanceValues&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  13744.             &lt;span class=&quot;n&quot;&gt;ExpressionAttributeValues&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;:empty&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}},&lt;/span&gt;
  13745.             &lt;span class=&quot;n&quot;&gt;ConditionExpression&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;attribute_not_exists(#valuemap)&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  13746.  &lt;span class=&quot;c1&quot;&gt;# we can now actually update the target path.  we only update if the
  13747. &lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# new value is different (in a real application, we might first query
  13748. &lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# and refrain from attempting the conditional write if the value is
  13749. &lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# unchanged):
  13750. &lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;lax_update&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tbl&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  13751.             &lt;span class=&quot;n&quot;&gt;Key&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  13752.             &lt;span class=&quot;n&quot;&gt;UpdateExpression&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;SET #valuemap.#key = :value&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  13753.             &lt;span class=&quot;n&quot;&gt;ExpressionAttributeNames&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;     &lt;span class=&quot;s&quot;&gt;&apos;#key&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;instance_id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  13754.                                       &lt;span class=&quot;s&quot;&gt;&apos;#valuemap&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;InstanceValues&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  13755.             &lt;span class=&quot;n&quot;&gt;ExpressionAttributeValues&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;:value&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)},&lt;/span&gt;
  13756.             &lt;span class=&quot;n&quot;&gt;ConditionExpression&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;NOT #valuemap.#key = :value&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  13757.  
  13758.  
  13759. &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;lax_update&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;**&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kwargs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  13760.  &lt;span class=&quot;k&quot;&gt;try&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  13761.    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;update_item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;**&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kwargs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  13762.  &lt;span class=&quot;k&quot;&gt;except&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;botocore&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;exceptions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ClientError&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;exc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  13763.    &lt;span class=&quot;n&quot;&gt;code&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;exc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;response&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Error&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Code&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  13764.    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;ConditionalCheckFailedException&apos;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;code&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  13765.      &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
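&lt;p&gt;The handler above relies on a key layout of the form &lt;code&gt;.../DATE/PREFIX_INSTANCE-ID&lt;/code&gt; and on object bodies made of whitespace-separated counter/value lines. As a rough sketch of just that parsing step, with no AWS calls involved, the logic can be pulled into two small functions. The names &lt;code&gt;parse_key&lt;/code&gt; and &lt;code&gt;parse_body&lt;/code&gt; are hypothetical, and the value &lt;code&gt;&apos;values_&apos;&lt;/code&gt; for &lt;code&gt;FILENAME_PREFIX&lt;/code&gt; is an assumption based on the sample uploads used later in this post. Unlike the handler, this version skips malformed lines rather than raising.&lt;/p&gt;

```python
# Assumed value: FILENAME_PREFIX is defined elsewhere in the post; 'values_' is
# a guess that matches the sample file names uploaded later on.
FILENAME_PREFIX = 'values_'


def parse_key(key_name):
    """Return (date, instance_id) for a matching key, or None otherwise."""
    parts = key_name.split('/')
    filename = parts[-1]
    # A usable key needs a parent "directory" (the date) and the right prefix.
    if len(parts) == 1 or not filename.startswith(FILENAME_PREFIX):
        return None
    date = parts[-2]
    instance_id = filename.split('_')[1]
    return date, instance_id


def parse_body(body):
    """Turn 'COUNTER VALUE' lines into a list of (counter, int(value)) pairs."""
    pairs = []
    for line in body.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines instead of raising
        pairs.append((fields[0], int(fields[1])))
    return pairs


print(parse_key('some/prefix/2015-11-16/values_i-deadbeef'))
print(parse_body('COUNTER1 54321\nCOUNTER2 54321\n'))
```

&lt;p&gt;Keeping the parsing separate from the boto3 calls makes it possible to exercise this part locally before deploying the function.&lt;/p&gt;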
  13766.  
  13767. &lt;p&gt;We can now update our Lambda function’s code:&lt;/p&gt;
  13768.  
  13769. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;zip &lt;span class=&quot;nt&quot;&gt;-j&lt;/span&gt; lambda-example-2.zip lambda_function.py
  13770.  
  13771. aws lambda update-function-code  &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13772.  &lt;span class=&quot;nt&quot;&gt;--region&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$REGION&lt;/span&gt;               &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13773.  &lt;span class=&quot;nt&quot;&gt;--function-name&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$FUNCTION1_ARN&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13774.  &lt;span class=&quot;nt&quot;&gt;--zip-file&lt;/span&gt; fileb://lambda-example-2.zip&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  13775.  
  13776. &lt;p&gt;Before this updated code will work, we need to attach a policy to the execution role we
  13777.  created earlier for our Lambda function. Without it, the function can’t read objects from
  13778.  our S3 bucket or update items in our DynamoDB table:&lt;/p&gt;
  13779.  
  13780. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;nv&quot;&gt;REV_MAPPER_POLICY&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;python &lt;span class=&quot;nt&quot;&gt;-c&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;import json; print(json.dumps(
  13781.  { &apos;Version&apos;: &apos;2012-10-17&apos;,
  13782.    &apos;Statement&apos;: [
  13783.  
  13784.    { &apos;Effect&apos;: &apos;Allow&apos;,
  13785.      &apos;Action&apos;: [&apos;dynamodb:UpdateItem&apos;],
  13786.      &apos;Resource&apos;: &apos;arn:aws:dynamodb:&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$REGION&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$ACCOUNT&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;:table/&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$TABLE&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&apos; },
  13787.  
  13788.    { &apos;Effect&apos;: &apos;Allow&apos;,
  13789.      &apos;Action&apos;: [&apos;s3:GetObject&apos;,
  13790.                 &apos;s3:GetObjectVersion&apos;],
  13791.      &apos;Resource&apos;: &apos;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$SOURCE_ARN&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;PREFIX&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;*&apos; }
  13792.   ]}))&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;
  13793.  
  13794. aws iam put-role-policy     &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13795.  &lt;span class=&quot;nt&quot;&gt;--region&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$REGION&lt;/span&gt;          &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13796.  &lt;span class=&quot;nt&quot;&gt;--role-name&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$ROLE1_NAME&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13797.  &lt;span class=&quot;nt&quot;&gt;--policy-name&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;fetch-data-and-update-dynamo&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13798.  &lt;span class=&quot;nt&quot;&gt;--policy-document&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$REV_MAPPER_POLICY&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
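&lt;p&gt;For readers who would rather build the policy document in a standalone script than through shell interpolation, a minimal sketch might look like the following. The function name &lt;code&gt;build_policy&lt;/code&gt; and every argument value shown are placeholders for illustration, not values from this walkthrough. It also assumes that &lt;code&gt;$SOURCE_ARN&lt;/code&gt; holds the bucket&apos;s ARN.&lt;/p&gt;

```python
import json


def build_policy(region, account, table, source_arn, prefix):
    """Build the inline policy document as a JSON string."""
    policy = {
        'Version': '2012-10-17',
        'Statement': [
            # Allow the function to update items in the intermediate table.
            {'Effect': 'Allow',
             'Action': ['dynamodb:UpdateItem'],
             'Resource': 'arn:aws:dynamodb:{}:{}:table/{}'.format(
                 region, account, table)},
            # Allow the function to read objects (and specific versions)
            # under the target prefix.
            {'Effect': 'Allow',
             'Action': ['s3:GetObject', 's3:GetObjectVersion'],
             'Resource': '{}/{}*'.format(source_arn, prefix)},
        ],
    }
    return json.dumps(policy, indent=2)


# Placeholder values only; substitute your own region, account, and names.
print(build_policy('us-west-2', '123456789012', 'example-table',
                   'arn:aws:s3:::example-bucket', 'example-prefix/'))
```

&lt;p&gt;The printed JSON could then be saved to a file and passed to &lt;code&gt;--policy-document&lt;/code&gt; with a &lt;code&gt;file://&lt;/code&gt; path.&lt;/p&gt;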
  13799.  
  13800. &lt;p&gt;As new files appear in our bucket under the target prefix, we should see our table being
  13801.  updated accordingly.  Let’s give it a try:&lt;/p&gt;
  13802.  
  13803. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;COUNTER1 12345&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13804.    | aws s3 &lt;span class=&quot;nb&quot;&gt;cp&lt;/span&gt; - s3://&lt;span class=&quot;nv&quot;&gt;$BUCKET&lt;/span&gt;/&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;PREFIX&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;date&lt;/span&gt; +&lt;span class=&quot;s1&quot;&gt;&apos;%Y-%m-%d&apos;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;/values_i-deadbeef
  13805. &lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;COUNTER1 12345&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13806.    | aws s3 &lt;span class=&quot;nb&quot;&gt;cp&lt;/span&gt; - s3://&lt;span class=&quot;nv&quot;&gt;$BUCKET&lt;/span&gt;/&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;PREFIX&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;date&lt;/span&gt; +&lt;span class=&quot;s1&quot;&gt;&apos;%Y-%m-%d&apos;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;/values_i-beefdead
  13807. &lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;COUNTER1 54321
  13808. COUNTER2 54321&quot;&lt;/span&gt; | aws s3 &lt;span class=&quot;nb&quot;&gt;cp&lt;/span&gt; - s3://&lt;span class=&quot;nv&quot;&gt;$BUCKET&lt;/span&gt;/&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;PREFIX&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;date&lt;/span&gt; +&lt;span class=&quot;s1&quot;&gt;&apos;%Y-%m-%d&apos;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;/values_i-foobazzled&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
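&lt;p&gt;To reason about what these three uploads should produce, it can help to model the two conditional writes on a plain dictionary. The sketch below is only an in-memory approximation of the DynamoDB behaviour. &lt;code&gt;apply_update&lt;/code&gt; is a hypothetical helper, and the date is hard-coded for illustration, whereas the real uploads use whatever &lt;code&gt;date&lt;/code&gt; prints when they run.&lt;/p&gt;

```python
def apply_update(table, counter, date, instance_id, value):
    """Mimic the two conditional writes on a dict; return True if anything changed."""
    # Items are keyed on (Counter, Date), like the table's key in the handler.
    item = table.setdefault((counter, date), {'Counter': counter, 'Date': date})
    changed = False
    # Step 1: create the InstanceValues map only if it does not exist yet.
    if 'InstanceValues' not in item:
        item['InstanceValues'] = {}
        changed = True
    # Step 2: write the value only if it differs from what is already stored.
    if item['InstanceValues'].get(instance_id) != value:
        item['InstanceValues'][instance_id] = value
        changed = True
    return changed


table = {}
date = '2015-11-16'  # illustrative date only
uploads = [
    ('i-deadbeef', [('COUNTER1', 12345)]),
    ('i-beefdead', [('COUNTER1', 12345)]),
    ('i-foobazzled', [('COUNTER1', 54321), ('COUNTER2', 54321)]),
]
for instance_id, pairs in uploads:
    for counter, value in pairs:
        apply_update(table, counter, date, instance_id, value)

for key in sorted(table):
    print(key, table[key]['InstanceValues'])
```

&lt;p&gt;Under this model the table ends up with two items, one per counter. Re-applying an unchanged value returns &lt;code&gt;False&lt;/code&gt;, which corresponds to the conditional check failing and no update being written.&lt;/p&gt;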
  13809.  
  13810. &lt;p&gt;If we scan our intermediate state table, we should now see that it has been updated by
  13811.  our Lambda function:&lt;/p&gt;
  13812.  
  13813. &lt;pre&gt;
  13814.  
  13815.    aws dynamodb scan    \
  13816.        --region $REGION \
  13817.        --table-name $TABLE
  13818.  
  13819. &lt;/pre&gt;
  13820.  
  13821. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-json&quot; data-lang=&quot;json&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13822.    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;Count&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13823.    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;Items&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13824.        &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13825.            &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;Date&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13826.                &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;S&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;2015-11-16&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13827.            &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13828.            &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;InstanceValues&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13829.                &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;M&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13830.                    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;i-foobazzled&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13831.                        &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;N&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;54321&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13832.                    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13833.                &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13834.            &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13835.            &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;Counter&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13836.                &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;S&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;COUNTER2&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13837.            &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13838.        &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13839.        &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13840.            &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;Date&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13841.                &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;S&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;2015-11-16&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13842.            &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13843.            &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;InstanceValues&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13844.                &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;M&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13845.                    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;i-beefdead&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13846.                        &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;N&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;12345&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13847.                    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13848.                    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;i-deadbeef&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13849.                        &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;N&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;12345&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13850.                    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13851.                    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;i-foobazzled&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13852.                        &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;N&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;54321&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13853.                    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13854.                &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13855.            &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13856.            &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;Counter&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13857.                &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;S&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;COUNTER1&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13858.            &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13859.        &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13860.    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13861.    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;ScannedCount&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13862.    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;ConsumedCapacity&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;null&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13863. &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  13864.  
  13865. &lt;p&gt;If you see something like this, it worked.  Huzzah!  We’ve completed the first part of
  13866.  our system:&lt;/p&gt;
  13867.  
  13868. &lt;p&gt;&lt;img src=&quot;/images/post_images/counter-upload-processor-2.png&quot; alt=&quot;updated counter upload processor&quot; /&gt;&lt;/p&gt;
  13869.  
  13870. &lt;h3 id=&quot;enter-dynamodb-streams&quot;&gt;Enter DynamoDB Streams&lt;/h3&gt;
  13871.  
  13872. &lt;p&gt;In our intermediate state table we’re now maintaining an inverted map of the original
  13873.  data uploaded to S3, converting per-instance lists of per-counter values on S3 to
  13874.  per-counter lists of per-instance values in DynamoDB.  Our goal is to sum up all
  13875.  per-instance values for each counter and to take action when these values exceed a
  13876.  per-counter limit, only processing data which has changed.&lt;/p&gt;
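&lt;p&gt;As a rough illustration of that goal, the sketch below sums the per-instance values of items shaped like the query output above and reports which counters exceed a limit.  The function names and the limit of 70000 are assumptions made for this example; they are not part of the system described in this post.&lt;/p&gt;

```python
from decimal import Decimal


def counter_totals(items):
    """Sum the per-instance values of each item from a DynamoDB Query response.

    Items are expected in DynamoDB's typed JSON form, as in the output above.
    Totals are keyed by (date, counter) so that different days stay separate.
    """
    totals = {}
    for item in items:
        key = (item["Date"]["S"], item["Counter"]["S"])
        instance_values = item["InstanceValues"]["M"]
        # DynamoDB transmits numbers as strings; Decimal avoids float rounding.
        totals[key] = sum(
            (Decimal(v["N"]) for v in instance_values.values()), Decimal(0)
        )
    return totals


def counters_over_limit(totals, limits):
    """Return the sorted (date, counter) keys whose total exceeds the limit.

    `limits` maps counter names to hypothetical per-counter limits.
    """
    return sorted(
        key for key, total in totals.items()
        if key[1] in limits and total > limits[key[1]]
    )


# Items copied from the query output shown earlier.
items = [
    {"Date": {"S": "2015-11-16"}, "Counter": {"S": "COUNTER2"},
     "InstanceValues": {"M": {"i-foobazzled": {"N": "54321"}}}},
    {"Date": {"S": "2015-11-16"}, "Counter": {"S": "COUNTER1"},
     "InstanceValues": {"M": {"i-beefdead": {"N": "12345"},
                              "i-deadbeef": {"N": "12345"},
                              "i-foobazzled": {"N": "54321"}}}},
]
totals = counter_totals(items)
over = counters_over_limit(totals, {"COUNTER1": 70000, "COUNTER2": 70000})
```

&lt;p&gt;With the sample data, COUNTER1 sums to 79011 and is the only counter reported as over the assumed limit.&lt;/p&gt;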
  13877.  
  13878. &lt;p&gt;DynamoDB Streams is a &lt;a href=&quot;https://aws.amazon.com/blogs/aws/dynamodb-update-triggers-streams-lambda-cross-region-replication-app/&quot;&gt;recently-released feature&lt;/a&gt;
  13879.  which grants a view of change events on a DynamoDB table (akin to a Kinesis stream).  We
  13880.  can use “DynamoDB Triggers” (the combination of DynamoDB Streams and Lambda functions)
  13881.  to achieve our goal.&lt;/p&gt;
  13882.  
  13883. &lt;p&gt;At the time of writing, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;awscli&lt;/code&gt; tools don’t support enabling or modifying DynamoDB
  13884.  stream settings, so for now we must use the console to configure a table stream:&lt;/p&gt;
  13885.  
  13886. &lt;p&gt;&lt;img src=&quot;/images/post_images/enable-dynamo-streams.png&quot; alt=&quot;Enable DynamoDB streams&quot; /&gt;&lt;/p&gt;
  13887.  
  13888. &lt;p&gt;Since we’ll be performing an operation on the entire new value of updated items, select
  13889.  “new image” as the stream view type:&lt;/p&gt;
  13890.  
  13891. &lt;p&gt;&lt;img src=&quot;/images/post_images/dynamo-stream-view-new-image.png&quot; alt=&quot;Configure new DynamoDB stream&quot; /&gt;&lt;/p&gt;
  13892.  
  13893. &lt;p&gt;After configuring the table stream in the DynamoDB console, we can fetch its ARN:&lt;/p&gt;
  13894.  
  13895. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;nv&quot;&gt;STREAM_ARN&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;aws dynamodbstreams list-streams   &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13896.                 &lt;span class=&quot;nt&quot;&gt;--region&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$REGION&lt;/span&gt;               &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13897.                 &lt;span class=&quot;nt&quot;&gt;--table-name&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$TABLE&lt;/span&gt;            &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13898.                 &lt;span class=&quot;nt&quot;&gt;--query&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;Streams[0].StreamArn&apos;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13899.                 &lt;span class=&quot;nt&quot;&gt;--output&lt;/span&gt; text&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  13900.  
  13901. &lt;h4 id=&quot;putting-it-all-together-again&quot;&gt;Putting it all together (again)&lt;/h4&gt;
  13902.  
  13903. &lt;p&gt;We’ll need another Lambda function to process change events on our state table.  We’ll
  13904.  use the same recipe (and initial code) as our first Lambda function.&lt;/p&gt;
  13905.  
  13906. &lt;p&gt;Here’s how the next stage of our system will look:&lt;/p&gt;
  13907.  
  13908. &lt;p&gt;&lt;img src=&quot;/images/post_images/counter-upload-aggregator-1.png&quot; alt=&quot;counter upload aggregator&quot; /&gt;&lt;/p&gt;
  13909.  
  13910. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;nv&quot;&gt;FUNCTION2_NAME&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;counter-upload-aggregator&quot;&lt;/span&gt;
  13911. &lt;span class=&quot;nv&quot;&gt;ROLE2_NAME&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$FUNCTION2_NAME&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;-execute&quot;&lt;/span&gt;
  13912. &lt;span class=&quot;nv&quot;&gt;EXEC_ROLE2_ARN&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;aws iam create-role        &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13913.                  &lt;span class=&quot;nt&quot;&gt;--region&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$REGION&lt;/span&gt;          &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13914.                  &lt;span class=&quot;nt&quot;&gt;--role-name&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$ROLE2_NAME&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13915.                  &lt;span class=&quot;nt&quot;&gt;--assume-role-policy-document&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$ASSUMEROLE_POLICY&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13916.                  &lt;span class=&quot;nt&quot;&gt;--output&lt;/span&gt; text &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13917.                  &lt;span class=&quot;nt&quot;&gt;--query&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;Role.Arn&apos;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;
  13918.  
  13919. &lt;span class=&quot;nv&quot;&gt;LOG_GROUP2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;/aws/lambda/&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$FUNCTION2_NAME&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
  13920. aws logs create-log-group &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13921.  &lt;span class=&quot;nt&quot;&gt;--region&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$REGION&lt;/span&gt;        &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13922.  &lt;span class=&quot;nt&quot;&gt;--log-group-name&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$LOG_GROUP2&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
  13923.  
  13924. &lt;span class=&quot;nv&quot;&gt;CLOUDWATCH_POLICY2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;python &lt;span class=&quot;nt&quot;&gt;-c&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;import json; print(json.dumps(
  13925.  { &apos;Version&apos;: &apos;2012-10-17&apos;,
  13926.    &apos;Statement&apos;: [
  13927.    { &apos;Effect&apos;: &apos;Allow&apos;,
  13928.      &apos;Action&apos;: [&apos;logs:PutLogEvents&apos;,
  13929.                 &apos;logs:CreateLogStream&apos;],
  13930.      &apos;Resource&apos;: &apos;arn:aws:logs:&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$REGION&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$ACCOUNT&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;:log-group:&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$LOG_GROUP2&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;:*&apos; }
  13931.   ]}))&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;
  13932.  
  13933. aws iam put-role-policy     &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13934.  &lt;span class=&quot;nt&quot;&gt;--region&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$REGION&lt;/span&gt;          &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13935.  &lt;span class=&quot;nt&quot;&gt;--role-name&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$ROLE2_NAME&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13936.  &lt;span class=&quot;nt&quot;&gt;--policy-name&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;emit-cloudwatch-logs&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13937.  &lt;span class=&quot;nt&quot;&gt;--policy-document&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$CLOUDWATCH_POLICY2&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
  13938.  
  13939. &lt;span class=&quot;nv&quot;&gt;FUNCTION2_ARN&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;aws lambda create-function  &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13940.                  &lt;span class=&quot;nt&quot;&gt;--region&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$REGION&lt;/span&gt;          &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13941.                  &lt;span class=&quot;nt&quot;&gt;--runtime&lt;/span&gt; python2.7       &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13942.                  &lt;span class=&quot;nt&quot;&gt;--role&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$EXEC_ROLE2_ARN&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;  &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13943.                  &lt;span class=&quot;nt&quot;&gt;--description&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;counter data aggregator&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13944.                  &lt;span class=&quot;nt&quot;&gt;--timeout&lt;/span&gt; 10      &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13945.                  &lt;span class=&quot;nt&quot;&gt;--memory-size&lt;/span&gt; 128 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13946.                  &lt;span class=&quot;nt&quot;&gt;--handler&lt;/span&gt; lambda_function.lambda_handler &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13947.                  &lt;span class=&quot;nt&quot;&gt;--zip-file&lt;/span&gt; fileb://lambda-example-1.zip  &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13948.                  &lt;span class=&quot;nt&quot;&gt;--function-name&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$FUNCTION2_NAME&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13949.                  &lt;span class=&quot;nt&quot;&gt;--output&lt;/span&gt; text &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13950.                  &lt;span class=&quot;nt&quot;&gt;--query&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;FunctionArn&apos;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  13951.  
  13952. &lt;p&gt;We need to grant some additional permissions to allow access to our table stream:&lt;/p&gt;
  13953.  
  13954. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;nv&quot;&gt;STREAM_POLICY&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;python &lt;span class=&quot;nt&quot;&gt;-c&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;import json; print(json.dumps(
  13955.  { &apos;Version&apos;: &apos;2012-10-17&apos;,
  13956.    &apos;Statement&apos;: [
  13957.  
  13958.    { &apos;Effect&apos;: &apos;Allow&apos;,
  13959.      &apos;Action&apos;: [&apos;dynamodb:GetRecords&apos;,
  13960.                 &apos;dynamodb:GetShardIterator&apos;,
  13961.                 &apos;dynamodb:DescribeStream&apos;],
  13962.      &apos;Resource&apos;: &apos;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$STREAM_ARN&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&apos; },
  13963.  
  13964.    { &apos;Effect&apos;: &apos;Allow&apos;,
  13965.      &apos;Action&apos;: [&apos;dynamodb:ListStreams&apos;],
  13966.      &apos;Resource&apos;: &apos;*&apos; }
  13967.   ]}))&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;
  13968.  
  13969. aws iam put-role-policy     &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13970.  &lt;span class=&quot;nt&quot;&gt;--region&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$REGION&lt;/span&gt;          &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13971.  &lt;span class=&quot;nt&quot;&gt;--role-name&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$ROLE2_NAME&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13972.  &lt;span class=&quot;nt&quot;&gt;--policy-name&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;dynamodb-stream-access&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13973.  &lt;span class=&quot;nt&quot;&gt;--policy-document&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$STREAM_POLICY&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  13974.  
  13975. &lt;p&gt;We can now create an event source mapping between our table stream and our new Lambda
  13976.  function.  This will cause Lambda to poll the DynamoDB stream and execute our new
  13977.  function with events read from it:&lt;/p&gt;
  13978.  
  13979. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;aws lambda create-event-source-mapping &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13980.    &lt;span class=&quot;nt&quot;&gt;--region&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$REGION&lt;/span&gt;                   &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13981.    &lt;span class=&quot;nt&quot;&gt;--event-source-arn&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$STREAM_ARN&lt;/span&gt;     &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13982.    &lt;span class=&quot;nt&quot;&gt;--function-name&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$FUNCTION2_ARN&lt;/span&gt;     &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13983.    &lt;span class=&quot;nt&quot;&gt;--starting-position&lt;/span&gt; LATEST         &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13984.    &lt;span class=&quot;nt&quot;&gt;--enabled&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  13985.  
  13986. &lt;p&gt;Let’s upload another test file.  &lt;em&gt;(Note: there may be a delay after configuring the
  13987.  event source mapping before this upload will result in the second function being
  13988.  triggered.)&lt;/em&gt;&lt;/p&gt;
  13989.  
  13990. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;COUNTER1 123456&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  13991.    | aws s3 &lt;span class=&quot;nb&quot;&gt;cp&lt;/span&gt; - s3://&lt;span class=&quot;nv&quot;&gt;$BUCKET&lt;/span&gt;/&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;PREFIX&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;date&lt;/span&gt; +&lt;span class=&quot;s1&quot;&gt;&apos;%Y-%m-%d&apos;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;/values_i-deadbeef&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  13992.  
  13993. &lt;p&gt;This upload should cause our first Lambda function to update the intermediate state
  13994.  table.  Our second Lambda function should receive an event like this:&lt;/p&gt;
  13995.  
  13996. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-json&quot; data-lang=&quot;json&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;Records&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13997.  &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;     &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;awsRegion&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;us-east-1&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13998.        &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;eventName&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;MODIFY&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  13999.   &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;eventSourceARN&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;$STREAM_ARN&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14000.      &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;eventSource&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;aws:dynamodb&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14001.          &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;eventID&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;SOME_EVENT_ID&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14002.     &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;eventVersion&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;1.0&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14003.         &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;dynamodb&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;StreamViewType&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;NEW_IMAGE&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14004.                        &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;SequenceNumber&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;SOME_SEQUENCE_NUMBER&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14005.                             &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;SizeBytes&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;125&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14006.                        &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;Keys&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14007.                            &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;Date&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;S&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;2015-11-16&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14008.                         &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;Counter&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;S&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;COUNTER1&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14009.                                &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14010.                    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;NewImage&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14011.                      &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;Counter&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;S&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;COUNTER1&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14012.                         &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;Date&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;S&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;2015-11-16&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14013.               &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;InstanceValues&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;M&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;i-foobazzled&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;N&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;54321&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14014.                                          &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;i-beefdead&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;N&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;12345&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14015.                                          &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;i-deadbeef&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;N&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;123456&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}}}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14016.                                 &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14017.                      &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14018.  &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14019. &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
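&lt;p&gt;To make the event shape concrete, here is a minimal sketch of how a handler might unpack such records.  It assumes counter values are integers, and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;changed_items&lt;/code&gt; helper is a name invented for this example; this is not the aggregator code used by this post.&lt;/p&gt;

```python
def changed_items(event):
    """Yield (counter, date, instance_values) for each inserted or modified item."""
    for record in event.get("Records", []):
        # REMOVE events carry no NewImage, so only INSERT and MODIFY are useful here.
        if record.get("eventName") not in ("INSERT", "MODIFY"):
            continue
        image = record["dynamodb"].get("NewImage")
        if image is None:
            continue
        # Assumption: counter values are integers, as in the examples in this post.
        values = {
            instance: int(value["N"])
            for instance, value in image["InstanceValues"]["M"].items()
        }
        yield image["Counter"]["S"], image["Date"]["S"], values


def lambda_handler(event, context):
    """Hypothetical handler: log and return the total of every changed item."""
    results = []
    for counter, date, values in changed_items(event):
        total = sum(values.values())
        print("%s %s total=%d" % (date, counter, total))
        results.append((counter, date, total))
    return results


# A trimmed copy of the event shown above.
event = {"Records": [{
    "eventName": "MODIFY",
    "dynamodb": {
        "NewImage": {
            "Counter": {"S": "COUNTER1"},
            "Date": {"S": "2015-11-16"},
            "InstanceValues": {"M": {
                "i-foobazzled": {"N": "54321"},
                "i-beefdead": {"N": "12345"},
                "i-deadbeef": {"N": "123456"},
            }},
        },
    },
}]}
result = lambda_handler(event, None)
```

&lt;p&gt;For the sample event, the three instance values add up to 190122 for COUNTER1.&lt;/p&gt;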
  14020.  
  14021. &lt;p&gt;If we repeat the same upload, our second Lambda function should not be executed a second
  14022.  time, because our first Lambda function only updates an item when the update would produce a new or
  14023.  changed value.  This is good, but it doesn’t quite fulfill our goal of only processing
  14024.  changed data: if a new instance comes online and emits a value of zero for a counter,
  14025.  our second Lambda function will still see an update despite the global value for that
  14026.  counter remaining unchanged.&lt;/p&gt;
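&lt;p&gt;To make the zero-valued case concrete, here is a minimal sketch in plain Python (no AWS calls). The instance id &lt;code&gt;i-newcomer&lt;/code&gt; and the helper name &lt;code&gt;total&lt;/code&gt; are made up for illustration; the other values are the per-instance numbers shown in the item above.&lt;/p&gt;

```python
def total(instance_values):
    """Sum a DynamoDB-style map of {'instance-id': {'N': '123'}} entries."""
    return sum(int(v['N']) for v in instance_values.values())


before = {
    'i-foobazzled': {'N': '54321'},
    'i-beefdead': {'N': '12345'},
    'i-deadbeef': {'N': '123456'},
}

# A hypothetical new instance comes online and reports zero.
after = dict(before)
after['i-newcomer'] = {'N': '0'}

# The stored item differs, so the table's stream emits an event...
item_changed = before != after
# ...but the aggregate value for the counter does not move.
total_changed = total(before) != total(after)

print(item_changed, total_changed, total(after))
```

&lt;p&gt;The item changes while the sum stays the same, which is exactly the case the second table is meant to filter out.&lt;/p&gt;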
  14027.  
  14028. &lt;h3 id=&quot;rinse-and-repeat&quot;&gt;Rinse and Repeat&lt;/h3&gt;
  14029.  
  14030. &lt;p&gt;To fulfill our goal of only acting on changed data, we’ll create a second DynamoDB table
  14031.  to hold actual per-counter aggregate values.  &lt;em&gt;(A third Lambda function monitoring this
  14032.  second table’s event stream could contain the business logic acting on changed counter
  14033.  values; all of our goals would then have been met.)&lt;/em&gt;&lt;/p&gt;
  14034.  
  14035. &lt;p&gt;Here’s how this part of our updated system will look:&lt;/p&gt;
  14036.  
  14037. &lt;p&gt;&lt;img src=&quot;/images/post_images/counter-upload-aggregator-2.png&quot; alt=&quot;updated counter upload aggregator&quot; /&gt;&lt;/p&gt;
  14038.  
  14039. &lt;p&gt;In our second table, we want to store items which look like this:&lt;/p&gt;
  14040.  
  14041. &lt;pre&gt;
  14042.  
  14043.    {    &apos;Counter&apos;: { &apos;S&apos;: &apos;XXXXXXXXXXXXXXXXXXXXXX&apos; },
  14044.            &apos;Date&apos;: { &apos;S&apos;: &apos;YYYY-MM-DD&apos; },
  14045.           &apos;Value&apos;: { &apos;N&apos;: &apos;1234567&apos;    }
  14046.    }
  14047.  
  14048. &lt;/pre&gt;
  14049.  
  14050. &lt;p&gt;Our second DynamoDB table has the following schema.  Including the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ByDate&lt;/code&gt; global
  14051.  secondary index will allow us to query the table for all items having a particular date,
  14052.  which might come in handy later (although it will roughly double our storage and write
  14053.  costs, because the index projects all attributes):&lt;/p&gt;
  14054.  
  14055. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;nv&quot;&gt;TABLE2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;aggregate-counters&quot;&lt;/span&gt;
  14056.  
  14057. aws dynamodb create-table        &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  14058.  &lt;span class=&quot;nt&quot;&gt;--region&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$REGION&lt;/span&gt;               &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  14059.  &lt;span class=&quot;nt&quot;&gt;--table-name&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$TABLE2&lt;/span&gt;           &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  14060.  &lt;span class=&quot;nt&quot;&gt;--provisioned-throughput&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ReadCapacityUnits&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;10,WriteCapacityUnits&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;10 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  14061.  &lt;span class=&quot;nt&quot;&gt;--key-schema&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;AttributeName&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;Counter,KeyType&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;HASH                     &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  14062.               &lt;span class=&quot;nv&quot;&gt;AttributeName&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;Date,KeyType&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;RANGE                       &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  14063.  &lt;span class=&quot;nt&quot;&gt;--attribute-definitions&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;AttributeName&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;Counter,AttributeType&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;S       &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  14064.                          &lt;span class=&quot;nv&quot;&gt;AttributeName&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;Date,AttributeType&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;S          &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  14065.  &lt;span class=&quot;nt&quot;&gt;--global-secondary-indexes&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;[{
  14066.                 &quot;IndexName&quot;: &quot;ByDate&quot;,
  14067.                 &quot;KeySchema&quot;: [{ &quot;AttributeName&quot;: &quot;Date&quot;,
  14068.                                       &quot;KeyType&quot;: &quot;HASH&quot; }],
  14069.     &quot;ProvisionedThroughput&quot;: { &quot;ReadCapacityUnits&quot;: 1,
  14070.                               &quot;WriteCapacityUnits&quot;: 10 },
  14071.                &quot;Projection&quot;: { &quot;ProjectionType&quot;: &quot;ALL&quot; }
  14072.     }]&apos;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  14073.  
  14074. &lt;p&gt;Let’s update our second Lambda function’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lambda_function.py&lt;/code&gt; file:&lt;/p&gt;
  14075.  
  14076. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;logging&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;boto3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;botocore&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;exceptions&lt;/span&gt;
  14077.  
  14078.  
  14079. &lt;span class=&quot;n&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;aggregate-counters&quot;&lt;/span&gt;
  14080.  
  14081.  
  14082. &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;lambda_handler&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  14083.  &lt;span class=&quot;n&quot;&gt;logging&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;getLogger&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;setLevel&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;logging&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;INFO&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  14084.  &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;record&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Records&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]:&lt;/span&gt;
  14085.    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;aws:dynamodb&apos;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;eventSource&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; \
  14086.        &lt;span class=&quot;ow&quot;&gt;and&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;MODIFY&apos;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;eventName&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;    \
  14087.        &lt;span class=&quot;ow&quot;&gt;and&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;NEW_IMAGE&apos;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;dynamodb&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;StreamViewType&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]:&lt;/span&gt;
  14088.      &lt;span class=&quot;n&quot;&gt;region&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;awsRegion&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  14089.      &lt;span class=&quot;n&quot;&gt;keys&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;dynamodb&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Keys&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  14090.      &lt;span class=&quot;n&quot;&gt;date&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;keys&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Date&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;S&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  14091.      &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;keys&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Counter&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;S&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  14092.      &lt;span class=&quot;n&quot;&gt;new_item&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;dynamodb&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;NewImage&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  14093.      &lt;span class=&quot;n&quot;&gt;instance_values&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;InstanceValues&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;M&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  14094.      &lt;span class=&quot;n&quot;&gt;total_value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;N&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
  14095.                        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;instance_values&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;values&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
  14096.      &lt;span class=&quot;n&quot;&gt;logging&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;updated counter: {} {} {}&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  14097.          &lt;span class=&quot;n&quot;&gt;date&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;total_value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  14098.      &lt;span class=&quot;c1&quot;&gt;# go! thou art counted:
  14099. &lt;/span&gt;      &lt;span class=&quot;n&quot;&gt;lax_update&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;boto3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;resource&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;dynamodb&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;region_name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;region&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; \
  14100.                      &lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;TABLE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  14101.                 &lt;span class=&quot;n&quot;&gt;Key&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Counter&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  14102.                         &lt;span class=&quot;s&quot;&gt;&apos;Date&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;date&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  14103.                 &lt;span class=&quot;n&quot;&gt;UpdateExpression&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;SET #value = :value&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  14104.                 &lt;span class=&quot;n&quot;&gt;ExpressionAttributeNames&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;#value&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;Value&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  14105.                 &lt;span class=&quot;n&quot;&gt;ExpressionAttributeValues&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;:value&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;total_value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  14106.                 &lt;span class=&quot;n&quot;&gt;ConditionExpression&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;NOT #value = :value&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  14107.  &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;
  14108.  
  14109.  
  14110. &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;lax_update&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;**&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kwargs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  14111.  &lt;span class=&quot;k&quot;&gt;try&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  14112.    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;update_item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;**&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kwargs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  14113.  &lt;span class=&quot;k&quot;&gt;except&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;botocore&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;exceptions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ClientError&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;exc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  14114.    &lt;span class=&quot;n&quot;&gt;code&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;exc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;response&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Error&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Code&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  14115.    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;ConditionalCheckFailedException&apos;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;code&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  14116.      &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
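&lt;p&gt;Because the aggregation step is a pure computation over the stream record, it can be exercised locally without touching AWS. The sketch below pulls the same filter-and-sum logic into a hypothetical helper, &lt;code&gt;summarize_record&lt;/code&gt; (it is not part of the function above), and feeds it a hand-built record. The record contains only the fields the handler reads; a real stream record carries more, and the date is a placeholder.&lt;/p&gt;

```python
def summarize_record(record):
    """Return (date, counter, total) for a MODIFY record with a new image, else None."""
    if record.get('eventSource') != 'aws:dynamodb':
        return None
    if record.get('eventName') != 'MODIFY':
        return None
    if record['dynamodb'].get('StreamViewType') != 'NEW_IMAGE':
        return None
    keys = record['dynamodb']['Keys']
    instance_values = record['dynamodb']['NewImage']['InstanceValues']['M']
    total = sum(int(v['N']) for v in instance_values.values())
    return keys['Date']['S'], keys['Counter']['S'], total


# Hand-built test record (assumed shape, trimmed to what the handler uses).
fake_record = {
    'eventSource': 'aws:dynamodb',
    'eventName': 'MODIFY',
    'awsRegion': 'us-west-2',
    'dynamodb': {
        'StreamViewType': 'NEW_IMAGE',
        'Keys': {'Counter': {'S': 'COUNTER1'}, 'Date': {'S': '2015-11-16'}},
        'NewImage': {
            'InstanceValues': {'M': {
                'i-beefdead': {'N': '12345'},
                'i-deadbeef': {'N': '123456'},
            }},
        },
    },
}

# The handler as written skips anything that is not a MODIFY event.
insert_record = dict(fake_record, eventName='INSERT')

print(summarize_record(fake_record))
print(summarize_record(insert_record))
```

&lt;p&gt;Keeping the computation separate from the &lt;code&gt;update_item&lt;/code&gt; call is one way to make the function testable in an ordinary test suite; the write itself still needs an integration test or a stubbed table.&lt;/p&gt;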
  14117.  
  14118. &lt;p&gt;And update our second Lambda function’s code:&lt;/p&gt;
  14119.  
  14120. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;zip &lt;span class=&quot;nt&quot;&gt;-j&lt;/span&gt; lambda-example-3.zip lambda_function.py
  14121.  
  14122. aws lambda update-function-code  &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  14123.  &lt;span class=&quot;nt&quot;&gt;--region&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$REGION&lt;/span&gt;               &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  14124.  &lt;span class=&quot;nt&quot;&gt;--function-name&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$FUNCTION2_ARN&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  14125.  &lt;span class=&quot;nt&quot;&gt;--zip-file&lt;/span&gt; fileb://lambda-example-3.zip&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  14126.  
  14127. &lt;p&gt;As before, we need to grant permission to the execution role used by our second Lambda
  14128.  function so it can update our second DynamoDB table:&lt;/p&gt;
  14129.  
  14130. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;nv&quot;&gt;AGGREGATOR_POLICY&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;python &lt;span class=&quot;nt&quot;&gt;-c&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;import json; print(json.dumps(
  14131.  { &apos;Version&apos;: &apos;2012-10-17&apos;,
  14132.    &apos;Statement&apos;: [
  14133.  
  14134.    { &apos;Effect&apos;: &apos;Allow&apos;,
  14135.      &apos;Action&apos;: [&apos;dynamodb:UpdateItem&apos;],
  14136.    &apos;Resource&apos;: &apos;arn:aws:dynamodb:&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$REGION&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$ACCOUNT&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;:table/&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$TABLE2&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&apos; }
  14137.   ]}))&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;
  14138.  
  14139. aws iam put-role-policy     &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  14140.  &lt;span class=&quot;nt&quot;&gt;--region&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$REGION&lt;/span&gt;          &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  14141.  &lt;span class=&quot;nt&quot;&gt;--role-name&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$ROLE2_NAME&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  14142.  &lt;span class=&quot;nt&quot;&gt;--policy-name&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;update-dynamo-aggregate-counters&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  14143.  &lt;span class=&quot;nt&quot;&gt;--policy-document&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$AGGREGATOR_POLICY&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  14144.  
  14145. &lt;p&gt;When we upload new counter data, our second table should be updated with the total
  14146.  summed value:&lt;/p&gt;
  14147.  
  14148. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;COUNTER1 1000&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  14149.    | aws s3 &lt;span class=&quot;nb&quot;&gt;cp&lt;/span&gt; - s3://&lt;span class=&quot;nv&quot;&gt;$BUCKET&lt;/span&gt;/&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;PREFIX&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;date&lt;/span&gt; +&lt;span class=&quot;s1&quot;&gt;&apos;%Y-%m-%d&apos;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;/values_i-flub&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  14150.  
  14151. &lt;p&gt;This upload should have triggered the following sequence of events:&lt;/p&gt;
  14152.  
  14153. &lt;ol&gt;
  14154.  &lt;li&gt;
  14155.    &lt;p&gt;Our S3 bucket emits an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ObjectCreated&lt;/code&gt; event to our SNS topic.&lt;/p&gt;
  14156.  &lt;/li&gt;
  14157.  &lt;li&gt;
  14158.    &lt;p&gt;Our topic wraps the S3 event in an SNS notification and delivers it to our first
  14159.  Lambda function (invoking it).&lt;/p&gt;
  14160.  &lt;/li&gt;
  14161.  &lt;li&gt;
  14162.    &lt;p&gt;Our first Lambda function executes and updates our first table.&lt;/p&gt;
  14163.  &lt;/li&gt;
  14164.  &lt;li&gt;
  14165.    &lt;p&gt;An event appears on our first table’s event stream, which Lambda has been polling
  14166.  on behalf of our second Lambda function.&lt;/p&gt;
  14167.  &lt;/li&gt;
  14168.  &lt;li&gt;
  14169.    &lt;p&gt;Our second Lambda function executes and updates our second table.&lt;/p&gt;
  14170.  &lt;/li&gt;
  14171. &lt;/ol&gt;
  14172.  
  14173. &lt;p&gt;We should see a value of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;12345+123456+1000+54321 = 191122&lt;/code&gt; in our second table, which
  14174.  is the sum of all uploaded values for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;COUNTER1&lt;/code&gt;.  We haven’t updated any other counter
  14175.  values since setting up our second table and function, so they shouldn’t yet exist in
  14176.  our second table:&lt;/p&gt;
  14177.  
  14178. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;aws dynamodb query        &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  14179.    &lt;span class=&quot;nt&quot;&gt;--region&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$REGION&lt;/span&gt;      &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  14180.    &lt;span class=&quot;nt&quot;&gt;--table-name&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$TABLE2&lt;/span&gt;  &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  14181.    &lt;span class=&quot;nt&quot;&gt;--index-name&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;ByDate&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  14182.    &lt;span class=&quot;nt&quot;&gt;--key-condition-expression&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;#date = :date&apos;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  14183.    &lt;span class=&quot;nt&quot;&gt;--expression-attribute-names&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;{&quot;#date&quot;: &quot;Date&quot;,
  14184.                                   &quot;#value&quot;: &quot;Value&quot;,
  14185.                                   &quot;#counter&quot;: &quot;Counter&quot;}&apos;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  14186.    &lt;span class=&quot;nt&quot;&gt;--expression-attribute-values&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;{&quot;:date&quot;:
  14187.                                     {&quot;S&quot;: &quot;&apos;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;date&lt;/span&gt; +&lt;span class=&quot;s1&quot;&gt;&apos;%Y-%m-%d&apos;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;&quot;}}&apos;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  14188.    &lt;span class=&quot;nt&quot;&gt;--projection-expression&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;#date, #counter, #value&quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  14189.  
  14190. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-json&quot; data-lang=&quot;json&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14191.    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;Count&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14192.    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;Items&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14193.        &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14194.            &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;Date&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14195.                &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;S&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;2015-11-16&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14196.            &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14197.            &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;Value&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14198.                &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;N&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;191122&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14199.            &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14200.            &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;Counter&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14201.                &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;S&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;COUNTER1&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14202.            &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14203.        &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14204.    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14205.    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;ScannedCount&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14206.    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;ConsumedCapacity&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;null&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  14207. &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
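&lt;p&gt;If you would rather check the result programmatically than by eye, the response can be reduced to a mapping from counter name to integer total. This is a small sketch that assumes the query output was captured as JSON text; &lt;code&gt;totals_by_counter&lt;/code&gt; is a name invented here, and the response below is the one shown above with its whitespace condensed.&lt;/p&gt;

```python
import json

response_text = '''
{
    "Count": 1,
    "Items": [
        {"Date": {"S": "2015-11-16"},
         "Value": {"N": "191122"},
         "Counter": {"S": "COUNTER1"}}
    ],
    "ScannedCount": 1,
    "ConsumedCapacity": null
}
'''


def totals_by_counter(response):
    """Map each counter name to its integer total from a query response."""
    return {item['Counter']['S']: int(item['Value']['N'])
            for item in response['Items']}


totals = totals_by_counter(json.loads(response_text))

# Sum of every value uploaded for COUNTER1 in this walkthrough.
expected = 12345 + 123456 + 1000 + 54321

print(totals, totals['COUNTER1'] == expected)
```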
  14208.  
  14209. &lt;p&gt;If you see this, it worked!  Our final system:&lt;/p&gt;
  14210.  
  14211. &lt;p&gt;&lt;img src=&quot;/images/post_images/lambda-counter-aggregator-overview.png&quot; alt=&quot;Lambda-based counter aggregation system overview&quot; /&gt;&lt;/p&gt;
  14212.  
  14213. &lt;h2 id=&quot;onward-and-upward&quot;&gt;Onward and Upward&lt;/h2&gt;
  14214.  
  14215. &lt;p&gt;We have successfully implemented a basic system which can count things we care about by
  14216.  reacting appropriately to S3 uploads.  We also eliminated the need to worry about server
  14217.  configuration tools and management!&lt;/p&gt;
  14218.  
  14219. &lt;p&gt;By implementing two simple Python modules and making some configuration calls, we have:&lt;/p&gt;
  14220.  
  14221. &lt;ul&gt;
  14222.  &lt;li&gt;
  14223.    &lt;p&gt;&lt;strong&gt;Decreased coupling between components:&lt;/strong&gt; Instead of a set of cooperating periodic
  14224. tasks, we have some small Lambda functions each doing one simple thing—reacting
  14225. to events.  Each is also easily testable (1) locally with a test suite, (2) in the
  14226. Lambda console, and (3) from the command line using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;awscli&lt;/code&gt;.&lt;/p&gt;
  14227.  &lt;/li&gt;
  14228.  &lt;li&gt;
  14229.    &lt;p&gt;&lt;strong&gt;Improved responsiveness and scalability:&lt;/strong&gt; Our new system can begin reacting to
  14230. updates with less latency than the original, and we can realize significant
  14231. performance gains simply by deciding to spend more money—or less, if our
  14232. requirements decrease—without any need for a redesign.&lt;/p&gt;
  14233.  &lt;/li&gt;
  14234. &lt;/ul&gt;
  14235.  
  14236. &lt;h3 id=&quot;overcoming-failure&quot;&gt;Overcoming Failure&lt;/h3&gt;
  14237.  
  14238. &lt;p&gt;The example system described in this post glosses over certain issues which would be
  14239.  problematic in a real application, and no discussion involving distributed systems
  14240.  would be complete without addressing some of the ways in which they can fail.&lt;/p&gt;
  14241.  
  14242. &lt;p&gt;We expect our system to finish processing each event within 50 seconds.  What if it
  14243.  doesn’t?&lt;/p&gt;
  14244.  
  14245. &lt;p&gt;With the system we’ve implemented here, the most likely failures are:&lt;/p&gt;
  14246.  
  14247. &lt;ol&gt;
  14248.  &lt;li&gt;
  14249.    &lt;p&gt;Incomplete file processing due to reaching the Lambda invocation time limit.  There
  14250.  might be too many counters in each file for us to process even after making improvements
  14251.  to our implementation and paying for more DynamoDB capacity.&lt;/p&gt;
  14252.  &lt;/li&gt;
  14253.  &lt;li&gt;
  14254.    &lt;p&gt;Incomplete file processing due to a provisioned throughput exception.  Our library
  14255.  will perform retries, but only up to a point before giving up and raising an exception.&lt;/p&gt;
  14256.  &lt;/li&gt;
  14257. &lt;/ol&gt;
  14258.  
  14259. &lt;p&gt;Both timing out and raising an exception are treated as execution errors by Lambda and
  14260.  will result in our function being invoked again with the same event some time later (up
  14261.  to a certain number of retry attempts).&lt;/p&gt;
  14262.  
  14263. &lt;p&gt;While the system we’ve described is idempotent—each instance uploads its current
  14264.  set of counter values, not counter deltas—this retry behavior can still be
  14265.  problematic for our system as-written: a failing invocation &lt;em&gt;C&lt;/em&gt; might be followed by a
  14266.  successful invocation &lt;em&gt;D&lt;/em&gt;, which is then followed by a successful retry of &lt;em&gt;C&lt;/em&gt; (“lost
  14267.  update” problem).  Unless we take steps to mitigate this, older data from &lt;em&gt;C&lt;/em&gt; could wipe
  14268.  out newer data from &lt;em&gt;D&lt;/em&gt; if our bucket uses versioning and they correspond to the same
  14269.  uploading instance.  We can handle this by adding a per-instance event timestamp to
  14270.  our first table’s items and including it in our condition expression.&lt;/p&gt;
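
&lt;p&gt;The timestamp guard described above can be sketched in a few lines. This is a minimal illustration, not the code from this post: a plain dict stands in for the first DynamoDB table, and the names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;put_if_newer&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;event_ts&lt;/code&gt;, and the instance ID are hypothetical.&lt;/p&gt;

```python
def put_if_newer(table, instance_id, event_ts, counters):
    """Store counters for an instance unless the table already holds newer data.

    `table` is a plain dict standing in for the first DynamoDB table.
    Returns True when the write is applied and False when it is rejected.
    """
    current = table.get(instance_id)
    if current is not None:
        # Stand-in for the condition expression: a stale or duplicate event
        # must not overwrite an item written by a later event.
        newest = max(current["event_ts"], event_ts)
        if newest == current["event_ts"]:
            return False
    table[instance_id] = {"event_ts": event_ts, "counters": dict(counters)}
    return True


table = {}
put_if_newer(table, "i-0abc", 200, {"clicks": 7})  # invocation D succeeds
put_if_newer(table, "i-0abc", 100, {"clicks": 3})  # late retry of C is rejected
print(table["i-0abc"]["counters"])
```

&lt;p&gt;With the guard in place, the late retry of &lt;em&gt;C&lt;/em&gt; leaves the data from &lt;em&gt;D&lt;/em&gt; untouched. In a real table the same check would presumably live in the write’s condition expression rather than in a separate read.&lt;/p&gt;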
  14271.  
  14272. &lt;p&gt;We could also encounter a situation where we can never catch up because the incoming
  14273.  data rate exceeds the maximum sustained processing rate.  If this happens after we’ve
  14274.  already made our implementation more efficient and increased our provisioned capacity,
  14275.  we should consider adjusting our schema (i.e., so that we may use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;BatchWriteItem&lt;/code&gt;
  14276.  operations on our first table) and/or rethink our system’s design.  &lt;em&gt;(We also control
  14277.  the systems producing this data, so nothing is set in stone!)&lt;/em&gt;&lt;/p&gt;
  14278.  
  14279. &lt;h4 id=&quot;a-word-on-latency&quot;&gt;A word on latency&lt;/h4&gt;
  14280.  
  14281. &lt;p&gt;We listed reaction times earlier in our cost estimate, based on the following system
  14282.  latency formula:&lt;/p&gt;
  14283.  
  14284. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  14285. %&lt;![CDATA[
  14286. \begin{equation}
  14287.  L = u + d_1 + \max(p_1, d_2 + p_2)
  14288. \end{equation}
  14289. %]]&gt;
  14290. &lt;/script&gt;
  14291.  
  14292. &lt;table class=&quot;math-legend&quot;&gt;
  14293.  &lt;thead&gt;
  14294.    &lt;tr&gt;
  14295.      &lt;th style=&quot;text-align: right&quot;&gt; &lt;/th&gt;
  14296.      &lt;th style=&quot;text-align: left&quot;&gt; &lt;/th&gt;
  14297.    &lt;/tr&gt;
  14298.  &lt;/thead&gt;
  14299.  &lt;tbody&gt;
  14300.    &lt;tr&gt;
  14301.      &lt;td style=&quot;text-align: right&quot;&gt;\(L\)&lt;/td&gt;
  14302.      &lt;td style=&quot;text-align: left&quot;&gt;latency&lt;/td&gt;
  14303.    &lt;/tr&gt;
  14304.    &lt;tr&gt;
  14305.      &lt;td style=&quot;text-align: right&quot;&gt;\(u\)&lt;/td&gt;
  14306.      &lt;td style=&quot;text-align: left&quot;&gt;time until upload&lt;/td&gt;
  14307.    &lt;/tr&gt;
  14308.    &lt;tr&gt;
  14309.      &lt;td style=&quot;text-align: right&quot;&gt;\(d_1\)&lt;/td&gt;
  14310.      &lt;td style=&quot;text-align: left&quot;&gt;delivery time (first lambda function)&lt;/td&gt;
  14311.    &lt;/tr&gt;
  14312.    &lt;tr&gt;
  14313.      &lt;td style=&quot;text-align: right&quot;&gt;\(p_1\)&lt;/td&gt;
  14314.      &lt;td style=&quot;text-align: left&quot;&gt;processing time (first lambda function)&lt;/td&gt;
  14315.    &lt;/tr&gt;
  14316.    &lt;tr&gt;
  14317.      &lt;td style=&quot;text-align: right&quot;&gt;\(d_2\)&lt;/td&gt;
  14318.      &lt;td style=&quot;text-align: left&quot;&gt;delivery time (second lambda function)&lt;/td&gt;
  14319.    &lt;/tr&gt;
  14320.    &lt;tr&gt;
  14321.      &lt;td style=&quot;text-align: right&quot;&gt;\(p_2\)&lt;/td&gt;
  14322.      &lt;td style=&quot;text-align: left&quot;&gt;processing time (second lambda function)&lt;/td&gt;
  14323.    &lt;/tr&gt;
  14324.  &lt;/tbody&gt;
  14325. &lt;/table&gt;
  14326.  
  14327. &lt;p&gt;We estimated our system’s reaction time by ignoring time until upload, assuming minimal
  14328.  delivery time values (0-0.1s), the same processing time values (50s), and a small fudge
  14329.  factor, giving a range of 0.1-51s.  This ignores two important details: (1) concurrency
  14330.  limits for Lambda (100 concurrent executions in each account region by default) and (2)
  14331.  the sharding of our DynamoDB stream (maximum stream reader invocation concurrency is the
  14332.  number of shards in the stream).&lt;/p&gt;
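
&lt;p&gt;As a quick check of the estimate, the formula can be evaluated directly. The function name below is hypothetical; the inputs are the values quoted above, with time until upload ignored and no fudge factor applied.&lt;/p&gt;

```python
def system_latency(u, d1, p1, d2, p2):
    """L = u + d_1 + max(p_1, d_2 + p_2), with all values in seconds."""
    return u + d1 + max(p1, d2 + p2)


# Upload time ignored, 0.1s delivery for each function, 50s processing for each.
slowest = system_latency(u=0.0, d1=0.1, p1=50.0, d2=0.1, p2=50.0)
print(round(slowest, 1))
```

&lt;p&gt;This gives roughly 50.2s, which the fudge factor rounds up to the 51s upper bound.&lt;/p&gt;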
  14333.  
  14334. &lt;p&gt;Item (2) is most problematic in our system—compared to our first Lambda function,
  14335.  our second function will be invoked more frequently (smaller batches of counter updates)
  14336.  and with less concurrency (limited by number of stream shards); therefore, the
  14337.  assumption we made in our cost estimate about “similar function characteristics” was
  14338.  flawed.&lt;/p&gt;
  14339.  
  14340. &lt;p&gt;Experimentation reveals that even with tuning and implementation improvements, our
  14341.  system’s best total latency is 64s (with 100 file uploads of 100 counters every 60s).
  14342.  If this is due to our second Lambda function taking more than 60s to finish processing
  14343.  all events, this is the “never catch up” situation described earlier.  Updating our
  14344.  second Lambda function to use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;BatchWriteItem&lt;/code&gt; instead of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;UpdateItem&lt;/code&gt; requires no
  14345.  schema change and greatly improves performance, bringing total system latency back in
  14346.  line with our expectations.&lt;/p&gt;
  14347.  
  14348. &lt;h2 id=&quot;final-words&quot;&gt;Final words&lt;/h2&gt;
  14349.  
  14350. &lt;p&gt;While no tool is right for every job, I had fun writing this post and I look forward to
  14351.  using Lambda more often in situations where it makes sense to do so.  I hope you found
  14352.  this discussion and example interesting!&lt;/p&gt;
  14353.  
  14354. &lt;p&gt;&lt;strong&gt;Exercises for the reader:&lt;/strong&gt;&lt;/p&gt;
  14355.  
  14356. &lt;ol&gt;
  14357.  &lt;li&gt;
  14358.    &lt;p&gt;Improve the efficiency and implementation of both Lambda functions.  Try using
  14359.  &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;multiprocessing&lt;/code&gt; in the first (but note that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Pool&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Queue&lt;/code&gt; won’t work) and
  14360.  implement exponential backoff/retry for provisioned throughput exception handling.  Use
  14361.  &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;BatchWriteItem&lt;/code&gt; in the second Lambda function.&lt;/p&gt;
  14362.  &lt;/li&gt;
  14363.  &lt;li&gt;
  14364.    &lt;p&gt;Implement a third Lambda function using the recipes in this post which watches our
  14365.  second DynamoDB table for per-counter updates, and which emits the summed values for all
  14366.  of a particular day’s counters back to S3 when any counter’s summed value changes.&lt;/p&gt;
  14367.  &lt;/li&gt;
  14368. &lt;/ol&gt;
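
&lt;p&gt;As a starting point for the first exercise, here is one possible shape for exponential backoff with jitter. It is only a sketch: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ThroughputExceeded&lt;/code&gt; is a hypothetical stand-in for whatever provisioned throughput exception your client library raises, and the attempt count and delays are arbitrary.&lt;/p&gt;

```python
import random
import time


class ThroughputExceeded(Exception):
    """Hypothetical stand-in for a provisioned throughput exception."""


def with_backoff(operation, max_attempts=5, base_delay=0.05, sleep=time.sleep):
    """Call operation(), retrying with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThroughputExceeded:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let Lambda see the failure
            # The delay doubles on each attempt; jitter spreads out retries
            # from invocations that were throttled at the same moment.
            sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.0))
```

&lt;p&gt;Keep the total retry budget well under the Lambda invocation time limit, since a timeout is itself treated as an execution error.&lt;/p&gt;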
  14369.  
  14370. &lt;p&gt;&lt;strong&gt;Want to learn more about AdRoll? &lt;a href=&quot;https://www.adroll.com/about/careers/&quot;&gt;Roll with Us&lt;/a&gt;!&lt;/strong&gt;&lt;/p&gt;
  14371.  
  14372. </description>
  14373.    </item>
  14374.    
  14375.    
  14376.    
  14377.    <item>
  14378.      <title>Rollup: How we use React.js and npm to share UI code at AdRoll</title>
  14379.      <link>https://tech.nextroll.com/blog/frontend/2015/11/12/rollup-react-and-npm-at-adroll.html</link>
  14380.      <pubDate>Thu, 12 Nov 2015 00:00:00 -0800</pubDate>
  14381.      <author></author>
  14382.      <guid isPermaLink="false">https://tech.nextroll.com/blog/frontend/2015/11/12/rollup-react-and-npm-at-adroll</guid>
  14383.      <description>&lt;p&gt;&lt;em&gt;This is the second in a series of three blog posts about Rollup, AdRoll’s UI component library. This post covers how we build individual components and the developer tools supporting them. For background on why we built a UI component library, see &lt;a href=&quot;/blog/frontend/2015/11/05/rollup-shared-ui-components.html&quot;&gt;last week’s post&lt;/a&gt;. For a discussion on what we learned from building Rollup see &lt;a href=&quot;/blog/frontend/2015/11/19/rollup-major-learnings.html&quot;&gt;next week’s post&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
  14384.  
  14385. &lt;p&gt;All the shared UI components we use at AdRoll live in a private GitHub repository called Rollup. In the repo, each component has its own directory whose name matches the component’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;npm&lt;/code&gt; package name. All components, and by extension Rollup &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;npm&lt;/code&gt; package names, are prefixed with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ar-&lt;/code&gt; for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AdRoll&lt;/code&gt;.&lt;/p&gt;
  14386.  
  14387. &lt;p&gt;The repo also contains developer tools to help maintain components. Some developer tools simply need a configuration file, while others may be more complicated and will have their own directory in the repo. Directories for developer tools are not prefixed with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ar-&lt;/code&gt; and so can easily be told apart from the component directories.&lt;/p&gt;
  14388.  
  14389. &lt;p&gt;Now that we’ve covered the overall structure of the repo, we’re going to talk in more detail about the following:&lt;/p&gt;
  14390.  
  14391. &lt;ul&gt;
  14392.  &lt;li&gt;&lt;a href=&quot;#rollup-components&quot;&gt;Rollup components&lt;/a&gt;
  14393.    &lt;ul&gt;
  14394.      &lt;li&gt;&lt;a href=&quot;#src&quot;&gt;src/&lt;/a&gt;
  14395.        &lt;ul&gt;
  14396.          &lt;li&gt;&lt;a href=&quot;#jsx-to-scss-file-ratio&quot;&gt;JSX to SCSS file ratio&lt;/a&gt;&lt;/li&gt;
  14397.          &lt;li&gt;&lt;a href=&quot;#importing-the-component-in-another-application&quot;&gt;Importing the component in another application&lt;/a&gt;&lt;/li&gt;
  14398.          &lt;li&gt;&lt;a href=&quot;#es6-transpilation-in-other-applications&quot;&gt;ES6 transpilation in other applications&lt;/a&gt;&lt;/li&gt;
  14399.        &lt;/ul&gt;
  14400.      &lt;/li&gt;
  14401.      &lt;li&gt;&lt;a href=&quot;#dist&quot;&gt;dist/&lt;/a&gt;&lt;/li&gt;
  14402.    &lt;/ul&gt;
  14403.  &lt;/li&gt;
  14404.  &lt;li&gt;&lt;a href=&quot;#rollup-developer-tools&quot;&gt;Rollup developer tools&lt;/a&gt;
  14405.    &lt;ul&gt;
  14406.      &lt;li&gt;&lt;a href=&quot;#developer-tools-for-rollup-contributors&quot;&gt;Developer tools for Rollup contributors&lt;/a&gt;
  14407.        &lt;ul&gt;
  14408.          &lt;li&gt;&lt;a href=&quot;#automated-testing&quot;&gt;Automated testing&lt;/a&gt;&lt;/li&gt;
  14409.          &lt;li&gt;&lt;a href=&quot;#live-component-examples&quot;&gt;Live component examples&lt;/a&gt;&lt;/li&gt;
  14410.          &lt;li&gt;&lt;a href=&quot;#new-component-generator&quot;&gt;New component generator&lt;/a&gt;&lt;/li&gt;
  14411.        &lt;/ul&gt;
  14412.      &lt;/li&gt;
  14413.      &lt;li&gt;&lt;a href=&quot;#developer-tools-for-rollup-users&quot;&gt;Developer tools for Rollup users&lt;/a&gt;
  14414.        &lt;ul&gt;
  14415.          &lt;li&gt;&lt;a href=&quot;#automatic-documentation-generation&quot;&gt;Automatic documentation generation&lt;/a&gt;&lt;/li&gt;
  14416.          &lt;li&gt;&lt;a href=&quot;#component-asset-bundles&quot;&gt;Component asset bundles&lt;/a&gt;&lt;/li&gt;
  14417.          &lt;li&gt;&lt;a href=&quot;#example-applications&quot;&gt;Example applications&lt;/a&gt;&lt;/li&gt;
  14418.        &lt;/ul&gt;
  14419.      &lt;/li&gt;
  14420.    &lt;/ul&gt;
  14421.  &lt;/li&gt;
  14422. &lt;/ul&gt;
  14423.  
  14424. &lt;h2 id=&quot;rollup-components&quot;&gt;Rollup components&lt;/h2&gt;
  14425.  
  14426. &lt;p&gt;Components are published as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;npm&lt;/code&gt; packages  to a private &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;npm&lt;/code&gt; registry set up using &lt;a href=&quot;http://www.jfrog.com/open-source/&quot;&gt;Artifactory&lt;/a&gt;, and the compiled code is made available via AdRoll’s CDN on &lt;a href=&quot;https://aws.amazon.com/s3/&quot;&gt;S3&lt;/a&gt;. The version of each component is updated according to &lt;a href=&quot;http://semver.org/&quot;&gt;Semver&lt;/a&gt; and version-specific CDN URLs for components look like the following:&lt;/p&gt;
  14427.  
  14428. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&amp;lt;cdn_url&amp;gt;/rollup/ar-component/&amp;lt;version_number&amp;gt;/&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  14429.  
  14430. &lt;p&gt;Components are written as &lt;a href=&quot;https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/import&quot;&gt;ES6 modules&lt;/a&gt;. UI components in Rollup (not utility modules like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;i18n&lt;/code&gt;) are written in &lt;a href=&quot;http://facebook.github.io/react/&quot;&gt;React&lt;/a&gt; and rely heavily on &lt;a href=&quot;http://react-bootstrap.github.io/&quot;&gt;React Bootstrap&lt;/a&gt;. Rollup components that require styling have styles written in &lt;a href=&quot;http://sass-lang.com/&quot;&gt;Sass&lt;/a&gt;. React is not a requirement for building UI components in the Rollup repo but it is highly encouraged. We chose this framework over others for the following reasons:&lt;/p&gt;
  14431.  
  14432. &lt;ul&gt;
  14433.  &lt;li&gt;React components do not require separate template files, thanks to &lt;a href=&quot;https://facebook.github.io/react/docs/jsx-in-depth.html&quot;&gt;JSX&lt;/a&gt;&lt;/li&gt;
  14434.  &lt;li&gt;Each component defines its interface using &lt;a href=&quot;https://facebook.github.io/react/docs/reusable-components.html&quot;&gt;React propTypes&lt;/a&gt;&lt;/li&gt;
  14435.  &lt;li&gt;Component rendering is deterministic based on component props / state instead of direct DOM manipulation&lt;/li&gt;
  14436. &lt;/ul&gt;
  14437.  
  14438. &lt;p&gt;Each component has three directories:&lt;/p&gt;
  14439.  
  14440. &lt;ul&gt;
  14441.  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;src/&lt;/code&gt;&lt;/li&gt;
  14442.  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dist/&lt;/code&gt;&lt;/li&gt;
  14443.  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;examples/&lt;/code&gt; (covered in &lt;a href=&quot;#developer-tools-for-rollup-contributors&quot;&gt;Developer tools for Rollup contributors&lt;/a&gt;)&lt;/li&gt;
  14444. &lt;/ul&gt;
  14445.  
  14446. &lt;p&gt;In this section we’re going to talk about the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;src/&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dist/&lt;/code&gt; directories, using our data table component as an example.&lt;/p&gt;
  14447.  
  14448. &lt;center&gt;
  14449. &lt;img alt=&quot;Rollup Data Table&quot; src=&quot;/images/post_images/rollup_data_table.png&quot; /&gt;
  14450. &lt;/center&gt;
  14451.  
  14452. &lt;p&gt;We’re also going to link to equivalent files in a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;react-component-skeleton&lt;/code&gt; component we have published &lt;a href=&quot;https://github.com/AdRoll/react-component-skeleton/tree/9a6f1e7df1965b342066ca029cc8cce0a3f41295&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
  14453.  
  14454. &lt;h3 id=&quot;src&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;src/&lt;/code&gt;&lt;/h3&gt;
  14455.  
  14456. &lt;p&gt;The source code for all components (regardless of type, UI component vs. utility module) lives in each component’s &lt;a href=&quot;https://github.com/AdRoll/react-component-skeleton/tree/9a6f1e7df1965b342066ca029cc8cce0a3f41295/src&quot;&gt;src/ directory&lt;/a&gt;. This directory is included in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;npm&lt;/code&gt; package as well as made available via AdRoll’s CDN.&lt;/p&gt;
  14457.  
  14458. &lt;p&gt;We’re going to go over the following three aspects of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;src/&lt;/code&gt; directory:&lt;/p&gt;
  14459.  
  14460. &lt;ul&gt;
  14461.  &lt;li&gt;&lt;a href=&quot;#jsx-to-scss-file-ratio&quot;&gt;JSX to SCSS file ratio&lt;/a&gt;&lt;/li&gt;
  14462.  &lt;li&gt;&lt;a href=&quot;#importing-the-component-in-another-application&quot;&gt;Importing the component in another application&lt;/a&gt;&lt;/li&gt;
  14463.  &lt;li&gt;&lt;a href=&quot;#es6-transpilation-in-other-applications&quot;&gt;ES6 transpilation in other applications&lt;/a&gt;&lt;/li&gt;
  14464. &lt;/ul&gt;
  14465.  
  14466. &lt;h4 id=&quot;jsx-to-scss-file-ratio&quot;&gt;JSX to SCSS file ratio&lt;/h4&gt;
  14467.  
  14468. &lt;p&gt;Component JavaScript is written as an ES6 module and lives in &lt;a href=&quot;https://github.com/AdRoll/react-component-skeleton/tree/9a6f1e7df1965b342066ca029cc8cce0a3f41295/src/components&quot;&gt;src/components&lt;/a&gt;, while component styles are written in Sass and live in &lt;a href=&quot;https://github.com/AdRoll/react-component-skeleton/tree/9a6f1e7df1965b342066ca029cc8cce0a3f41295/src/styles&quot;&gt;src/styles&lt;/a&gt;.&lt;/p&gt;
  14469.  
  14470. &lt;p&gt;Components may have as many style sheets or React components as the authors like, but only the &lt;a href=&quot;https://github.com/AdRoll/react-component-skeleton/blob/9a6f1e7df1965b342066ca029cc8cce0a3f41295/src/components/ReactComponentSkeleton.jsx&quot;&gt;.jsx&lt;/a&gt; and &lt;a href=&quot;https://github.com/AdRoll/react-component-skeleton/blob/9a6f1e7df1965b342066ca029cc8cce0a3f41295/src/styles/ar-react-component-skeleton.scss&quot;&gt;.scss&lt;/a&gt; files that share the same name as the component are intended for external use. Other files are intended for internal use only. For example, consider our &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ar-data-table&lt;/code&gt; component that has the following &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.jsx&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.scss&lt;/code&gt; files in its &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;src&lt;/code&gt; directory:&lt;/p&gt;
  14471.  
  14472. &lt;center&gt;
  14473. &lt;img alt=&quot;Rollup Data Table src/&quot; src=&quot;/images/post_images/rollup_data_table_src.png&quot; /&gt;
  14474. &lt;/center&gt;
  14475.  
  14476. &lt;p&gt;As you’ll notice above, we have a 1:1 correspondence between &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.jsx&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.scss&lt;/code&gt; files (with the exception of our Sass variables file, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;_ar-data-table-vars.scss&lt;/code&gt;). For Rollup developers this helps keep files organized, and it’s also obvious where the styling for a given React component lives.&lt;/p&gt;
  14477.  
  14478. &lt;p&gt;All CSS classes in our stylesheets are also prefixed with the name of the component to avoid naming conflicts with applications using the components. For example, the class given to rows in the data table is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ar-data-table-row&lt;/code&gt;. This way of organizing stylesheets was inspired by &lt;a href=&quot;https://gist.github.com/bobbygrace/9e961e8982f42eb91b80&quot;&gt;Trello’s CSS Guide&lt;/a&gt;.&lt;/p&gt;
  14479.  
  14480. &lt;h4 id=&quot;importing-the-component-in-another-application&quot;&gt;Importing the component in another application&lt;/h4&gt;
  14481.  
  14482. &lt;p&gt;Having multiple files for one Rollup component like we do above aims to simplify the lives of Rollup developers. But what about developers integrating Rollup components into their projects and applications? What are they supposed to do with all of those files?&lt;/p&gt;
  14483.  
  14484. &lt;p&gt;We want to make it easy for developers to integrate these components into their application. So we have it set up so that consuming developers only need to include one &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.jsx&lt;/code&gt; and one &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.scss&lt;/code&gt; file to fully integrate the component code and styles into their application.&lt;/p&gt;
  14485.  
  14486. &lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;npm&lt;/code&gt; package for Rollup components’ &lt;a href=&quot;https://github.com/AdRoll/react-component-skeleton/blob/9a6f1e7df1965b342066ca029cc8cce0a3f41295/package.json#L6&quot;&gt;main file points to the React component that has the same name&lt;/a&gt;. For example, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;package.json&lt;/code&gt; of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ar-data-table&lt;/code&gt; has a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;main&lt;/code&gt; field that points to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ar-data-table/src/DataTable.jsx&lt;/code&gt;. So importing the component is as easy as:&lt;/p&gt;
  14487.  
  14488. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-js&quot; data-lang=&quot;js&quot;&gt;&lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;DataTable&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;ar-data-table&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  14489.  
  14490. &lt;p&gt;As for documentation, all props required by the component and its subcomponents are documented in the top-level component (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DataTable.jsx&lt;/code&gt; in our example). We do this even if the top-level React component doesn’t use that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;prop&lt;/code&gt; directly. This reduces the number of files developers need to look at when figuring out which &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;props&lt;/code&gt; a component expects.&lt;/p&gt;
  14491.  
  14492. &lt;p&gt;We also want to make it easy to import component styles and handle static assets. To that end, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ar-data-table.scss&lt;/code&gt; imports all stylesheets associated with subcomponents, SASS variables, and mixins. But what about static assets? How do we make that easy for Rollup consumers?&lt;/p&gt;
  14493.  
  14494. &lt;p&gt;Say you want to use an image for a Rollup component that you are developing. In a stylesheet, it might look like:&lt;/p&gt;
  14495.  
  14496. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-scss&quot; data-lang=&quot;scss&quot;&gt;&lt;span class=&quot;nc&quot;&gt;.class-with-background&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  14497.    &lt;span class=&quot;nl&quot;&gt;background&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;sx&quot;&gt;url(&apos;../../images/background-image.png&apos;)&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  14498. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  14499.  
  14500. &lt;p&gt;However, the relative path above becomes a problem when you try to use the component in another project. The relative path works locally when developing a Rollup component, but is not guaranteed to work in other project setups depending on their directory structures. For that reason, we do not publish static assets (e.g. images and fonts) in the component’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;npm&lt;/code&gt; package. Rather, we push all static assets to our CDN and replace all URLs for static assets in our stylesheets with the following:&lt;/p&gt;
  14501.  
  14502. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-scss&quot; data-lang=&quot;scss&quot;&gt;&lt;span class=&quot;nc&quot;&gt;.class-with-background&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  14503.    &lt;span class=&quot;nl&quot;&gt;background&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;sx&quot;&gt;url(&apos;/* @echo STATIC_ASSETS_URL */background-image.png&apos;)&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  14504. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  14505.  
  14506. &lt;p&gt;We then use &lt;a href=&quot;https://www.npmjs.com/package/gulp-preprocess&quot;&gt;gulp-preprocess&lt;/a&gt;, which lets us replace occurrences of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/* @echo STATIC_ASSETS_URL */&lt;/code&gt; with relative paths for local development and with absolute paths (pointing to the corresponding static assets on our CDN) for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;npm&lt;/code&gt; or publication build. Because absolute URLs are published in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;npm&lt;/code&gt; package, component users do not need to worry about the relative paths that would otherwise be in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;src/styles/&lt;/code&gt; files.&lt;/p&gt;
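
&lt;p&gt;To make the substitution concrete, here is a small Python sketch of the idea. It is not how gulp-preprocess is implemented, and the regular expression, the function name, and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cdn.example.com&lt;/code&gt; URL are all assumptions made for illustration.&lt;/p&gt;

```python
import re

# Matches markers such as /* @echo STATIC_ASSETS_URL */ and captures the name.
ECHO = re.compile(r"/\*\s*@echo\s+(\w+)\s*\*/")


def expand_echo(stylesheet, context):
    """Replace each /* @echo NAME */ marker with context[NAME]."""
    return ECHO.sub(lambda match: context[match.group(1)], stylesheet)


source = (
    ".class-with-background { background: "
    "url('/* @echo STATIC_ASSETS_URL */background-image.png'); }"
)

# Local development build: keep a relative path.
local = expand_echo(source, {"STATIC_ASSETS_URL": "../../images/"})

# Publication build: point at a (hypothetical) CDN location instead.
published = expand_echo(
    source,
    {"STATIC_ASSETS_URL": "https://cdn.example.com/rollup/ar-component/1.0.0/images/"},
)

print(local)
print(published)
```

&lt;p&gt;The same source stylesheet yields a relative URL for local work and an absolute URL for the published package, depending only on the context passed in at build time.&lt;/p&gt;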
  14507.  
  14508. &lt;p&gt;One side effect of the preprocessing is that component users need to load the stylesheets from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dist/&lt;/code&gt; instead of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;src/&lt;/code&gt;. They still only have to load one &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.scss&lt;/code&gt; file in order to include all the styles needed for the component, but instead of loading them from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;src/&lt;/code&gt; the import looks like the following:&lt;/p&gt;
  14509.  
  14510. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-scss&quot; data-lang=&quot;scss&quot;&gt;&lt;span class=&quot;k&quot;&gt;@import&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;ar-data-table/dist/scss/ar-data-table&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  14511.  
  14512. &lt;h4 id=&quot;es6-transpilation-in-other-applications&quot;&gt;ES6 transpilation in other applications&lt;/h4&gt;
  14513.  
  14514. &lt;p&gt;Now let’s go over how we enable Rollup consumers to deal with components written in ES6. We use &lt;a href=&quot;https://github.com/babel/babelify&quot;&gt;babelify&lt;/a&gt; to transpile the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;src/&lt;/code&gt; ES6 code into &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dist/&lt;/code&gt; code. Currently, the assumption is that other teams are also using &lt;a href=&quot;http://browserify.org/&quot;&gt;browserify&lt;/a&gt; to compile their production code. So in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;npm&lt;/code&gt; package for each Rollup component we have &lt;a href=&quot;https://github.com/AdRoll/react-component-skeleton/blob/9a6f1e7df1965b342066ca029cc8cce0a3f41295/package.json#L18-L22&quot;&gt;added the following to its package.json&lt;/a&gt;:&lt;/p&gt;
  14515.  
  14516. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;s2&quot;&gt;&quot;browserify&quot;&lt;/span&gt;: &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
  14517.    &lt;span class=&quot;s2&quot;&gt;&quot;transform&quot;&lt;/span&gt;: &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;
  14518.        &lt;span class=&quot;s2&quot;&gt;&quot;babelify&quot;&lt;/span&gt;
  14519.    &lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;
  14520. &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  14521.  
  14522. &lt;p&gt;…and we have also listed &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;babelify&lt;/code&gt; as a &lt;a href=&quot;https://github.com/AdRoll/react-component-skeleton/blob/9a6f1e7df1965b342066ca029cc8cce0a3f41295/package.json#L24&quot;&gt;peer dependency&lt;/a&gt; so the browserify transform plugin will be &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;npm install&lt;/code&gt;-ed at the root of the parent application. As a result, other projects and applications using Rollup components do not need to set up the ES6 → ES5 transpilation step themselves, provided they’re also using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;browserify&lt;/code&gt;.&lt;/p&gt;
  14523.  
  14524. &lt;h3 id=&quot;dist&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dist/&lt;/code&gt;&lt;/h3&gt;
  14525.  
  14526. &lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dist/&lt;/code&gt; directory is the component’s CDN build. &lt;a href=&quot;https://github.com/AdRoll/react-component-skeleton/tree/9a6f1e7df1965b342066ca029cc8cce0a3f41295/dist&quot;&gt;This folder&lt;/a&gt; contains:&lt;/p&gt;
  14527.  
  14528. &lt;ul&gt;
  14529.  &lt;li&gt;the component &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.scss&lt;/code&gt; files with absolute URLs, so that static assets are loaded from the CDN&lt;/li&gt;
  14530.  &lt;li&gt;an ES5 &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.js&lt;/code&gt; file for each component&lt;/li&gt;
  14531.  &lt;li&gt;CSS for the component&lt;/li&gt;
  14532. &lt;/ul&gt;
  14533.  
  14534. &lt;p&gt;The build step that produces the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.js&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.css&lt;/code&gt; files in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dist/&lt;/code&gt; allows Rollup developers to implement the components in a build-system- and framework-agnostic way. This makes components compatible with a variety of project setups (differing build systems, or no build system at all) and accessible to all engineering teams, regardless of what frontend resources they have or which build system they are (or are not) using.&lt;/p&gt;
  14535.  
  14536. &lt;p&gt;So that Rollup components can be used easily from the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dist/&lt;/code&gt; code, components are attached to a global Rollup namespace. For example, a developer loading the compiled &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.js&lt;/code&gt; file can access the data table component via &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Rollup.DataTable&lt;/code&gt;.&lt;/p&gt;
  14537.  
  14538. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-js&quot; data-lang=&quot;js&quot;&gt;&lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;properties&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;cm&quot;&gt;/* ... */&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  14539.  
  14540. &lt;span class=&quot;nx&quot;&gt;React&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;render&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  14541.    &lt;span class=&quot;nx&quot;&gt;React&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;createElement&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Rollup&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;DataTable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;properties&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  14542.    &lt;span class=&quot;nb&quot;&gt;document&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;getElementById&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;data-table-container&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  14543. &lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  14544.  
  14545. &lt;p&gt;We use &lt;a href=&quot;http://browserify.org/&quot;&gt;browserify&lt;/a&gt; and &lt;a href=&quot;https://babeljs.io&quot;&gt;Babel&lt;/a&gt; to compile our distribution JavaScript from a &lt;a href=&quot;https://github.com/AdRoll/react-component-skeleton/blob/9a6f1e7df1965b342066ca029cc8cce0a3f41295/src/dist.js&quot;&gt;dist.js entry file&lt;/a&gt;. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dist.js&lt;/code&gt; file for our data table component looks like this:&lt;/p&gt;
  14546.  
  14547. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-js&quot; data-lang=&quot;js&quot;&gt;&lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;DataTable&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;../.&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  14548.  
  14549. &lt;span class=&quot;nb&quot;&gt;window&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Rollup&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;window&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Rollup&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{};&lt;/span&gt;
  14550. &lt;span class=&quot;nb&quot;&gt;window&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Rollup&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;DataTable&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;DataTable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  14551.  
  14552. &lt;p&gt;The component CDN build does not include external dependencies, and all dependencies are &lt;a href=&quot;https://github.com/AdRoll/react-component-skeleton/blob/9a6f1e7df1965b342066ca029cc8cce0a3f41295/src/package.json#L5-L9&quot;&gt;assumed to exist in the global scope&lt;/a&gt;.&lt;/p&gt;
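&lt;p&gt;Because the CDN build expects its dependencies in the global scope, a page can check for them before rendering. The following is a minimal sketch, not code from the Rollup repo: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;missingGlobals&lt;/code&gt; is a hypothetical helper, and the global names passed to it are only illustrative.&lt;/p&gt;

```javascript
// Hypothetical helper, not part of Rollup: lists which of the expected
// globals are missing from the given scope (normally `window`).
function missingGlobals(scope, names) {
    return names.filter(function (name) {
        return typeof scope[name] === 'undefined';
    });
}

// In a browser you would pass `window`; a plain object stands in for it here.
// The dependency names below are assumptions chosen for illustration.
var fakeWindow = { React: {}, Rollup: { DataTable: function () {} } };
var missing = missingGlobals(fakeWindow, ['React', 'ReactBootstrap', 'Rollup']);

// Logs the names that still need to be loaded before the component is used.
console.log(missing);
```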
  14553.  
  14554. &lt;p&gt;The CSS in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dist/&lt;/code&gt; folder is compiled from the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gulp-preprocess&lt;/code&gt;ed Sass using &lt;a href=&quot;https://www.npmjs.com/package/node-sass&quot;&gt;node-sass&lt;/a&gt;.&lt;/p&gt;
  14555.  
  14556. &lt;h2 id=&quot;rollup-developer-tools&quot;&gt;Rollup developer tools&lt;/h2&gt;
  14557.  
  14558. &lt;p&gt;Each component has quite a lot going on inside of its directories. Now imagine trying to maintain many components, and also make it easy for other developers to find their way around using several of them. To ease the pain of these tasks, we built developer tools as needed. These tools live in the Rollup repo alongside the components.&lt;/p&gt;
  14559.  
  14560. &lt;p&gt;As we added more and more components, we wanted to accomplish the following:&lt;/p&gt;
  14561.  
  14562. &lt;ul&gt;
  14563.  &lt;li&gt;reduce the number of steps required for initial component setup&lt;/li&gt;
  14564.  &lt;li&gt;reduce the number of things we needed to remember and worry about while reviewing changes to existing components&lt;/li&gt;
  14565.  &lt;li&gt;maintain component stability across version updates&lt;/li&gt;
  14566.  &lt;li&gt;improve the documentation and tooling for those teams interested in using Rollup components&lt;/li&gt;
  14567. &lt;/ul&gt;
  14568.  
  14569. &lt;p&gt;To address these issues the Rollup repo has the following tools:&lt;/p&gt;
  14570.  
  14571. &lt;ul&gt;
  14572.  &lt;li&gt;&lt;a href=&quot;#automated-testing&quot;&gt;automated testing&lt;/a&gt;&lt;/li&gt;
  14573.  &lt;li&gt;&lt;a href=&quot;#live-component-examples&quot;&gt;live component examples&lt;/a&gt;&lt;/li&gt;
  14574.  &lt;li&gt;&lt;a href=&quot;#new-component-generator&quot;&gt;new component generator&lt;/a&gt;&lt;/li&gt;
  14575.  &lt;li&gt;&lt;a href=&quot;#automatic-documentation-generation&quot;&gt;automatic documentation generation&lt;/a&gt;&lt;/li&gt;
  14576.  &lt;li&gt;&lt;a href=&quot;#component-asset-bundles&quot;&gt;component asset bundles&lt;/a&gt;&lt;/li&gt;
  14577.  &lt;li&gt;&lt;a href=&quot;#example-applications&quot;&gt;example applications&lt;/a&gt;&lt;/li&gt;
  14578. &lt;/ul&gt;
  14579.  
  14580. &lt;p&gt;With that in mind, we’re going to talk about two categories of developer tools:&lt;/p&gt;
  14581.  
  14582. &lt;ul&gt;
  14583.  &lt;li&gt;&lt;a href=&quot;#developer-tools-for-rollup-contributors&quot;&gt;developer tools for Rollup contributors&lt;/a&gt;&lt;/li&gt;
  14584.  &lt;li&gt;&lt;a href=&quot;#developer-tools-for-rollup-users&quot;&gt;developer tools for Rollup users&lt;/a&gt;&lt;/li&gt;
  14585. &lt;/ul&gt;
  14586.  
  14587. &lt;h3 id=&quot;developer-tools-for-rollup-contributors&quot;&gt;Developer tools for Rollup contributors&lt;/h3&gt;
  14588.  
  14589. &lt;p&gt;One set of developer tools was built to help Rollup contributors focus on the important things when adding new components, and also when reviewing changes to existing components. These developer tools were also put in place to help maintain the stability of components across version updates.&lt;/p&gt;
  14590.  
  14591. &lt;h4 id=&quot;automated-testing&quot;&gt;Automated testing&lt;/h4&gt;
  14592.  
  14593. &lt;p&gt;Two things we believe to be important are stability and consistency across the repo. For stability, we set up repo-wide infrastructure for &lt;a href=&quot;https://facebook.github.io/jest/&quot;&gt;jest&lt;/a&gt; tests. For consistency, we added a linter to the Rollup repo using &lt;a href=&quot;http://eslint.org/&quot;&gt;eslint&lt;/a&gt;.&lt;/p&gt;
  14594.  
  14595. &lt;p&gt;The linting and testing can be run individually for each component, but they are also run on every pull request by a &lt;a href=&quot;https://jenkins-ci.org/&quot;&gt;Jenkins&lt;/a&gt; “PR Builder” job that updates the GitHub status of the most recently pushed commit. The Jenkins job also checks that the automatically generated component documentation is up to date.&lt;/p&gt;
  14596.  
  14597. &lt;p&gt;The tests and linter were put in place to help reviewers and code authors. They help the reviewer by allowing them to focus on the important things: the reviewer can concentrate on architectural and functional changes and not have to worry about code style. With &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;jest&lt;/code&gt; tests that test the behavior of a component when given certain &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;props&lt;/code&gt;, the reviewer and author can also be sure that internal code changes do not affect the component interface. It’s an added bonus that these tests are run automatically, so neither the reviewer nor the author needs to locally run all the checks that Jenkins runs, and better yet, they do not need to remember to.&lt;/p&gt;
  14598.  
  14599. &lt;h4 id=&quot;live-component-examples&quot;&gt;Live component examples&lt;/h4&gt;
  14600.  
  14601. &lt;p&gt;In addition to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;src/&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dist/&lt;/code&gt; directories, each component also has an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;examples/&lt;/code&gt; directory. The examples directory has three files:&lt;/p&gt;
  14602.  
  14603. &lt;ul&gt;
  14604.  &lt;li&gt;&lt;a href=&quot;https://github.com/AdRoll/react-component-skeleton/blob/9a6f1e7df1965b342066ca029cc8cce0a3f41295/examples/example.jsx&quot;&gt;example.jsx&lt;/a&gt;: imports the component code&lt;/li&gt;
  14605.  &lt;li&gt;&lt;a href=&quot;https://github.com/AdRoll/react-component-skeleton/blob/9a6f1e7df1965b342066ca029cc8cce0a3f41295/examples/example.scss&quot;&gt;example.scss&lt;/a&gt;: imports the component styles&lt;/li&gt;
  14606.  &lt;li&gt;&lt;a href=&quot;https://github.com/AdRoll/react-component-skeleton/blob/9a6f1e7df1965b342066ca029cc8cce0a3f41295/examples/index.html&quot;&gt;index.html&lt;/a&gt;: loads the example code after it’s been compiled.&lt;/li&gt;
  14607. &lt;/ul&gt;
  14608.  
  14609. &lt;p&gt;Every component has a &lt;a href=&quot;https://github.com/AdRoll/react-component-skeleton/blob/9a6f1e7df1965b342066ca029cc8cce0a3f41295/gulpfile.js#L25&quot;&gt;gulp dev&lt;/a&gt; task that compiles the example JavaScript and CSS. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gulp dev&lt;/code&gt; task also sets up a local server that loads &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;index.html&lt;/code&gt; and live reloads the page when there are any changes to the example or the component &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;src/&lt;/code&gt; code.&lt;/p&gt;
  14610.  
  14611. &lt;center&gt;
  14612. &lt;img alt=&quot;Rollup Data Table live example&quot; src=&quot;/images/post_images/rollup_data_table_example.png&quot; /&gt;
  14613. &lt;/center&gt;
  14614.  
  14615. &lt;p&gt;The live component examples give both the reviewer and the author an easy way to interact with the component, without having to work out how to set up an example implementation of the code. The author or contributor is also given an isolated environment in which they can test their changes during development.&lt;/p&gt;
  14616.  
  14617. &lt;h4 id=&quot;new-component-generator&quot;&gt;New component generator&lt;/h4&gt;
  14618.  
  14619. &lt;p&gt;This tool was put in place to simplify the process of creating a new component. Based on what you’ve read about the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dist/&lt;/code&gt; and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;src/&lt;/code&gt; directories, it’s clear that there are several configuration files needed in order for all the assets to be built and for the component to eventually be published. Also, these files and configuration options are almost identical across components.&lt;/p&gt;
  14620.  
  14621. &lt;p&gt;As we added more components to the repo, it became tedious to create very similar files and directory structures over and over again. With the number of files and configs needed, the initial setup phase took about an hour. It was also easy to forget some of the files and configuration options along the way.&lt;/p&gt;
  14622.  
  14623. &lt;p&gt;So we created a new component generator using &lt;a href=&quot;http://yeoman.io/&quot;&gt;Yeoman&lt;/a&gt;. This generator simply asks for the name of the component you would like to create, validates that it starts with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ar-&lt;/code&gt; (per our convention for Rollup components), and then creates all the necessary files and directories for you. The generator also &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;npm install&lt;/code&gt;s some dependencies that are commonly needed for Rollup component development, such as React, React Bootstrap, and babelify. The generator even adds a failing &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;jest&lt;/code&gt; test, to encourage developers to add tests for stability from the get-go (see &lt;a href=&quot;#automated-testing&quot;&gt;Automated Testing&lt;/a&gt; for more information).&lt;/p&gt;
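&lt;p&gt;The name check can be pictured as a small validation function. This is a minimal sketch and not the generator’s actual code: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;validateComponentName&lt;/code&gt; is a hypothetical name, and the error messages are made up. It follows the convention used by Yeoman prompts, where a validator returns &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;true&lt;/code&gt; for valid input or an error message string otherwise.&lt;/p&gt;

```javascript
// Hypothetical helper, not taken from the Rollup generator.
// Returns true when the name is acceptable, or an error message otherwise.
function validateComponentName(name) {
    var prefix = 'ar-';
    // Reject anything that does not start with the Rollup prefix.
    if (typeof name !== 'string' || name.indexOf(prefix) !== 0) {
        return 'Component names must start with "' + prefix + '"';
    }
    // Reject a bare prefix with nothing after it.
    if (name.length === prefix.length) {
        return 'Component names need something after "' + prefix + '"';
    }
    return true;
}

console.log(validateComponentName('ar-data-table'));
console.log(validateComponentName('data-table'));
```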
  14624.  
  14625. &lt;p&gt;At the end of the component generation, the developer has &lt;a href=&quot;https://github.com/AdRoll/react-component-skeleton/tree/9a6f1e7df1965b342066ca029cc8cce0a3f41295&quot;&gt;a skeleton&lt;/a&gt; with all the right files they need to develop and even publish the component later. &lt;a href=&quot;http://gulpjs.com/&quot;&gt;Gulp&lt;/a&gt; tasks are already defined for the developer, and the component “entry” &lt;a href=&quot;https://github.com/AdRoll/react-component-skeleton/blob/9a6f1e7df1965b342066ca029cc8cce0a3f41295/src/styles/ar-react-component-skeleton.scss&quot;&gt;.scss&lt;/a&gt; and &lt;a href=&quot;https://github.com/AdRoll/react-component-skeleton/blob/9a6f1e7df1965b342066ca029cc8cce0a3f41295/src/components/ReactComponentSkeleton.jsx&quot;&gt;.jsx&lt;/a&gt; files are created as well. The &lt;a href=&quot;https://github.com/AdRoll/react-component-skeleton/blob/9a6f1e7df1965b342066ca029cc8cce0a3f41295/package.json&quot;&gt;package.json&lt;/a&gt; for the component even already has a &lt;a href=&quot;https://github.com/AdRoll/react-component-skeleton/blob/9a6f1e7df1965b342066ca029cc8cce0a3f41295/package.json#L6&quot;&gt;main field&lt;/a&gt; pointing to the right &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.jsx&lt;/code&gt; file in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;src/&lt;/code&gt;. The idea here is that the developer can hit the ground running. Ideally, this will also reduce the barrier to entry for new Rollup contributors.&lt;/p&gt;
  14626.  
  14627. &lt;p&gt;This generator not only cut down on our development time by removing the hour needed just to get a new component off the ground, but also cut down on our review time. Reviewers no longer needed to check whether the author had manually added every file a new component requires.&lt;/p&gt;
  14628.  
  14629. &lt;h3 id=&quot;developer-tools-for-rollup-users&quot;&gt;Developer tools for Rollup users&lt;/h3&gt;
  14630.  
  14631. &lt;p&gt;The other half of our developer tools was put in place to make it easy for teams to use Rollup components, regardless of what build system they are (or are not) using, and how comfortable they are with frontend work.&lt;/p&gt;
  14632.  
  14633. &lt;h4 id=&quot;automatic-documentation-generation&quot;&gt;Automatic documentation generation&lt;/h4&gt;
  14634.  
  14635. &lt;p&gt;In order to make it easier for users of Rollup to tell what &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;props&lt;/code&gt; components expect and what each &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;prop&lt;/code&gt; is responsible for (without them having to dig into the source code) we have set up a repo-level gulp task that automatically generates component &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;prop&lt;/code&gt; documentation based on the component &lt;a href=&quot;https://facebook.github.io/react/docs/reusable-components.html#prop-validation&quot;&gt;prop validation&lt;/a&gt; and the comments above them.&lt;/p&gt;
  14636.  
  14637. &lt;p&gt;Using &lt;a href=&quot;https://www.npmjs.com/package/react-docgen&quot;&gt;react-docgen&lt;/a&gt; we can generate a JSON file that represents the prop types each component expects. With the JSON file output by react-docgen and a little &lt;a href=&quot;http://handlebarsjs.com/&quot;&gt;handlebars&lt;/a&gt; magic we can go from this:&lt;/p&gt;
  14638.  
  14639. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-js&quot; data-lang=&quot;js&quot;&gt;&lt;span class=&quot;cm&quot;&gt;/**
  14640. * The columns you want the data table to have. Each column can have the
  14641. * following attributes:
  14642. * - `key` **(required)**: column identifier
  14643. * - `label` **(required)**: Display text for the column. Should already
  14644. *   be translated when passed to the DataTable.
  14645. * - `accessor` **(required)**: Function that returns the relevant value
  14646. *   from a given data item. Later passed to the column `render`
  14647. *   function.
  14648. * - `render`: Function that takes the output of the
  14649. *   `accessor` and returns what should be rendered for a given data item
  14650. *   in that column. Should return either a formatted value or can also
  14651. *   be html. Columns without `render` functions will not be displayed
  14652. *   but can be used for filtering (see the `filters` prop for more
  14653. *   information).
  14654. * - `textAlignment`: Column is center-aligned by default. Use
  14655. *   `DataTable.TEXT_ALIGN_LEFT` or `DataTable.TEXT_ALIGN_RIGHT` to override.
  14656. * - `widthMultiplier`: Number to multiply the width of the column
  14657. *   relative to other columns. By default, columns are of equal width.
  14658. * - `adminOnly`: Whether or not this is an admin-only or
  14659. *   permission-gated column. `adminOnly` columns will only be shown if
  14660. *   the table&apos;s `isAdmin` prop is `true`.
  14661. */&lt;/span&gt;
  14662. &lt;span class=&quot;nx&quot;&gt;columns&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;React&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;PropTypes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;arrayOf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  14663.    &lt;span class=&quot;nx&quot;&gt;React&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;PropTypes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;shape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;
  14664.        &lt;span class=&quot;na&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;React&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;PropTypes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;isRequired&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  14665.        &lt;span class=&quot;na&quot;&gt;label&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;React&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;PropTypes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;isRequired&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  14666.        &lt;span class=&quot;na&quot;&gt;accessor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;React&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;PropTypes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;isRequired&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  14667.        &lt;span class=&quot;na&quot;&gt;render&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;React&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;PropTypes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  14668.        &lt;span class=&quot;na&quot;&gt;textAlignment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;React&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;PropTypes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;oneOf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;
  14669.            &lt;span class=&quot;nx&quot;&gt;TEXT_ALIGN_LEFT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  14670.            &lt;span class=&quot;nx&quot;&gt;TEXT_ALIGN_RIGHT&lt;/span&gt;
  14671.        &lt;span class=&quot;p&quot;&gt;]),&lt;/span&gt;
  14672.        &lt;span class=&quot;na&quot;&gt;widthMultiplier&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;React&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;PropTypes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;number&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  14673.        &lt;span class=&quot;na&quot;&gt;adminOnly&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;React&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;PropTypes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;bool&lt;/span&gt;
  14674.    &lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  14675. &lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  14676.  
  14677. &lt;p&gt;To this:&lt;/p&gt;
  14678.  
  14679. &lt;center&gt;
  14680. &lt;img alt=&quot;Rollup Data Table prop documentation&quot; src=&quot;/images/post_images/rollup_data_table_prop_docs.png&quot; /&gt;
  14681. &lt;/center&gt;
  14682.  
  14683. &lt;p&gt;To generate the documentation, we have defined a handlebars partial for each possible &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;React.PropTypes&lt;/code&gt; validator. For example, a prop of type &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;React.PropTypes.string&lt;/code&gt; has its own handlebars partial. We can then recursively generate documentation for complicated prop types, e.g. shapes with attributes that can be arrays of strings…&lt;/p&gt;
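&lt;p&gt;The recursion can be illustrated without handlebars. The sketch below is not our documentation task: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;describeType&lt;/code&gt; is a hypothetical stand-in for the partials, and the input is assumed to follow the nested &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;name&lt;/code&gt;/&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;value&lt;/code&gt; descriptors that react-docgen emits for prop types.&lt;/p&gt;

```javascript
// Hypothetical stand-in for the per-type handlebars partials.
// Assumed input: descriptors such as { name: 'arrayOf', value: { name: 'string' } }.
function describeType(type) {
    switch (type.name) {
        case 'arrayOf':
            // Recurse into the element type.
            return 'array of ' + describeType(type.value);
        case 'shape':
            // Recurse into every attribute of the shape.
            return 'shape { ' + Object.keys(type.value).map(function (key) {
                var attr = type.value[key];
                return key + ': ' + describeType(attr) + (attr.required ? ' (required)' : '');
            }).join(', ') + ' }';
        default:
            // Leaf types such as string, number, bool and func.
            return type.name;
    }
}

// A shape with an attribute that is an array of strings.
var columns = {
    name: 'arrayOf',
    value: {
        name: 'shape',
        value: {
            key: { name: 'string', required: true },
            tags: { name: 'arrayOf', value: { name: 'string' } }
        }
    }
};

var description = describeType(columns);
console.log(description);
// array of shape { key: string (required), tags: array of string }
```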
  14684.  
  14685. &lt;p&gt;The automatically generated documentation should be updated for each component version. So, in addition to running &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;jest&lt;/code&gt; tests and our linter (see &lt;a href=&quot;#automated-testing&quot;&gt;Automated testing&lt;/a&gt;), our Jenkins PR Builder also checks to see if the documentation is up to date with the code in the PR. The author of a PR can update the documentation based on their changes, by running a top-level gulp task we have set up to do just that.&lt;/p&gt;
  14686.  
  14687. &lt;h4 id=&quot;component-asset-bundles&quot;&gt;Component asset bundles&lt;/h4&gt;
  14688.  
  14689. &lt;p&gt;To help with Rollup adoption and further usage, we created component asset bundles. Each component can have a variety of dependencies that are assumed to be in the global scope when using the code from the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dist/&lt;/code&gt; directory. Teams may also want to use multiple components at a time.
  14690. So, we have published &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;components[.min].{js,css}&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;vendor[.min].{js,css}&lt;/code&gt; asset bundles.&lt;/p&gt;
  14691.  
  14692. &lt;p&gt;The component bundles make it easy for others to use several components at a time. The vendor bundles also make it easy to use one component, or multiple components at a time. They save Rollup users the time of having to figure out which dependencies they need to load for the components to work, and also the time of having to find publicly accessible versions of those dependencies.&lt;/p&gt;
  14693.  
  14694. &lt;h4 id=&quot;example-applications&quot;&gt;Example applications&lt;/h4&gt;
  14695.  
  14696. &lt;p&gt;As developers, we know that sometimes code speaks for itself better than documentation can. In Rollup we have an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;examples/&lt;/code&gt; directory (not to be confused with the component-specific &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;examples/&lt;/code&gt; directories) with two example admin applications in it.&lt;/p&gt;
  14697.  
  14698. &lt;center&gt;
  14699. &lt;img alt=&quot;Rollup Data Table Admin example&quot; src=&quot;/images/post_images/rollup_admin_app.png&quot; /&gt;
  14700. &lt;/center&gt;
  14701.  
  14702. &lt;p&gt;The two examples look identical, each with a form and a data table, but one is built using a simple gulp build, and the other is built by loading the component and vendor bundles from the CDN. Both examples are interactive.&lt;/p&gt;
  14703.  
  14704. &lt;p&gt;We wanted to implement the same application with different build systems (or lack thereof) so that Rollup users can choose what type of build system they would like to use (or not use), and have an example to work off of for both cases. The two applications not only serve as examples, but the code can be easily copied and serve as a good starting point for a new project.&lt;/p&gt;
  14705.  
  14706. &lt;p&gt;In addition to serving as a good template for new projects or pages, both applications are meant to be educational for users of Rollup just starting to work on frontend-leaning projects. Both examples are heavily commented, and the comments are aimed at developers just beginning to learn about frontend work, as opposed to those who already know how &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gulp&lt;/code&gt; works, for example.&lt;/p&gt;
  14707.  
  14708. &lt;p&gt;The example applications serve as a nice complement to the examples for individual components. The examples for individual components show usage of each one individually, while the full applications demonstrate how several could be included on one page.&lt;/p&gt;
  14709.  
  14710. &lt;h2 id=&quot;major-learnings&quot;&gt;Major Learnings&lt;/h2&gt;
  14711.  
  14712. &lt;p&gt;As you can tell, there is a lot going on here, and a lot of thought went into the technical details of the Rollup repo. &lt;a href=&quot;/blog/frontend/2015/11/19/rollup-major-learnings.html&quot;&gt;Next week&lt;/a&gt; we’ll talk about:&lt;/p&gt;
  14713.  
  14714. &lt;ul&gt;
  14715.  &lt;li&gt;what we learned&lt;/li&gt;
  14716.  &lt;li&gt;how some of the technical pieces we talked about here fit into those learnings&lt;/li&gt;
  14717.  &lt;li&gt;how those learnings influenced some of the technical decisions&lt;/li&gt;
  14718. &lt;/ul&gt;
  14719. </description>
  14720.    </item>
  14721.    
  14722.    
  14723.    
  14724.    <item>
  14725.      <title>Rollup: Shared UI components at AdRoll</title>
  14726.      <link>https://tech.nextroll.com/blog/frontend/2015/11/05/rollup-shared-ui-components.html</link>
  14727.      <pubDate>Thu, 05 Nov 2015 00:00:00 -0800</pubDate>
  14728.      <author></author>
  14729.      <guid isPermaLink="false">https://tech.nextroll.com/blog/frontend/2015/11/05/rollup-shared-ui-components</guid>
  14730.      <description>&lt;p&gt;&lt;em&gt;This is the first in a series of three blog posts about Rollup, AdRoll’s UI component library. This post covers how and why we built a component library. See the &lt;a href=&quot;/blog/frontend/2015/11/12/rollup-react-and-npm-at-adroll.html&quot;&gt;second post&lt;/a&gt; for details on the technical implementation, and the &lt;a href=&quot;/blog/frontend/2015/11/19/rollup-major-learnings.html&quot;&gt;third post&lt;/a&gt; for a discussion of what we learned.&lt;/em&gt;&lt;/p&gt;
  14731.  
  14732. &lt;p&gt;The &lt;a href=&quot;http://blog.adroll.com/product/adroll-prospecting&quot;&gt;AdRoll Prospecting launch&lt;/a&gt; forced us to rethink the way we build UIs here at AdRoll. For the June 17th launch, we deployed two new and similar dashboards: the Prospecting dashboard and a new multi-product landing page.&lt;/p&gt;
  14733.  
  14734. &lt;h2 id=&quot;the-old-way-of-doing-things&quot;&gt;The old way of doing things&lt;/h2&gt;
  14735.  
  14736. &lt;p&gt;Until recently, the frontend code for our client-facing dashboard lived in one Git repository and was built using &lt;a href=&quot;http://backbonejs.org/&quot;&gt;Backbone.js&lt;/a&gt;. This made it easy for different projects and features to share UI components. But the downside was that as new features were added to the UI, the code became tightly coupled and the implementation of a single UI component would often end up fragmented across several files. DOM manipulations and application state updates could happen in any one of those files. Not only was it hard to figure out how user interactions were implemented, but the code also became very difficult to maintain.&lt;/p&gt;
  14737.  
  14738. &lt;h2 id=&quot;the-new-way-of-doing-things&quot;&gt;The new way of doing things&lt;/h2&gt;
  14739.  
  14740. &lt;p&gt;Having learned our lesson from the monolithic and complex codebase, we built the two new dashboards as separate web applications in their own repositories. But because the UI wireframes for both dashboards were so similar, we also needed a scalable way to maintain a consistent look and feel across products.&lt;/p&gt;
  14741.  
  14742. &lt;center&gt;
  14743. &lt;img alt=&quot;Product Comparison&quot; src=&quot;/images/post_images/rollup_dashboard_comparison.png&quot; /&gt;
  14744. &lt;/center&gt;
  14745. &lt;p&gt;&lt;br /&gt;&lt;/p&gt;
  14746.  
  14747. &lt;p&gt;So we knew we had to find a way for the UI components to be shared by both projects as easily as we could share code in a single repository. Sharing the code across repositories would eliminate the need for duplicate code and work for the initial implementation and when the UX specification for a component changes.&lt;/p&gt;
  14748.  
  14749. &lt;p&gt;What we ended up with is the Rollup repository (not to be confused with the &lt;a href=&quot;http://rollupjs.org&quot;&gt;other Rollup&lt;/a&gt;) - a reusable component library that now contains implementations for all major components a data-oriented UI should have. Those components include:&lt;/p&gt;
  14750.  
  14751. &lt;ul&gt;
  14752.  &lt;li&gt;Line chart&lt;/li&gt;
  14753.  &lt;li&gt;Multi Select dropdown&lt;/li&gt;
  14754.  &lt;li&gt;Date picker&lt;/li&gt;
  14755.  &lt;li&gt;Data Table&lt;/li&gt;
  14756.  &lt;li&gt;User alerts&lt;/li&gt;
  14757.  &lt;li&gt;Navbar&lt;/li&gt;
  14758.  &lt;li&gt;Data summary bar with spark lines&lt;/li&gt;
  14759.  &lt;li&gt;a shareable i18n module (localizes strings, dates, and currencies)&lt;/li&gt;
  14760. &lt;/ul&gt;
  14761.  
  14762. &lt;p&gt;Each component in Rollup is implemented in &lt;a href=&quot;https://facebook.github.io/react/&quot;&gt;React.js&lt;/a&gt; and was developed so that it could be used in a variety of frameworks and by anyone at AdRoll with a basic understanding of JavaScript. Each Rollup component also uses &lt;a href=&quot;http://semver.org/&quot;&gt;Semantic Versioning&lt;/a&gt; and is published as an &lt;a href=&quot;https://www.npmjs.com/&quot;&gt;npm&lt;/a&gt; package to a private &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;npm&lt;/code&gt; registry we have set up in &lt;a href=&quot;http://www.jfrog.com/open-source/&quot;&gt;Artifactory&lt;/a&gt;.&lt;/p&gt;
  14763.  
  14764. &lt;p&gt;While the framework that components are built in is not set in stone, React is our current convention. Components built in React can be self-contained and have a well-defined interface. We chose React for a few reasons:&lt;/p&gt;
  14765.  
  14766. &lt;ul&gt;
  14767.  &lt;li&gt;React components do not require separate template files, thanks to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;jsx&lt;/code&gt;&lt;/li&gt;
  14768.  &lt;li&gt;Each component defines its interface using &lt;a href=&quot;https://facebook.github.io/react/docs/reusable-components.html&quot;&gt;React propTypes&lt;/a&gt;&lt;/li&gt;
  14769.  &lt;li&gt;Component rendering is deterministic based on component props / state instead of direct DOM manipulation&lt;/li&gt;
  14770. &lt;/ul&gt;
  14771.  
  14772. &lt;h2 id=&quot;why-we-like-the-rollup-repository-so-much&quot;&gt;Why we like the Rollup repository so much&lt;/h2&gt;
  14773.  
  14774. &lt;p&gt;Establishing the Rollup repository for the AdRoll Prospecting launch worked just as expected, if not better. Before the launch we had a resource problem: three experienced frontend engineers had a month and a half to develop and deploy two separate dashboards. By creating the Rollup repo, we were able to fully parallelize the work required to build the two dashboards between all three engineers.&lt;/p&gt;
  14775.  
  14776. &lt;p&gt;In addition to providing an easy way to share code between projects regardless of whether they are in the same repository or not, the component library has a few other benefits:&lt;/p&gt;
  14777.  
  14778. &lt;ul&gt;
  14779.  &lt;li&gt;Each component is developed independently from other projects, build systems, and even other components&lt;/li&gt;
  14780.  &lt;li&gt;Each component’s source and compiled code are published so that components are compatible with a variety of project setups&lt;/li&gt;
  14781.  &lt;li&gt;The Rollup repo serves as a place at AdRoll where we can safely adopt new frontend practices, tools, and frameworks&lt;/li&gt;
  14782.  &lt;li&gt;Teams responsible for building UIs can move faster thanks to shared improvements and bug fixes&lt;/li&gt;
  14783.  &lt;li&gt;Having a unified component library makes maintaining a consistent look and feel across all products easy&lt;/li&gt;
  14784. &lt;/ul&gt;
  14785.  
  14786. &lt;h2 id=&quot;whats-next-for-rollup&quot;&gt;What’s next for Rollup&lt;/h2&gt;
  14787.  
  14788. &lt;p&gt;So, what’s next?&lt;/p&gt;
  14789.  
  14790. &lt;p&gt;Now that the implementations of many of the components have stabilized, we have a few things in mind that we’d like to do:&lt;/p&gt;
  14791.  
  14792. &lt;ul&gt;
  14793.  &lt;li&gt;Drive component adoption&lt;/li&gt;
  14794.  &lt;li&gt;Continue to improve the documentation for the repository and individual components&lt;/li&gt;
  14795.  &lt;li&gt;Encourage contributions to the Rollup repository, in the form of bug fixes or entirely new components&lt;/li&gt;
  14796.  &lt;li&gt;Continue to improve the repository tooling and share frontend knowledge with other teams (supports the bullet point above)&lt;/li&gt;
  14797.  &lt;li&gt;Open-source Rollup components. When we started work on Rollup, we took a look at open-source implementations for some of the more complicated UI components, e.g. the data table and date picker. However, we were not happy with open-source implementations we found since they did not support our use cases and were not easily extensible.&lt;/li&gt;
  14798. &lt;/ul&gt;
  14799.  
  14800. &lt;p&gt;Next week, we’ll be publishing the &lt;a href=&quot;/blog/frontend/2015/11/12/rollup-react-and-npm-at-adroll.html&quot;&gt;second post&lt;/a&gt; in this series, covering the technical details of the Rollup repository.&lt;/p&gt;
  14801. </description>
  14802.    </item>
  14803.    
  14804.    
  14805.    
  14806.    <item>
  14807.      <title>My Journey to Engineering</title>
  14808.      <link>https://tech.nextroll.com/blog/personal/2015/10/29/my-journey-to-engineering.html</link>
  14809.      <pubDate>Thu, 29 Oct 2015 00:00:00 -0700</pubDate>
  14810.      <author></author>
  14811.      <guid isPermaLink="false">https://tech.nextroll.com/blog/personal/2015/10/29/my-journey-to-engineering</guid>
  14812.      <description>&lt;h4 id=&quot;adroll--part-i&quot;&gt;AdRoll — Part I&lt;/h4&gt;
  14813. &lt;p&gt;As a high school math teacher, I never imagined that I would become a software engineer, but here I am.&lt;/p&gt;
  14814.  
  14815. &lt;p&gt;After teaching and running a math tutoring business for five years, I began to yearn for a new challenge and started to tackle &lt;a href=&quot;https://www.codecademy.com/learn/python&quot;&gt;Codecademy’s Python course&lt;/a&gt;. By the end of the section on for loops, I was hooked; the structured way of thinking that programming required jibed with my math brain, and I knew that I needed to find a job that would allow me to explore this new interest. I began to apply to analyst roles, while simultaneously cramming SQL into my brain. After a couple of months of working on these basics on my own, I managed to pass the AdRoll interviews and convince the CFO to let a math teacher help him with company analytics.&lt;/p&gt;
  14816.  
  14817. &lt;p&gt;In my Business Intelligence (BI) role, my appreciation for programming soared to even greater heights.  I devised scripts, in both R and Python, that ultimately allowed me to automate elements of my job, eliminating approximately 60 hours of work per month. This role afforded me the freedom to learn new technical skills that allowed me to excel in my work, while also helping me discover my passion for coding.&lt;/p&gt;
  14818.  
  14819. &lt;p&gt;Given this revelation, I decided that I wanted to transition to a full-time engineering role, but I was acutely aware that there were fundamental gaps in my knowledge base that I would need to fill before qualifying for such a transition. In order to acquire this knowledge, I knew that I would either need to engage in several months of self-study outside of work or fork over thousands of dollars to attend an accelerated engineering bootcamp.  Ultimately, I chose the riskier approach, leaving my secure job and selling my Mini Cooper to fund &lt;a href=&quot;https://hackbrightacademy.com/&quot;&gt;Hackbright&lt;/a&gt;, a 10-week software engineering fellowship.&lt;/p&gt;
  14820.  
  14821. &lt;h4 id=&quot;hackbright&quot;&gt;Hackbright&lt;/h4&gt;
  14822. &lt;p&gt;At Hackbright, I had the luxury of focusing all of my time on coding for 10 weeks, which allowed me to build a more cohesive foundation of knowledge.  The first five weeks were classroom and lab-based education focused on the fundamentals of modern day web development, and in the second five weeks, I built a full-stack web application from scratch.&lt;/p&gt;
  14823.  
  14824. &lt;p&gt;With an enriched skillset, bolstered confidence, and a web app built, I began the process of studying for technical interviews. This was an incredibly daunting task - how would I learn all of the things?!  Did I need to know pre-order, post-order, and in-order traversals of binary trees? Did I need to know how to write merge sort and quicksort?&lt;/p&gt;
  14825.  
  14826. &lt;p&gt;I quickly came to realize that I would never learn all of the things.  It’s not possible, and frankly, it’s what I love about this field of work. There is an endless supply of things to learn, and for this reason, I’ll never get bored.  So for the purposes of studying, I just started chipping away at a list of concepts. Repetition proved to be deceptively valuable: stumbling through a problem the first time, doing it faster the second time, and maybe even discovering a better solution the third time.&lt;/p&gt;
  14827.  
  14828. &lt;p&gt;Once I had accepted that I would never know all of the things, I understood that I would also never feel “ready” to start interviewing. So I just had to start.  This was the most terrifying part of the whole journey, as I knew that I would fail some interviews, and I was very concerned that I wouldn’t get back up after a rejection.  What if I didn’t know something in an interview, knew something but completely blanked, or thought of a better solution as soon as I left the building?  All of these things happened, but shockingly, it didn’t hurt that badly when I fell, and I did get back up because I was hungry and determined.&lt;/p&gt;
  14829.  
  14830. &lt;h4 id=&quot;adroll--part-ii&quot;&gt;AdRoll — Part II&lt;/h4&gt;
  14831. &lt;p&gt;After interviewing and receiving a couple of offers, I decided to return to AdRoll’s engineering team, most significantly because of the supportive culture here. Throughout my time on BI, my hiatus at Hackbright, and my interview process, there was always a team of my peers and superiors who were cheering for me. Knowing that I would have that encouragement as I embarked on this career transition was truly pivotal in my decision.&lt;/p&gt;
  14832.  
  14833. &lt;p&gt;In addition to the aid of the supportive environment in which I have the privilege of working every day, I have also personally taken measures to ensure my smooth transition to engineering. Firstly, I quickly constructed a support team that I could go to with a wide array of questions - everything from setting up my dev environment to codebase-specific concerns.  I took a “spread the love” approach with these folks and would rotate through them. In this way, I was able not only to be proactive about my learning, but also to establish relationships and allies within engineering immediately.&lt;/p&gt;
  14834.  
  14835. &lt;p&gt;Additionally, without the warm, fuzzy blanket of Hackbright wrapped around me, I quickly learned to be my own cheerleader and celebrate my victories, even the tiny ones.  I chose this path because it is hard, and the challenge is part of the thrill, so it is hugely important that I allow myself some (smug) satisfaction when I finally get that unit test to pass.  Those are the glorious moments that make this type of work so rewarding.&lt;/p&gt;
  14836.  
  14837. &lt;p&gt;Lastly, I had to fight off the self-constructed notion that I was an impostor or a fraud and convince myself that I could contribute right away.  Whether it’s teaching someone a new built-in Python method or my fresh perspective on a piece of code that uncovers a gap in test coverage, I have been able to add value, and I will just continue to add more.&lt;/p&gt;
  14838.  
  14839. &lt;p&gt;In closing, I can honestly say it has all been worth it.  The long hours, the leaps of faith, and the various unknowns have all led me to what I have been searching for: unending intellectual stimulation (with a paycheck). I am eternally grateful to those who first gave me a chance at AdRoll and to those who have continued to support me throughout my tenure here.&lt;/p&gt;
  14840. </description>
  14841.    </item>
  14842.    
  14843.    
  14844.    
  14845.    <item>
  14846.      <title>Managing Containerized Data Pipeline Dependencies With Luigi</title>
  14847.      <link>https://tech.nextroll.com/blog/data/2015/10/15/luigi.html</link>
  14848.      <pubDate>Thu, 15 Oct 2015 00:00:00 -0700</pubDate>
  14849.      <author></author>
  14850.      <guid isPermaLink="false">https://tech.nextroll.com/blog/data/2015/10/15/luigi</guid>
  14851.      <description>&lt;p&gt;This is the &lt;a href=&quot;http://tech.adroll.com/blog/data/2015/09/22/data-pipelines-docker.html&quot; title=&quot;Docker @AdRoll&quot;&gt;second article&lt;/a&gt; in a series that describes how we built AdRoll Prospecting. This time, we’ll talk about managing batch job dependencies.&lt;/p&gt;
  14852.  
  14853. &lt;p&gt;Batch processing and data pipeline coordination is one of the oldest applications of computer systems. In the late 1950s, the US Navy and NASA first started using computers for task scheduling, the tasks being assembly jobs performed by people. Controlling people that way is what &lt;a href=&quot;https://en.wikipedia.org/wiki/Topological_sorting&quot; title=&quot;Topological sorting&quot;&gt;topological sorting algorithms&lt;/a&gt; were invented for. Later, in the mid-1960s, one of the killer features of IBM’s novel operating system OS/360 was its extensive mainframe task management capabilities.&lt;/p&gt;
  14854.  
  14855. &lt;p&gt;&lt;img src=&quot;/images/post_images/luigi_RTOS360.png&quot; alt=&quot;RTOS/360&quot; title=&quot;RTOS/360&quot; /&gt;&lt;/p&gt;
  14856.  
  14857. &lt;p&gt;Back then, system-level limitations, like memory and disk space, were major concerns. Therefore, scheduling required a lot of manual hints and tweaks. Nowadays (well, maybe over the last 40 years) this has become less of an issue. One can easily buy hundreds of gigabytes and terabytes of RAM and disk space, enough to store, say, bios and photos of every person on Earth for the price of a family car.&lt;/p&gt;
  14858.  
  14859. &lt;p&gt;Building effective and robust data processing pipelines has become much easier compared to the 1960s; there are quite a few tools available to do this today. To name some, since it is hard to keep a comprehensive list:&lt;/p&gt;
  14860.  
  14861. &lt;ul&gt;
  14862.  &lt;li&gt;&lt;a href=&quot;https://github.com/mesos/chronos&quot; title=&quot;Chronos&quot;&gt;Chronos&lt;/a&gt; from Airbnb&lt;/li&gt;
  14863.  &lt;li&gt;&lt;a href=&quot;https://github.com/airbnb/airflow&quot; title=&quot;Airflow&quot;&gt;Airflow&lt;/a&gt; from Airbnb&lt;/li&gt;
  14864.  &lt;li&gt;&lt;a href=&quot;https://github.com/spotify/luigi&quot; title=&quot;Luigi&quot;&gt;Luigi&lt;/a&gt; from Spotify&lt;/li&gt;
  14865.  &lt;li&gt;&lt;a href=&quot;https://aws.amazon.com/datapipeline/&quot; title=&quot;AWS Data Pipeline&quot;&gt;AWS Data Pipeline&lt;/a&gt;&lt;/li&gt;
  14866. &lt;/ul&gt;
  14867.  
  14868. &lt;p&gt;Many of them can fit the bill for a lot of cases, but still, there are some seemingly trivial things that in our experience are easy to overlook, and overlooking them gets one into trouble. Specifically:&lt;/p&gt;
  14869.  
  14870. &lt;h4 id=&quot;problem-with-time-based-scheduling&quot;&gt;The problem with time-based scheduling&lt;/h4&gt;
  14871.  
  14872. &lt;p&gt;Time-based scheduling is something everyone starts with. For example, analysts want a daily revenue report to be generated at 9AM daily. We can just use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cron(8)&lt;/code&gt; or try something fancier like Chronos for this. And that works fine for simple cases.&lt;/p&gt;
  14873.  
  14874. &lt;p&gt;&lt;img src=&quot;/images/post_images/luigi_timebased_1.png&quot; alt=&quot;timebased_1&quot; title=&quot;Scheduling 1&quot; /&gt;&lt;/p&gt;
  14875.  
  14876. &lt;h4 id=&quot;next-stop-dependencies&quot;&gt;Next stop: dependencies&lt;/h4&gt;
  14877.  
  14878. &lt;p&gt;Not too long after setting up &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cron&lt;/code&gt;, it turns out that we need our task to run after a daily database ETL job runs. A simple solution comes to mind: just run the ETL job at 6AM, then trigger the daily revenue report generation job as soon as the ETL job ends.&lt;/p&gt;
  14879.  
  14880. &lt;p&gt;&lt;img src=&quot;/images/post_images/luigi_timebased_2.png&quot; alt=&quot;timebased_2&quot; title=&quot;Scheduling 2&quot; /&gt;&lt;/p&gt;
  14881.  
  14882. &lt;p&gt;And a bit later, but sooner than one may think, our pipeline will need to support multiple dependencies. Now these are not that easy to express in a time-based system: one cannot unconditionally trigger jobs downstream anymore, as multiple prerequisites must be met before they run.&lt;/p&gt;
  14883.  
  14884. &lt;p&gt;&lt;img src=&quot;/images/post_images/luigi_timebased_3.png&quot; alt=&quot;timebased_3&quot; title=&quot;Scheduling 3&quot; /&gt;&lt;/p&gt;
  14885.  
  14886. &lt;h4 id=&quot;failures-and-reruns&quot;&gt;Failures and reruns&lt;/h4&gt;
  14887.  
  14888. &lt;p&gt;It wouldn’t be that hard to work around these multiple dependencies if not for the fact that jobs fail. Or even worse, they don’t fail, but instead produce incorrect results due to a software bug somewhere upstream.&lt;/p&gt;
  14889.  
  14890. &lt;p&gt;&lt;img src=&quot;/images/post_images/luigi_timebased_4.png&quot; alt=&quot;timebased_4&quot; title=&quot;Scheduling 4&quot; /&gt;&lt;/p&gt;
  14891.  
  14892. &lt;p&gt;And with our fancy time-based scheduling, things get complicated: what if we need to rerun yesterday’s report? Or rather, if yesterday’s report failed and we rerun it, will we get today’s report or yesterday’s? Should we backfill failed jobs or just abandon them and go on? What if that happens across midnight? Will it trigger “today’s” jobs downstream or “yesterday’s”? What if the pipeline starts to take too long? Can we restart or disable part of it?&lt;/p&gt;
  14893.  
  14894. &lt;p&gt;The correct answer is to sit back and maybe reflect a little. Job pipelines are all about inputs, outputs, and dependencies, not about time; time can easily be made another virtual dependency expressed as “wallclock time &amp;gt; X.”&lt;/p&gt;
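&lt;p&gt;As a minimal sketch of that idea (an illustration, not code from our pipeline; the names &lt;code&gt;Job&lt;/code&gt;, &lt;code&gt;not_before&lt;/code&gt;, and &lt;code&gt;is_ready&lt;/code&gt; are made up for this example), a job can treat the clock as one more prerequisite next to its input files:&lt;/p&gt;

```python
import os
import tempfile
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Job:
    """Hypothetical job description: ready when all of its dependencies hold."""
    name: str
    inputs: list = field(default_factory=list)  # paths that must exist
    not_before: datetime = None                 # optional "wallclock time > X"


def is_ready(job, now):
    # File dependencies: every input must already be materialized.
    files_ok = all(os.path.exists(path) for path in job.inputs)
    # Time is just one more dependency, checked the same way.
    time_ok = job.not_before is None or now >= job.not_before
    return files_ok and time_ok


with tempfile.TemporaryDirectory() as workdir:
    etl_output = os.path.join(workdir, "etl-2015-10-15.done")
    report = Job("revenue-report", [etl_output], datetime(2015, 10, 15, 9, 0))

    early = is_ready(report, datetime(2015, 10, 15, 9, 30))    # ETL not finished
    open(etl_output, "w").close()                              # ETL output appears
    too_soon = is_ready(report, datetime(2015, 10, 15, 8, 0))  # before 9AM
    ready = is_ready(report, datetime(2015, 10, 15, 9, 30))    # both conditions met
```

&lt;p&gt;Passing &lt;code&gt;now&lt;/code&gt; in as an argument, instead of reading the clock inside the function, keeps the check itself a pure function of its inputs.&lt;/p&gt;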
  14895.  
  14896. &lt;p&gt;&lt;img src=&quot;/images/post_images/luigi_timebased_5.png&quot; alt=&quot;timebased_5&quot; title=&quot;Scheduling 5&quot; /&gt;&lt;/p&gt;
  14897.  
  14898. &lt;h4 id=&quot;make1&quot;&gt;make(1)&lt;/h4&gt;
  14899.  
  14900. &lt;p&gt;Some 40 years ago, someone already solved this problem in UNIX. If you define your pipelines as a directed acyclic graph of dependencies, they become trivial to reason about. Simply follow a few rules:&lt;/p&gt;
  14901.  
  14902. &lt;ul&gt;
  14903.  &lt;li&gt;
  14904.    &lt;p&gt;Jobs need to be re-runnable. I would like to say &lt;a href=&quot;https://en.wikipedia.org/wiki/Side_effect_(computer_science)&quot; title=&quot;Idempotence&quot;&gt;idempotent&lt;/a&gt; but that’s not always possible in a strict sense. Ideally, they should be pure functions of their inputs. And these inputs should be files.&lt;/p&gt;
  14905.  &lt;/li&gt;
  14906.  &lt;li&gt;
  14907.    &lt;p&gt;If they do interact with the outside world (call external APIs, and such), try to turn that into a materialized snapshot stored somewhere persistently as a file. That way, job failures are easier to reproduce.&lt;/p&gt;
  14908.  &lt;/li&gt;
  14909.  &lt;li&gt;
  14910.    &lt;p&gt;Jobs should be transactional, in the sense that they either completely fail or completely succeed.&lt;/p&gt;
  14911.  &lt;/li&gt;
  14912.  &lt;li&gt;
  14913.    &lt;p&gt;Yesterday’s report generation job and today’s should be separate jobs, meaning they should not output to exactly the same place. Add a timestamp or some kind of job ID to the output path, treating &lt;a href=&quot;https://aws.amazon.com/s3/&quot; title=&quot;Amazon Simple Storage Service&quot;&gt;S3&lt;/a&gt; as write-once storage. One could draw some parallels with &lt;a href=&quot;https://en.wikipedia.org/wiki/Static_single_assignment_form&quot; title=&quot;Static single assignment form&quot;&gt;SSA&lt;/a&gt; here: this makes it much easier to reason about what’s going on in the code.&lt;/p&gt;
  14914.  &lt;/li&gt;
  14915. &lt;/ul&gt;
  14916.  
  14917. &lt;p&gt;Following these principles makes a pipeline much simpler to reason about than your average makefile, assuming you have enough disk space (which, as we know, is cheap) to write all outputs.&lt;/p&gt;
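&lt;p&gt;The rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than our production code: local files stand in for S3, and &lt;code&gt;output_path&lt;/code&gt; and &lt;code&gt;run_job&lt;/code&gt; are hypothetical helpers. Each day gets its own output location, a job whose output already exists is skipped, and the result is written under a temporary name and renamed so that it either fully exists or does not exist at all:&lt;/p&gt;

```python
import os
import tempfile


def output_path(root, job_name, day):
    # One output per job per day, so a rerun never clobbers another day's result.
    return os.path.join(root, job_name, day, "result.txt")


def run_job(root, job_name, day, compute):
    """Run compute() unless this day's output already exists.

    Returns True if the job ran and False if it was skipped.
    """
    target = output_path(root, job_name, day)
    if os.path.exists(target):
        return False  # already done: rerunning is a no-op
    os.makedirs(os.path.dirname(target), exist_ok=True)
    # Write under a temporary name first, then rename. os.replace is atomic on
    # POSIX file systems, so readers never observe a half-written result.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(target))
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(compute())
        os.replace(tmp, target)
    except BaseException:
        os.remove(tmp)  # a failed job leaves no output behind
        raise
    return True
```

&lt;p&gt;Calling &lt;code&gt;run_job&lt;/code&gt; twice for the same day does the work once; a job that raises leaves nothing at the target path, so the next attempt starts from a clean slate.&lt;/p&gt;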
  14918.  
  14919. &lt;h3 id=&quot;luigi&quot;&gt;Luigi&lt;/h3&gt;
  14920.  
  14921. &lt;p&gt;That’s why we now use Spotify’s &lt;a href=&quot;https://github.com/spotify/luigi&quot; title=&quot;Luigi&quot;&gt;Luigi&lt;/a&gt;. It is simple enough, extensible in Python, and doesn’t have weird implicit rules like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;make(1)&lt;/code&gt;—make is a build system, after all, and has its own quirks. All inputs and outputs are in &lt;a href=&quot;https://aws.amazon.com/s3/&quot; title=&quot;Amazon Simple Storage Service&quot;&gt;S3&lt;/a&gt;, which provides great persistence guarantees, unlimited storage space, and read bandwidth that scales more or less linearly with the number of readers.&lt;/p&gt;
  14922.  
  14923. &lt;h4 id=&quot;docker&quot;&gt;Docker&lt;/h4&gt;
  14924.  
  14925. &lt;p&gt;We also wanted to give users the ability to use whatever tools they want for data processing: R comes to mind first, we use C extensively, and Lua and Haskell make appearances as well. This is where &lt;a href=&quot;https://www.docker.com/&quot; title=&quot;Docker&quot;&gt;Docker&lt;/a&gt; comes in handy: we use &lt;a href=&quot;https://github.com/spotify/luigi&quot; title=&quot;Luigi&quot;&gt;Luigi&lt;/a&gt; to manage dependencies and schedule containerized jobs, but normally, no data processing happens in the Luigi worker process itself. See our previous &lt;a href=&quot;http://tech.adroll.com/blog/data/2015/09/22/data-pipelines-docker.html&quot; title=&quot;Docker @AdRoll&quot;&gt;blog post&lt;/a&gt; for more details.&lt;/p&gt;
  14926.  
  14927. &lt;p&gt;Here’s a typical Luigi job:&lt;/p&gt;
  14928.  
  14929. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;PivotRunner&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;luigi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Task&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  14930.    &lt;span class=&quot;n&quot;&gt;blob_path&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;luigi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Parameter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  14931.    &lt;span class=&quot;n&quot;&gt;out_path&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;luigi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Parameter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  14932.    &lt;span class=&quot;n&quot;&gt;segments&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;luigi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Parameter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  14933.  
  14934.    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;requires&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  14935.        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;BlobTask&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;blob_path&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;blob_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  14936.  
  14937.    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;output&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  14938.        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;luigi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;S3Target&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;out_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  14939.  
  14940.    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;run&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  14941.        &lt;span class=&quot;n&quot;&gt;q&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  14942.            &lt;span class=&quot;s&quot;&gt;&quot;cmdline&quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;pivot %s %s&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;out_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;segments&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)],&lt;/span&gt;
  14943.            &lt;span class=&quot;s&quot;&gt;&quot;image&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;docker:5000/pivot:latest&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  14944.            &lt;span class=&quot;s&quot;&gt;&quot;caps&quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;type=r3.4xlarge&quot;&lt;/span&gt;
  14945.        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  14946.        &lt;span class=&quot;n&quot;&gt;quentin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;run_queries&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;pivot&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;json&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dumps&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;max_retries&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  14947.  
  14948. &lt;p&gt;As one can see, it is conceptually very similar to a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Makefile&lt;/code&gt;, but with Python syntax.&lt;/p&gt;
  14949.  
  14950. &lt;p&gt;The only missing piece is a system that would run containers on a cluster of &lt;a href=&quot;https://aws.amazon.com/ec2/&quot; title=&quot;Amazon EC2&quot;&gt;EC2&lt;/a&gt; instances, so that one doesn’t have to care about provisioning and scaling the cluster in the Luigi job. We built an in-house solution called Quentin; it is a simple job queue with a REST API and UI on top.&lt;/p&gt;
  14951.  
  14952. &lt;p&gt;&lt;img src=&quot;/images/post_images/luigi_quentin.png&quot; alt=&quot;Quentin&quot; title=&quot;Quentin&quot; /&gt;&lt;/p&gt;
  14953.  
  14954. &lt;p&gt;Quentin maintains the task queue and feeds metrics to CloudWatch so that AWS Auto Scaling can take care of scaling clusters up and down.&lt;/p&gt;
  14955.  
  14956. &lt;h4 id=&quot;nice-to-have-dynamic-dependencies&quot;&gt;Nice to have: dynamic dependencies&lt;/h4&gt;
  14957.  
  14958. &lt;p&gt;One feature of Luigi that we found very useful is the ability to create nodes in acyclic job graphs dynamically. This makes it possible to shard jobs dynamically based on volume, use Hadoop-like MapReduce patterns, and even recursive reduce schemes, with a few lines of code. We use this heavily, so our typical pipeline consists of thousands of containerized jobs orchestrated by Luigi, created dynamically by a higher-level, “virtual” job.&lt;/p&gt;
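&lt;p&gt;To give a flavor of what such a “virtual” job decides, here is a plain-Python sketch of volume-based sharding and a recursive reduce. It deliberately does not use the Luigi API (in Luigi, dynamic dependencies are expressed by yielding tasks from &lt;code&gt;run()&lt;/code&gt;), and &lt;code&gt;plan_shards&lt;/code&gt; and &lt;code&gt;reduce_tree&lt;/code&gt; are hypothetical names chosen for this example:&lt;/p&gt;

```python
def plan_shards(input_sizes, max_bytes_per_job):
    """Group (name, size) inputs into shards that stay under the size limit.

    An input larger than the limit gets a shard of its own.
    """
    shards, current, current_size = [], [], 0
    for name, size in input_sizes:
        if current and current_size + size > max_bytes_per_job:
            shards.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        shards.append(current)
    return shards


def reduce_tree(items, fan_in, combine):
    """Recursive reduce: combine at most fan_in items per step until one is left."""
    if not items:
        raise ValueError("nothing to reduce")
    if 2 > fan_in:
        raise ValueError("fan_in must be at least 2")
    if len(items) == 1:
        return items[0]
    merged = [combine(items[i:i + fan_in]) for i in range(0, len(items), fan_in)]
    return reduce_tree(merged, fan_in, combine)


# Each shard would become one containerized job; the partial results are then
# merged level by level instead of by a single huge reducer.
shards = plan_shards([("a", 40), ("b", 50), ("c", 30), ("d", 200)], 100)
total = reduce_tree([1, 2, 3, 4, 5], 2, sum)
```

&lt;p&gt;Because the number of shards depends on the data volume, the job graph cannot be written down ahead of time, which is exactly where dynamically created nodes help.&lt;/p&gt;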
  14959.  
  14960. &lt;p&gt;&lt;img src=&quot;/images/post_images/luigi_luigi.png&quot; alt=&quot;Luigi&quot; title=&quot;Luigi&quot; /&gt;&lt;/p&gt;
  14961.  
  14962. &lt;h4 id=&quot;not-necessary-high-availability&quot;&gt;Not necessary: high availability&lt;/h4&gt;
  14963.  
  14964. &lt;p&gt;Following the practices above, making jobs (mostly) idempotent, and storing all inputs and outputs in a resilient store like &lt;a href=&quot;https://aws.amazon.com/s3/&quot; title=&quot;Amazon Simple Storage Service&quot;&gt;S3&lt;/a&gt;, we find that it is not necessary for components like the Luigi server or queue service (Quentin) to be highly available. Distributed systems are &lt;a href=&quot;https://en.wikipedia.org/wiki/Byzantine_fault_tolerance&quot; title=&quot;Byzantine Failures&quot;&gt;hard&lt;/a&gt;, and in this case ensuring zero-downtime survival of server and data center failures would cost us much more in development and operational effort than it is worth.&lt;/p&gt;
  14965.  
  14966. &lt;p&gt;Currently, if anything fails, a new Luigi server comes up and continues from where it stopped, based on results that are already in S3. This works perfectly as long as the data pipeline is not very sensitive to small delays.&lt;/p&gt;
  14967.  
  14968. &lt;h3 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h3&gt;
  14969.  
  14970. &lt;p&gt;There are certainly things to improve in this ecosystem (as our friends at &lt;a href=&quot;http://pachyderm.io&quot; title=&quot;Pachyderm&quot;&gt;Pachyderm&lt;/a&gt; are trying to do), but even with the existing tools, it really pays off when you keep things simple: S3 as a data store, a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;make&lt;/code&gt;-like dependency scheduler on top, a simple non-HA job queue, and Docker as the workhorse. This combination works very well for us in practice.&lt;/p&gt;
  14971.  
  14972. &lt;p&gt;If you’d like more details on our experience, here are &lt;a href=&quot;http://www.slideshare.net/AmazonWebServices/cmp310-data-processing-pipelines-using-containers-spot-instances&quot; title=&quot;slides&quot;&gt;slides&lt;/a&gt; and &lt;a href=&quot;https://www.youtube.com/watch?v=bNkfdU_LDHg&quot; title=&quot;video&quot;&gt;video&lt;/a&gt; of the talk on our data pipeline at &lt;a href=&quot;https://reinvent.awsevents.com/&quot; title=&quot;AWS re:Invent 2015&quot;&gt;re:Invent 2015&lt;/a&gt;.&lt;/p&gt;
  14973.  
  14974. &lt;iframe src=&quot;//www.slideshare.net/slideshow/embed_code/key/JTiSYzYQ76qa93&quot; width=&quot;425&quot; height=&quot;355&quot; frameborder=&quot;0&quot; marginwidth=&quot;0&quot; marginheight=&quot;0&quot; scrolling=&quot;no&quot; style=&quot;border:1px solid #CCC; border-width:1px; margin-bottom:5px; max-width: 100%;&quot; allowfullscreen=&quot;&quot;&gt; &lt;/iframe&gt;
  14975.  
  14976. &lt;p&gt;Video:&lt;/p&gt;
  14977.  
  14978. &lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;https://www.youtube.com/embed/bNkfdU_LDHg&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;
  14979.  
  14980. </description>
  14981.    </item>
  14982.    
  14983.    
  14984.    
  14985.    <item>
  14986.      <title>Petabyte-Scale Data Pipelines with Docker, Luigi and Elastic Spot Instances</title>
  14987.      <link>https://tech.nextroll.com/blog/data/2015/09/22/data-pipelines-docker.html</link>
  14988.      <pubDate>Tue, 22 Sep 2015 00:00:00 -0700</pubDate>
  14989.      <author></author>
  14990.      <guid isPermaLink="false">https://tech.nextroll.com/blog/data/2015/09/22/data-pipelines-docker</guid>
  14991.      <description>&lt;p&gt;This is the first article in a series that describes how we
  14992. built a new data-intensive product, &lt;a href=&quot;http://blog.adroll.com/product/adroll-prospecting&quot; title=&quot;AdRoll Prospecting&quot;&gt;AdRoll Prospecting&lt;/a&gt;, using
  14993. an architecture based on Docker containers.&lt;/p&gt;
  14994.  
  14995. &lt;p&gt;We &lt;a href=&quot;http://www.meetup.com/Bay-Area-Big-Data-with-Docker/&quot; title=&quot;Big Data with Docker meetup&quot;&gt;organized a meetup&lt;/a&gt; about this topic with &lt;a href=&quot;http://pachyderm.io&quot; title=&quot;Pachyderm&quot;&gt;Pachyderm.io&lt;/a&gt;.
  14996. You can watch a recording of our meetup talk here:&lt;/p&gt;
  14997.  
  14998. &lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;https://www.youtube.com/embed/4Nv_DXMNoN0&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;
  14999.  
  15000. &lt;p&gt;and see the slides here:&lt;/p&gt;
  15001.  
  15002. &lt;iframe src=&quot;//slides.com/villetuulos/pb-scale-workflows/embed&quot; width=&quot;576&quot; height=&quot;420&quot; scrolling=&quot;no&quot; frameborder=&quot;0&quot; webkitallowfullscreen=&quot;&quot; mozallowfullscreen=&quot;&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;
  15003.  
  15004. &lt;p&gt;We will elaborate on different aspects of the architecture in upcoming blog
  15005. posts. This post focuses on Docker.&lt;/p&gt;
  15006.  
  15007. &lt;h2 id=&quot;a-modern-data-driven-product&quot;&gt;A Modern Data Driven Product&lt;/h2&gt;
  15008.  
  15009. &lt;p&gt;On June 17th, we launched a new product, &lt;a href=&quot;http://blog.adroll.com/product/adroll-prospecting&quot; title=&quot;AdRoll Prospecting&quot;&gt;AdRoll Prospecting&lt;/a&gt;,
  15010. to public beta. A remarkable thing about the launch is that the product
  15011. was built from scratch by a small engineering team of six people, in
  15012. about six months, and it was released on time.&lt;/p&gt;
  15013.  
  15014. &lt;p&gt;The product does something that is practically the Holy Grail of
  15015. marketing: The core of AdRoll Prospecting is &lt;a href=&quot;http://blog.adroll.com/trends/predictive-behavioral-targeting&quot; title=&quot;Lookalike Model&quot;&gt;a massive-scale machine
  15016. learning model&lt;/a&gt; that is able to predict who is most likely
  15017. interested in your product, amongst billions of cookies AdRoll knows
  15018. something about, and thus can find new customers for your business.&lt;/p&gt;
  15019.  
  15020. &lt;p&gt;A modern, data-driven product like Prospecting is not only about
  15021. machine learning. We also provide an easy-to-use dashboard (built
  15022. using &lt;a href=&quot;https://facebook.github.io/react/&quot; title=&quot;React.js&quot;&gt;React.js&lt;/a&gt;) that allows you to view the performance of your
  15023. prospecting campaigns in detail. Behind the scenes, we are connected to
  15024. AdRoll’s &lt;a href=&quot;http://blog.adroll.com/news/valentino-volonghi-aws-summit-2015&quot; title=&quot;RTB&quot;&gt;Real-Time Bidding engine&lt;/a&gt;, and we have dozens of checks
  15025. and dashboards monitoring the health of the product internally, so we
  15026. can be proactive about any issues affecting customer accounts.&lt;/p&gt;
  15027.  
  15028. &lt;p&gt;Thanks to our experience with AdRoll’s existing &lt;a href=&quot;https://www.adroll.com/product/web-retargeting&quot; title=&quot;AdRoll Retargeting&quot;&gt;retargeting
  15029. product&lt;/a&gt;, we had an idea of what building a complex system like
  15030. this would entail. When we started building AdRoll Prospecting, we
  15031. were able to look back at the lessons learned, and think how we could
  15032. build a flexible and sustainable backend architecture for a massively
  15033. data-driven product like this &lt;a href=&quot;http://firstround.com/review/speed-as-a-habit/&quot; title=&quot;Speed As A Habit&quot;&gt;as quickly as possible&lt;/a&gt; without
  15034. sacrificing robustness or cost of operation.&lt;/p&gt;
  15035.  
  15036. &lt;h2 id=&quot;managing-complexity&quot;&gt;Managing Complexity&lt;/h2&gt;
  15037.  
  15038. &lt;p&gt;We have been very happy with the result, which is the motivation for
  15039. this series of blog articles. Not only has it allowed us to build and
  15040. release the product on time, but we are planning to migrate many existing
  15041. workloads to the new system as well.&lt;/p&gt;
  15042.  
  15043. &lt;p&gt;Probably the most important feature of the new architecture is its lack
  15044. of features, i.e. simplicity. Knowing that the problem we are
  15045. solving is so complex, we did not want to complicate it further with
  15046. a framework that would force us to model the problem &lt;a href=&quot;http://martinfowler.com/bliki/InversionOfControl.html&quot; title=&quot;Frameworks&quot;&gt;in terms of a
  15047. framework&lt;/a&gt;.&lt;/p&gt;
  15048.  
  15049. &lt;p&gt;Our architecture is based on a stack of three complementary layers, which
  15050. heavily rely on well-known, battle-hardened components:&lt;/p&gt;
  15051.  
  15052. &lt;ol&gt;
  15053.  &lt;li&gt;
  15054.    &lt;p&gt;At the lowest level, we use &lt;a href=&quot;https://aws.amazon.com/ec2/spot/&quot; title=&quot;Spot Instances&quot;&gt;AWS Spot Instances&lt;/a&gt; and
  15055. &lt;a href=&quot;https://aws.amazon.com/autoscaling/&quot; title=&quot;AutoScaling&quot;&gt;Auto-Scaling Groups&lt;/a&gt; to provide computing resources on
  15056. demand. Data is stored in &lt;a href=&quot;https://aws.amazon.com/s3/&quot; title=&quot;Amazon Simple Storage Service&quot;&gt;AWS Simple Storage Service (S3)&lt;/a&gt;. We have
  15057. built a simple in-house job queue, Quentin, so we can leverage &lt;a href=&quot;http://docs.aws.amazon.com/AutoScaling/latest/DeveloperGuide/policy_creating.html&quot; title=&quot;CloudWatch&quot;&gt;custom
  15058. CloudWatch metrics&lt;/a&gt; to trigger scaling based on the actual
  15059. length of the job queue.&lt;/p&gt;
  15060.  &lt;/li&gt;
  15061.  &lt;li&gt;
  15062.    &lt;p&gt;We orchestrate a complex graph of interdependent batch jobs using
  15063. &lt;a href=&quot;https://github.com/spotify/luigi&quot; title=&quot;Luigi&quot;&gt;Luigi&lt;/a&gt;, a Python-based, open-source tool for workflow management.&lt;/p&gt;
  15064.  &lt;/li&gt;
  15065.  &lt;li&gt;
  15066.    &lt;p&gt;At the highest level, each individual task (batch job) is packaged as
  15067. a &lt;a href=&quot;https://www.docker.com/&quot; title=&quot;Docker&quot;&gt;Docker container&lt;/a&gt;.&lt;/p&gt;
  15068.  &lt;/li&gt;
  15069. &lt;/ol&gt;
  15070.  
  15071. &lt;p&gt;This stack allows anyone to build new tasks very quickly using Docker,
  15072. define their dependencies in terms of inputs and outputs using Luigi,
  15073. and get them executed on any number of EC2 instances without having to
  15074. worry about provisioning thanks to our scheduler and auto-scaling groups,
  15075. as illustrated below.&lt;/p&gt;
  15076.  
  15077. &lt;p&gt;&lt;img src=&quot;/images/post_images/docker-luigi-arch.png&quot; alt=&quot;Docker, Luigi and Quentin&quot; title=&quot;Architecture Diagram&quot; /&gt;&lt;/p&gt;
  15078.  
  15079. &lt;p&gt;This seemingly simple architecture makes a vast amount of complexity
  15080. manageable. The Docker containers encapsulate jobs written in seven
  15081. different programming languages. Luigi is used to orchestrate a tightly
  15082. connected graph of about 50 of these jobs, and Quentin and Auto-Scaling
  15083. Groups allow us to execute the jobs on an elastic fleet of hundreds of
  15084. the largest EC2 spot instances in a very cost-effective manner.&lt;/p&gt;
  15085.  
  15086. &lt;p&gt;The main benefit of embracing this heterogeneous, &lt;a href=&quot;https://en.wikipedia.org/wiki/The_Cathedral_and_the_Bazaar&quot; title=&quot;The Cathedral and The Bazaar&quot;&gt;bazaar-like
  15087. approach&lt;/a&gt; is that we can safely use the most suitable language,
  15088. instance type, and distribution pattern for each task.&lt;/p&gt;
  15089.  
  15090. &lt;h2 id=&quot;old-new-paradigm&quot;&gt;Old New Paradigm&lt;/h2&gt;
  15091.  
  15092. &lt;p&gt;Containerized batch jobs have been used for decades. Mainframes
  15093. pioneered batch jobs and virtualization in the 1960s. Outside
  15094. mainframes, by the early 2000s, for example, Google was
  15095. isolating batch workloads using operating-system-level virtualization
  15096. in their in-house system, &lt;a href=&quot;http://blog.kubernetes.io/2015/04/borg-predecessor-to-kubernetes.html&quot; title=&quot;Google Kubernetes&quot;&gt;Borg&lt;/a&gt;. A few years later, this
  15097. approach became more widely available through open-source tools such as
  15098. &lt;a href=&quot;http://openvz.org&quot; title=&quot;OpenVZ&quot;&gt;OpenVZ&lt;/a&gt; and &lt;a href=&quot;https://en.wikipedia.org/wiki/LXC&quot; title=&quot;Linux Containers&quot;&gt;LXC&lt;/a&gt;, and later through managed services such as
  15099. &lt;a href=&quot;https://www.joyent.com/blog/hello-manta-bringing-unix-to-big-data&quot; title=&quot;Joyent Manta&quot;&gt;Joyent Manta&lt;/a&gt;, which is based on &lt;a href=&quot;http://en.wikipedia.org/wiki/Solaris_Containers&quot; title=&quot;Solaris Containers&quot;&gt;Solaris Zones&lt;/a&gt;.&lt;/p&gt;
  15100.  
  15101. &lt;p&gt;Containers solve three tricky issues in batch processing, namely:&lt;/p&gt;
  15102.  
  15103. &lt;ol&gt;
  15104.  &lt;li&gt;
  15105.    &lt;p&gt;&lt;em&gt;Job packaging&lt;/em&gt; - a job may depend on a multitude of third party
  15106. libraries that have dependencies of their own. In particular, if the job
  15107. is written in a scripting language, such as Python or R, encapsulating
  15108. the whole environment in a self-contained package is non-trivial.&lt;/p&gt;
  15109.  &lt;/li&gt;
  15110.  &lt;li&gt;
  15111.    &lt;p&gt;&lt;em&gt;Job deployment&lt;/em&gt; - packaged jobs need to be deployable on a host
  15112. machine, and they need to be able to execute on the host without
  15113. altering system-wide resources.&lt;/p&gt;
  15114.  &lt;/li&gt;
  15115.  &lt;li&gt;
  15116.    &lt;p&gt;&lt;em&gt;Resource isolation&lt;/em&gt; - if multiple jobs execute on the same host
  15117. simultaneously, they must share resources nicely and they must not
  15118. interfere with each other.&lt;/p&gt;
  15119.  &lt;/li&gt;
  15120. &lt;/ol&gt;
  15121.  
  15122. &lt;p&gt;All these issues have been technically solvable using existing
  15123. virtualization techniques for decades prior to Docker. What happened
  15124. with Docker is that creating containers became so easy, and socially
  15125. acceptable, that today it is realistic to expect that every analyst,
  15126. data scientist, and junior software engineer is able to package their
  15127. code in a container on their laptop.&lt;/p&gt;
  15128.  
  15129. &lt;p&gt;A result of this is that we can allow, and even encourage, each user
  15130. of the system to use their favorite, most appropriate tools for the
  15131. job, whatever makes them most productive, instead of having to learn
  15132. a new language or a computing paradigm such as MapReduce. Each person is
  15133. naturally responsible for packaging their jobs using Docker, and fixing
  15134. them if anything fails in the container.&lt;/p&gt;
  15135.  
  15136. &lt;p&gt;The result is not only faster time to market, thanks to a more efficient
  15137. use of different skillsets and battle-hardened tools such as R, but also
  15138. a feeling of empowerment across the organization; everyone can access
  15139. data, test new models, and push code to production using the tools they
  15140. know best.&lt;/p&gt;
  15141.  
  15142. &lt;h2 id=&quot;good-behavior-expected&quot;&gt;Good Behavior Expected&lt;/h2&gt;
  15143.  
  15144. &lt;p&gt;Containerized batch jobs are not only about peace, love, and continuous
  15145. deployment. We expect jobs to adhere to certain ground rules.&lt;/p&gt;
  15146.  
  15147. &lt;p&gt;The basic pattern that most of our jobs follow is that they &lt;strong&gt;only&lt;/strong&gt;
  15148. ingest immutable data from S3 as their input, and produce immutable
  15149. data in S3 as their output. If the job sticks with this simple
  15150. pattern, it becomes &lt;a href=&quot;https://en.wikipedia.org/wiki/Side_effect_(computer_science)&quot; title=&quot;Idempotence&quot;&gt;idempotent and side-effect free&lt;/a&gt;.&lt;/p&gt;
  15151.  
  15152. &lt;p&gt;In effect, each container becomes a function in the sense of
  15153. functional programming. We have found that it is very natural to write
  15154. containerized batch jobs with this mindset, from the simplest shell
  15155. scripts to the most complex data manipulation jobs.&lt;/p&gt;
  15156.  
  15157. &lt;p&gt;Another related requirement is that jobs should be atomic. We expect
  15158. that jobs produce a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;_SUCCESS&lt;/code&gt; file, similar to Hadoop, upon successful
  15159. completion. Operations on single files are atomic in S3, so this
  15160. requirement is easy to fulfill. Our task dependencies are set up with
  15161. Luigi so that the output data is considered valid only if the success
  15162. file exists, so partial results are not a concern.&lt;/p&gt;
  15163.  
  15164. &lt;p&gt;We have found this straightforward reliance on files in S3 to be easy
  15165. to explain, troubleshoot, and reason about. S3 is a nearly perfect data
  15166. fabric: it is extremely scalable, has an amazing record of uptime,
  15167. and it is cheap to use. The empowering effect of Docker would be much
  15168. diminished if data were less easily accessible.&lt;/p&gt;
  15169.  
  15170. &lt;h2 id=&quot;next-up-luigi&quot;&gt;Next Up: Luigi&lt;/h2&gt;
  15171.  
  15172. &lt;p&gt;Containerized batch jobs benefit from a clear &lt;a href=&quot;https://en.wikipedia.org/wiki/Separation_of_concerns&quot; title=&quot;Separation of Concerns&quot;&gt;separation of
  15173. concerns&lt;/a&gt;. Not only is it easier to write jobs this way, but
  15174. it also helps to ensure that each job takes only minutes, or at most a
  15175. few hours, to execute, which is crucial when dealing with ephemeral spot
  15176. instances.&lt;/p&gt;
  15177.  
  15178. &lt;p&gt;An inevitable result of this is that the system becomes a complex
  15179. hairball of interdependent jobs. &lt;a href=&quot;https://github.com/spotify/luigi&quot; title=&quot;Luigi&quot;&gt;Luigi&lt;/a&gt; has proven to be a
  15180. direct way to manage this dependency graph, which is the topic of
  15181. a future blog post.&lt;/p&gt;
  15182.  
  15183. </description>
  15184.    </item>
  15185.    
  15186.    
  15187.    
  15188.    <item>
  15189.      <title>Factorization Machines</title>
  15190.      <link>https://tech.nextroll.com/blog/data-science/2015/08/25/factorization-machines.html</link>
  15191.      <pubDate>Tue, 25 Aug 2015 00:00:00 -0700</pubDate>
  15192.      <author></author>
  15193.      <guid isPermaLink="false">https://tech.nextroll.com/blog/data-science/2015/08/25/factorization-machines</guid>
  15194.      <description>&lt;p&gt;The data science team at AdRoll is constantly working to improve our programmatic bidding algorithm, BidIQ. One
  15195. recent improvement to BidIQ has been the introduction of a novel modeling technique called Factorization Machines
  15196. (FM). The FM model allows us to consider interactions between all predictive variables, rather than only certain
  15197. manually selected interactions. Thanks to the FM model, we are able to use the exact same feature information to
  15198. more accurately value every potential impression that comes our way. As a result, our advertisers are collecting almost
  15199. 7% more clicks for only 2% more cost.&lt;/p&gt;
  15200.  
  15201. &lt;p&gt;To motivate the FM model, let’s consider a simplified ad bidding algorithm that bids using three variables:
  15202. web domain, advertiser, and user location. In the first iteration of our modeling, we would learn a weight \({w_i}\)
  15203. for each web domain, advertiser, and user location. Our bid, \(B_{linear}\), would then be based on a function
  15204. \(f\) of a linear combination of the weight vector \(\vec{w}\). That is:&lt;/p&gt;
  15205.  
  15206. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  15207. %&lt;![CDATA[
  15208. \begin{equation}
  15209. B_{linear} = f\left(w_0 + \sum_{i=1}^n w_i x_i\right)
  15210. \end{equation}
  15211. %]]&gt;
  15212. &lt;/script&gt;
  15213.  
  15214. &lt;p&gt;where \(w_0\) is an intercept term and \(x_i\) is the value of the \(i\)th feature. This modeling technique
  15215. proved extremely powerful, but the drawback was that the model only learned the effect of our three variables individually
  15216. rather than in combination. But what if an advertiser’s ads perform particularly well on some specific domains? What
  15217. if users in San Francisco are much more interested in some advertisers than others? What if a particular domain tends
  15218. to receive exceptionally valuable traffic from New York? One popular way of solving this problem at scale is to supplement
  15219. the standard linear model equation with an additional term to model pairwise feature interactions.&lt;/p&gt;
  15220.  
  15221. &lt;p&gt;The simplest such strategy is to learn a weight \(w_{ij}\) for each feature combination. We would then make our bid,
  15222. \(B_{quadratic}\), according to:&lt;/p&gt;
  15223.  
  15224. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  15225. %&lt;![CDATA[
  15226. \begin{equation}
  15227. B_{quadratic} = f\left(w_0 + \sum_{i=1}^n w_i x_i + \sum_{i=1}^n \sum_{j=i+1}^n w_{ij} x_i x_j\right)
  15228. \end{equation}
  15229. %]]&gt;
  15230. &lt;/script&gt;
  15231.  
  15232. &lt;p&gt;where \(\mathbf{W} \in \mathbb{R}^{n \times n}\) are additional parameters to be learned. Unfortunately, this naïve approach
  15233. will not work for two main reasons. First, the size of the model is now \(\mathcal{O}(n^2)\), which has terrible
  15234. implications for both the amount of memory needed to store the model, and the time it takes to train the model. Second,
  15235. our dataset is too sparse for us to learn all of the weights \(\mathbf{W}\) reliably. That is, for almost all pairs
  15236. \((i,j)\) we would not have enough training examples to learn the weight \(w_{ij}\) well. To improve the pairwise feature
  15237. interaction modeling of \(B_{quadratic}\), we could include only the weights \(w_{ij}\) with corresponding
  15238. features sufficiently dense and informative. In the second iteration of our modeling, we did just this. The problem is
  15239. that selecting the valuable weights \(w_{ij}\) is difficult to do algorithmically, and does not scale to a large number of possible
  15240. feature combinations.&lt;/p&gt;
  15241.  
  15242. &lt;p&gt;FM solves the problem of considering pairwise feature interactions. Indeed, it allows us to bid based on
  15243. reliable information from every pairwise combination of variables in the model. Just as important, FM allows us to do
  15244. this in a remarkably efficient way in terms of both time and space complexity. So how exactly does FM work? FM models
  15245. pairwise feature interactions as the inner product of low dimensional vectors. More precisely, our bid with the FM model,
  15246. \(B_{FM}\), becomes:&lt;/p&gt;
  15247.  
  15248. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  15249. %&lt;![CDATA[
  15250. \begin{equation}
  15251. B_{FM} = f\left(w_0 + \sum_{i=1}^n w_i x_i + \sum_{i=1}^n \sum_{j=i+1}^n \left&lt;\vec{v}_i,\vec{v}_j \right&gt; x_i x_j\right)
  15252. \end{equation}
  15253. %]]&gt;
  15254. &lt;/script&gt;
  15255.  
  15256. &lt;p&gt;where \(\mathbf{V} \in \mathbb{R}^{n \times k}\) are additional parameters to be learned, and \(\vec{v}_i\) is the \(i\)th
  15257. row of \(\mathbf{V}\). Notice that the FM model replaces the weights \(w_{ij}\) by \(\left&amp;lt;\vec{v}_i,\vec{v}_j\right&amp;gt;\). From
  15258. a modeling perspective, this is powerful because each feature ends up embedded in an inner product space, with similar features
  15259. embedded near one another. As a result, the FM model is even able to learn the effect of interactions between features which
  15260. appear together very infrequently in the training data. Furthermore, the size of the FM model is now a more reasonable
  15261. \(\mathcal{O}(kn)\), where the latent dimension \(k\) is a hyperparameter of the FM model.&lt;/p&gt;
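
&lt;p&gt;To get a feel for the difference in model size, consider purely illustrative values (assumptions for this example, not figures from our production model) of \(n = 10^6\) features and latent dimension \(k = 8\). The quadratic model then needs one weight per feature pair, while FM only needs the entries of \(\mathbf{V}\):&lt;/p&gt;

&lt;script type=&quot;math/tex; mode=display&quot;&gt;
%&lt;![CDATA[
\begin{equation}
\frac{n(n-1)}{2} = \frac{10^6 \left(10^6 - 1\right)}{2} \approx 5 \times 10^{11}
\qquad \text{versus} \qquad
kn = 8 \times 10^6
\end{equation}
%]]&gt;
&lt;/script&gt;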
  15262.  
  15263. &lt;p&gt;However, the FM model is not an obvious improvement from a computation time perspective. In fact, computation of the pairwise
  15264. feature interaction term now &lt;em&gt;appears&lt;/em&gt; to require \(\mathcal{O}(kn^2)\) operations rather than the \(\mathcal{O}(n^2)\) of the naïve
  15265. interaction modeling solution. Yet this is not the case; after some manipulation we may rewrite the nonlinear FM term
  15266. as follows &lt;sup&gt;&lt;a href=&quot;#footnotes&quot;&gt;(1)&lt;/a&gt;&lt;/sup&gt;:&lt;/p&gt;
  15267.  
  15268. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  15269. %&lt;![CDATA[
  15270. \begin{equation}
  15271. \sum_{i=1}^n \sum_{j=i+1}^n \left&lt;\vec{v}_i,\vec{v}_j \right&gt; x_i x_j =
  15272. \frac{1}{2} \sum_{j=1}^k \left( \left(\sum_{i=1}^n v_{ij}x_i \right)^2 - \sum_{i=1}^n v_{ij}^2 x_i^2 \right)
  15273. \end{equation}
  15274. %]]&gt;
  15275. &lt;/script&gt;
  15276.  
  15277. &lt;p&gt;The right hand side of this equation can clearly be computed in \(\mathcal{O}(kn)\) time. This is the magic of FM: &lt;em&gt;we are able to
  15278. compute the term that models all pairwise interactions in linear time.&lt;/em&gt; As a result, we are able to train the FM model in a time
  15279. proportional to the time needed to train the linear model.&lt;/p&gt;
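
&lt;p&gt;For readers who want the intermediate steps behind the identity above, the manipulation can be sketched as follows, along the lines of &lt;a href=&quot;#references&quot;&gt;[1]&lt;/a&gt;. The latent index is written as \(f\) here, a naming choice made only for this sketch, to avoid a clash with the feature index \(j\); renaming \(f\) to \(j\) in the last line recovers the right hand side above.&lt;/p&gt;

&lt;script type=&quot;math/tex; mode=display&quot;&gt;
%&lt;![CDATA[
\begin{equation}
\begin{aligned}
\sum_{i=1}^n \sum_{j=i+1}^n \left&lt;\vec{v}_i,\vec{v}_j \right&gt; x_i x_j
&amp;= \frac{1}{2} \left( \sum_{i=1}^n \sum_{j=1}^n \left&lt;\vec{v}_i,\vec{v}_j \right&gt; x_i x_j - \sum_{i=1}^n \left&lt;\vec{v}_i,\vec{v}_i \right&gt; x_i^2 \right) \\
&amp;= \frac{1}{2} \sum_{f=1}^k \left( \sum_{i=1}^n \sum_{j=1}^n v_{if} v_{jf} x_i x_j - \sum_{i=1}^n v_{if}^2 x_i^2 \right) \\
&amp;= \frac{1}{2} \sum_{f=1}^k \left( \left( \sum_{i=1}^n v_{if} x_i \right)^2 - \sum_{i=1}^n v_{if}^2 x_i^2 \right)
\end{aligned}
\end{equation}
%]]&gt;
&lt;/script&gt;

&lt;p&gt;The first step uses the symmetry of the inner product: the full double sum counts every pair twice and also includes the diagonal terms, which are subtracted. The second step expands the inner products coordinate by coordinate, and the third factors the double sum over \(i\) and \(j\) into a square. Each of the \(k\) outer terms then needs only a single pass over the \(n\) features.&lt;/p&gt;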
  15280.  
  15281. &lt;p&gt;Apart from the optimization described above, we used several other strategies to reduce training time and improve convergence.
  15282. First, the FM model is trained using stochastic gradient descent (SGD), an algorithm known for its speed. &lt;sup&gt;&lt;a href=&quot;#footnotes&quot;&gt;(2)&lt;/a&gt;&lt;/sup&gt;
  15283. Second, we parallelized SGD using the so-called “HOGWILD!” scheme &lt;a href=&quot;#references&quot;&gt;[2]&lt;/a&gt;. This lock-free, parallel SGD is notable
  15284. because it avoids blocking any threads by allowing race conditions during model updates. Additionally, we found AdaGrad to be a very effective
  15285. learning-rate schedule for training an FM model with SGD &lt;a href=&quot;#references&quot;&gt;[3]&lt;/a&gt;. Finally, we used single instruction multiple data (SIMD)
  15286. computation to vectorize calculations involving the matrix \(\mathbf{V}\), quartering FM model training time. See below for a D programming
  15287. language (pseudocode) example of the vectorized calculation of the FM model equation. With the optimizations described, we are able to train
  15288. an FM model in approximately the same amount of time it took to train our previous model. This is outstanding as the FM model is
  15289. significantly more predictive.&lt;/p&gt;
  15290.  
  15291. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-d&quot; data-lang=&quot;d&quot;&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;FMModel&lt;/span&gt;
  15292. &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  15293.    &lt;span class=&quot;kt&quot;&gt;uint&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;k&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// FM latent dimension&lt;/span&gt;
  15294.    &lt;span class=&quot;kt&quot;&gt;float&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;w0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  15295.    &lt;span class=&quot;kt&quot;&gt;float&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;w_vector&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  15296.    &lt;span class=&quot;n&quot;&gt;float4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[][]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v_matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  15297.  
  15298.    &lt;span class=&quot;c1&quot;&gt;// ... other FM model variables and functions ...&lt;/span&gt;
  15299.  
  15300.    &lt;span class=&quot;kt&quot;&gt;float&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;model_equation&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Observation&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;obs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  15301.    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  15302.        &lt;span class=&quot;kt&quot;&gt;float&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;linear_term&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;w0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  15303.        &lt;span class=&quot;k&quot;&gt;foreach&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Feature&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;feat&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;obs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
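  15304.            &lt;span class=&quot;n&quot;&gt;linear_term&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;w_vector&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;feat&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;feat&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weight&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;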
  15305.  
  15306.        &lt;span class=&quot;n&quot;&gt;float4&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;non_linear_term4&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  15307.        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;k&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;++)&lt;/span&gt;
  15308.        &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  15309.            &lt;span class=&quot;n&quot;&gt;float4&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;first_term4&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;toFloat4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  15310.            &lt;span class=&quot;n&quot;&gt;float4&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;second_term4&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;toFloat4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  15311.            &lt;span class=&quot;k&quot;&gt;foreach&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Feature&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;feat&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;obs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  15312.            &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  15313.                &lt;span class=&quot;n&quot;&gt;float4&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;update&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v_matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;feat&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;feat&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weight&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  15314.                &lt;span class=&quot;n&quot;&gt;first_term4&lt;/span&gt;  &lt;span class=&quot;p&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;update&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  15315.                &lt;span class=&quot;n&quot;&gt;second_term4&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;update&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;update&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  15316.            &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  15317.            &lt;span class=&quot;n&quot;&gt;first_term4&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;first_term4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  15318.  
  15319.            &lt;span class=&quot;n&quot;&gt;non_linear_term4&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;first_term4&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;second_term4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  15320.        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  15321.  
  15322.        &lt;span class=&quot;kt&quot;&gt;float&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;non_linear_term&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;non_linear_term4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sum_float4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  15323.        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;linear_term&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;non_linear_term&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  15324.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  15325. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  15326.  
  15327. &lt;p&gt;Moving forward, we continue to search for modeling and model training improvements. If Factorization Machines or machine learning
  15328. are the types of things that interest you, please consider applying to &lt;a href=&quot;https://www.adroll.com/about/careers&quot;&gt;work at AdRoll&lt;/a&gt;!&lt;/p&gt;
  15329.  
  15330. &lt;h4 id=&quot;-references&quot;&gt;&lt;a name=&quot;references&quot;&gt;&lt;/a&gt; References&lt;/h4&gt;
  15331. &lt;ol&gt;
  15332.  &lt;li&gt;Rendle, Steffen. &lt;em&gt;Factorization Machines.&lt;/em&gt;&lt;/li&gt;
  15333.  &lt;li&gt;Recht, Benjamin and Re, Christopher and Wright, Stephen and Niu, Feng. &lt;em&gt;Hogwild: A Lock-Free Approach to Parallelizing
  15334. Stochastic Gradient Descent.&lt;/em&gt;&lt;/li&gt;
  15335.  &lt;li&gt;Duchi, John and Hazan, Elad and Singer, Yoram. &lt;em&gt;Adaptive Subgradient Methods for Online Learning and Stochastic
  15336. Optimization.&lt;/em&gt;&lt;/li&gt;
  15337. &lt;/ol&gt;
  15338.  
  15339. &lt;h4 id=&quot;-footnotes&quot;&gt;&lt;a name=&quot;footnotes&quot;&gt;&lt;/a&gt; Footnotes&lt;/h4&gt;
  15340. &lt;ol&gt;
  15341.  &lt;li&gt;For details, please see &lt;a href=&quot;#references&quot;&gt;[1]&lt;/a&gt;.&lt;/li&gt;
  15342.  &lt;li&gt;The main downside to using FM is that the resulting optimization problem is no longer convex. As a result, many effective
  15343. optimization techniques are no longer at our disposal when learning the parameters \(\vec{w}\) and \(\mathbf{V}\) of the FM model.
  15344. Fortunately, SGD still works quite well in this non-convex setting.&lt;/li&gt;
  15345. &lt;/ol&gt;
  15346. </description>
  15347.    </item>
  15348.    
  15349.    
  15350.    
  15351.    <item>
  15352.      <title>Easy (and slightly crazy) way of writing bash scripts</title>
  15353.      <link>https://tech.nextroll.com/blog/terminal/2015/08/24/bash-command-runner.html</link>
  15354.      <pubDate>Mon, 24 Aug 2015 00:00:00 -0700</pubDate>
  15355.      <author></author>
  15356.      <guid isPermaLink="false">https://tech.nextroll.com/blog/terminal/2015/08/24/bash-command-runner</guid>
  15357.      <description>&lt;p&gt;I’ve always been interested in creating and improving our developer tools.
  15358. Things like &lt;a href=&quot;http://tech.adroll.com/blog/web/2014/03/05/adding-jscs-to-your-commit-hook.html&quot;&gt;git
  15359. hooks&lt;/a&gt;,
  15360. &lt;a href=&quot;https://chrome.google.com/webstore/detail/commit-filter-for-github/lbapcjnnpkmjkfdenpigkpmaelpgldmk?hl=en&quot;&gt;browser
  15361. extensions&lt;/a&gt; or &lt;a href=&quot;http://tech.adroll.com/blog/terminal/2014/09/26/introducing-aircontrol-control-airplay-through-terminal.html&quot;&gt;command line scripts&lt;/a&gt;.&lt;/p&gt;
  15362.  
  15363. &lt;p&gt;When I work on such tools, one of my goals is to make them easy but also
  15364. pleasant to use. “Pleasantness” typically comes in the form of being the least
  15365. verbose necessary, displaying nice and meaningful colors or making informed
  15366. assumptions.&lt;/p&gt;
  15367.  
  15368. &lt;p&gt;In this post, I talk about a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bash&lt;/code&gt; “command runner”. It allows me to easily
  15369. write informative and complicated scripts in a way that satisfies my
  15370. “pleasantness” requirement.&lt;/p&gt;
  15371.  
  15372. &lt;p&gt;To keep you reading, here it is in action:&lt;/p&gt;
  15373.  
  15374. &lt;p&gt;&lt;img src=&quot;/images/post_images/command-runner-demo.gif&quot; alt=&quot;command_runner in action&quot; /&gt;&lt;/p&gt;
  15375.  
  15376. &lt;h1 id=&quot;the-problem&quot;&gt;The problem&lt;/h1&gt;
  15377.  
  15378. &lt;p&gt;I recently worked on streamlining some steps of our internationalization
  15379. (a.k.a.
  15380. “&lt;a href=&quot;http://en.wikipedia.org/wiki/Internationalization_and_localization#Naming&quot;&gt;i18n&lt;/a&gt;”)
  15381. process.&lt;/p&gt;
  15382.  
  15383. &lt;p&gt;For the most part, our process is as described by LingoHub in &lt;a href=&quot;http://blog.lingohub.com/2013/03/integrating-internationalization-in-a-git-workflow/&quot;&gt;this
  15384. post&lt;/a&gt;.
  15385. I.e. we have a branch named &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;i18n&lt;/code&gt; that starts as a copy of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;master&lt;/code&gt;. All
  15386. feature branches are merged into &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;i18n&lt;/code&gt; when they are ready to be merged into
  15387. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;master&lt;/code&gt;. That lets us get translations started for all the feature branches
  15388. being reviewed.&lt;/p&gt;
  15389.  
  15390. &lt;p&gt;This process has a few recurring tasks for developers, like merging our current
  15391. branch into the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;i18n&lt;/code&gt; branch or pushing translation files to an external API.
  15392. The two common themes with these tasks are that:&lt;/p&gt;
  15393.  
  15394. &lt;ul&gt;
  15395.  &lt;li&gt;They have a significant number of steps.&lt;/li&gt;
  15396.  &lt;li&gt;They involve git operations: checking a branch out, pulling from the remote,
  15397. merging, etc.&lt;/li&gt;
  15398. &lt;/ul&gt;
  15399.  
  15400. &lt;p&gt;I’ll use the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;push&lt;/code&gt; example: our new strings are ready to be sent to
  15401. translators, so we need to compile our messages into a POT file and &lt;em&gt;push&lt;/em&gt; it to
  15402. &lt;a href=&quot;http://www.smartling.com/&quot;&gt;Smartling&lt;/a&gt;, the online software we use to manage
  15403. work with our translators.&lt;/p&gt;
  15404.  
  15405. &lt;p&gt;For that one task we need to run the following commands:&lt;/p&gt;
  15406.  
  15407. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;c&quot;&gt;# Starting from any branch&lt;/span&gt;
  15408. &lt;span class=&quot;c&quot;&gt;# In case something has been changed:&lt;/span&gt;
  15409. git stash
  15410. git checkout i18n
  15411. git pull origin i18n
  15412. &lt;span class=&quot;c&quot;&gt;# Extract messages from codebase using PyBabel&lt;/span&gt;
  15413. python setup.py extract_messages
  15414. &lt;span class=&quot;c&quot;&gt;# Pushes to our translation SaaS (using separate command line tool)&lt;/span&gt;
  15415. smartling push
  15416. &lt;span class=&quot;c&quot;&gt;# Commit the whole thing&lt;/span&gt;
  15417. git add adroll/dotcom/i18n/adroll.pot
  15418. git commit &lt;span class=&quot;nt&quot;&gt;-m&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;Update POT file&apos;&lt;/span&gt;
  15419. git push origin i18n
  15420. &lt;span class=&quot;c&quot;&gt;# Get back to previous state&lt;/span&gt;
  15421. git checkout -
  15422. git stash pop &lt;span class=&quot;c&quot;&gt;# maybe?&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  15423.  
  15424. &lt;p&gt;Obviously, this is too many steps to type by hand every time. We’re programmers,
  15425. let’s automate this!&lt;/p&gt;
  15426.  
  15427. &lt;h2 id=&quot;the-naive-solution&quot;&gt;The naive solution&lt;/h2&gt;
  15428.  
  15429. &lt;p&gt;The first option would be to write a naive script: put all the above commands
  15430. in a file, start it with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;#!/usr/bin/env bash&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;set -e&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;chmod +x&lt;/code&gt; it and you’re good to go!&lt;/p&gt;
  15431.  
  15432. &lt;p&gt;Well… until you realize that you first need to check that you &lt;em&gt;did&lt;/em&gt; stash
  15433. something, otherwise you’d be popping an unrelated stash. You also need to be
  15434. confident that all the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;git&lt;/code&gt; commands will succeed. In reality, they can each
  15435. fail in many different ways. For example, maybe your local &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;i18n&lt;/code&gt; branch is
  15436. ahead of your &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;origin&lt;/code&gt;.&lt;/p&gt;
  15437.  
  15438. &lt;p&gt;With the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-e&lt;/code&gt; flag, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bash&lt;/code&gt; &lt;em&gt;will&lt;/em&gt; stop at the first problem, but then you need
  15439. to pick up where you left off and that naive script won’t let you do that.&lt;/p&gt;
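&lt;p&gt;The stash problem mentioned above can be handled by popping only when the script itself stashed something. Here is a minimal sketch, assuming that changes to tracked files are all that matters; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;stash_if_dirty&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;restore_stash&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;did_stash&lt;/code&gt; are hypothetical names and not part of any script in this post:&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Hypothetical helpers, for illustration only.

# Stash only when tracked files have changes, and remember whether we did.
# Untracked files are ignored here because a plain "git stash" skips them too.
stash_if_dirty () {
    if [ -n "$(git status --porcelain --untracked-files=no)" ]; then
        git stash
        did_stash=yes
    else
        did_stash=no
    fi
}

# Pop only if stash_if_dirty actually stashed something earlier.
restore_stash () {
    if [ "$did_stash" = yes ]; then
        git stash pop
    fi
}
```

&lt;p&gt;You would call &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;stash_if_dirty&lt;/code&gt; before checking out another branch and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;restore_stash&lt;/code&gt; after coming back. It does nothing for the other failure modes, which is where the naive script runs out of road.&lt;/p&gt;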
  15440.  
  15441. &lt;h2 id=&quot;the-trying-to-be-smarter-solution&quot;&gt;The (trying-to-be-)smarter solution&lt;/h2&gt;
  15442.  
  15443. &lt;p&gt;A second option then is to write a smarter script: any time there’s ambiguity,
  15444. you check the exit code and keep track of the state of the commands. Something
  15445. like this:&lt;/p&gt;
  15446.  
  15447. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;c&quot;&gt;#!/usr/bin/env bash&lt;/span&gt;
  15448. git stash &lt;span class=&quot;c&quot;&gt;# this should be safe&lt;/span&gt;
  15449. git checkout i18n &lt;span class=&quot;c&quot;&gt;# this one too&lt;/span&gt;
  15450. git pull origin i18n &lt;span class=&quot;c&quot;&gt;# hmm, that could fail for a few reasons...&lt;/span&gt;
  15451.  
  15452. &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$?&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; 0 &lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;then
  15453.    &lt;/span&gt;python setup.py extract_messages
  15454.    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$?&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; 0 &lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;then
  15455.        &lt;/span&gt;smartling push &lt;span class=&quot;c&quot;&gt;# Pushes to our translation SaaS&lt;/span&gt;
  15456.  
  15457.        &lt;span class=&quot;c&quot;&gt;# ...now what?&lt;/span&gt;
  15458.    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$?&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; 0 &lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;then
  15459.        &lt;/span&gt;git add adroll/dotcom/i18n/adroll.pot
  15460.        git commit &lt;span class=&quot;nt&quot;&gt;-m&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;Update POT file&apos;&lt;/span&gt;
  15461.        git push origin i18n
  15462.        git checkout -
  15463.        git stash pop &lt;span class=&quot;c&quot;&gt;# maybe?&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  15464.  
  15465. &lt;p&gt;This gets very hairy very fast. It’s not even clear how that script would be
  15466. smarter, what to do with the exit codes, or how to handle errors
  15467. gracefully. And it still won’t let you pick up where you left off. Plus, that
  15468. covers only one task, and I would need to do the same work for each of the others.&lt;/p&gt;
  15469.  
  15470. &lt;p&gt;Another thing I dislike with the solutions above is that they’re very verbose.
  15471. These tasks should be as unobtrusive as possible in the developers’
  15472. workflow. They’re tasks that need to be done, yes. But they’re secondary and if
  15473. all goes well, I’d rather not see a screenful of output of the various commands.&lt;/p&gt;
  15474.  
  15475. &lt;h1 id=&quot;enter-command_runner&quot;&gt;Enter command_runner&lt;/h1&gt;
  15476.  
  15477. &lt;p&gt;Instead, I wrote &lt;a href=&quot;https://gist.github.com/Timothee/71e653a6a8e18ae17fb7&quot;&gt;a
  15478. script&lt;/a&gt; that gives me the
  15479. following:&lt;/p&gt;
  15480.  
  15481. &lt;ul&gt;
  15482.  &lt;li&gt;an easy way to create new tasks, without having to manage the various exit
  15483. codes of each step&lt;/li&gt;
  15484.  &lt;li&gt;if everything goes well, as little output as necessary&lt;/li&gt;
  15485.  &lt;li&gt;if anything goes wrong, a way to fix the issue and pick up where I left off&lt;/li&gt;
  15486. &lt;/ul&gt;
  15487.  
  15488. &lt;p&gt;It defines a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run_commands&lt;/code&gt; function that takes the name of another function as its argument.
  15489. That other function prints out all the commands to run,
  15490. one per line. E.g.&lt;/p&gt;
  15491.  
  15492. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;foo &lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
  15493.    &lt;span class=&quot;nb&quot;&gt;cat&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;EOL&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;
  15494.    ls
  15495.    ls foo
  15496.    cat foo
  15497. &lt;/span&gt;&lt;span class=&quot;no&quot;&gt;EOL
  15498. &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  15499.  
  15500. &lt;p&gt;Then you’d call that as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run_commands foo&lt;/code&gt;.&lt;/p&gt;
  15501.  
  15502. &lt;p&gt;You can also pass it a step number as a second argument to start at that step
  15503. instead, allowing you to pick up where you left off after an error.&lt;/p&gt;
  15504.  
  15505. &lt;p&gt;E.g. in the above example, the commands would be numbered this way:&lt;/p&gt;
  15506.  
  15507. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;0. &lt;span class=&quot;nb&quot;&gt;ls
  15508. &lt;/span&gt;1. &lt;span class=&quot;nb&quot;&gt;ls &lt;/span&gt;foo
  15509. 2. &lt;span class=&quot;nb&quot;&gt;cat &lt;/span&gt;foo&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  15510.  
  15511. &lt;p&gt;So, calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run_commands foo 2&lt;/code&gt; would only call &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cat foo&lt;/code&gt;.&lt;/p&gt;
  15512.  
  15513. &lt;hr /&gt;
  15514.  
  15515. &lt;p&gt;The meat of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run_commands&lt;/code&gt; can be summed up in two parts:&lt;/p&gt;
  15516.  
  15517. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;    &lt;span class=&quot;k&quot;&gt;while &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;read&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-r&lt;/span&gt; line&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  15518.        &lt;span class=&quot;c&quot;&gt;# do something with $line&lt;/span&gt;
  15519.    &lt;span class=&quot;k&quot;&gt;done&lt;/span&gt; &amp;lt; &amp;lt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$1&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  15520.  
  15521. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;lt;(echo &quot;$($1)&quot;)&lt;/code&gt; makes a call to the function name given to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run_commands&lt;/code&gt;,
  15522. captures its output and feeds it into the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;while&lt;/code&gt; loop. In that loop, each line
  15523. (i.e. command to run) can be handled.&lt;/p&gt;
  15524.  
  15525. &lt;p&gt;The second part is the handling of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;$line&lt;/code&gt; in the loop. The bare-bones version is:&lt;/p&gt;
  15526.  
  15527. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;    &lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-e&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$line&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
  15528.    &lt;span class=&quot;nv&quot;&gt;OUT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$line&lt;/span&gt; 2&amp;gt;&amp;amp;1&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  15529.  
  15530. &lt;p&gt;In other words, it’s printing the command as a string first, then using that
  15531. string in a command substitution to run it.&lt;/p&gt;
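&lt;p&gt;Putting the two parts together, a much-reduced runner might look like the sketch below. It is not the code from the gist: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run_steps&lt;/code&gt; is a hypothetical name, only standard output is captured, and the command list is piped into the loop instead of going through process substitution.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Hypothetical, simplified stand-in for run_commands; for illustration only.
# $1: name of a function that prints one command per line
# $2: optional step number to start from (defaults to 0)
run_steps () {
    local list_fn=$1 start=${2:-0}
    "$list_fn" | {
        step=0
        while read -r line; do
            if [ "$step" -lt "$start" ]; then
                # Steps before the requested start are listed but not run.
                echo "~ $step. $line"
            elif out=$($line); then
                # The string is word-split and run as a command; output stays hidden.
                echo "ok $step. $line"
            else
                echo "FAILED $step. $line"
                echo "Output: $out"
                echo "resume from step $step once fixed"
                exit 1   # ends the pipeline subshell, so run_steps returns 1
            fi
            step=$((step + 1))
        done
    }
}
```

&lt;p&gt;With a list function that prints &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;echo one&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;false&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;echo three&lt;/code&gt;, the sketch stops at step 1 and returns a non-zero status, and calling it again with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;2&lt;/code&gt; as the second argument skips straight to the last command.&lt;/p&gt;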
  15532.  
  15533. &lt;p&gt;This is where readers split into those who think it’s pretty cool
  15534. and those who think I’m a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bash&lt;/code&gt; heretic for using a variable as both a
  15535. string and a command.&lt;/p&gt;
  15536.  
  15537. &lt;p&gt;It &lt;em&gt;can&lt;/em&gt; work but it’s limited and a bit unpredictable because of the way &lt;a href=&quot;http://tldp.org/LDP/Bash-Beginners-Guide/html/sect_03_04.html&quot;&gt;shell
  15538. expansion&lt;/a&gt; works.&lt;/p&gt;
  15539.  
  15540. &lt;p&gt;Take this example:&lt;/p&gt;
  15541.  
  15542. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;foo&quot;&lt;/span&gt;
  15543. foo
  15544. &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;echo &lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\&quot;&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;foo&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\&quot;&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
  15545. &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$a&lt;/span&gt;
  15546. &lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;foo&quot;&lt;/span&gt;
  15547. &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$a&lt;/span&gt;
  15548. &lt;span class=&quot;s2&quot;&gt;&quot;foo&quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  15549.  
  15550. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;echo $a&lt;/code&gt; outputs exactly the first command, i.e. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;echo &quot;foo&quot;&lt;/code&gt;. But running &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;$a&lt;/code&gt;
  15551. and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;echo &quot;foo&quot;&lt;/code&gt; doesn’t give the same result. The shell only removes quotes that
  15552. appear in the command as typed; quotes that come out of a variable expansion are kept as literal characters. In this case, it’s very benign but it
  15553. illustrates the difficulties you can run into.&lt;/p&gt;
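&lt;p&gt;The effect is easier to see with an argument that contains a space. The sketch below compares a command kept in a string with the same command kept in a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bash&lt;/code&gt; array, which is one common way to keep argument boundaries intact. It is not what &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;command_runner&lt;/code&gt; does, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;count_args&lt;/code&gt; is a hypothetical helper:&lt;/p&gt;

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper that reports how many arguments it received.
count_args () { echo "$#"; }

as_string='count_args "two words"'
as_array=(count_args "two words")

$as_string          # prints 2: split on the space, quote characters kept literally
"${as_array[@]}"    # prints 1: the argument boundary is preserved
```

&lt;p&gt;The string version hands the function two arguments, each with a stray quote character attached. That is the same thing that happens to a quoted commit message stored in a list of commands.&lt;/p&gt;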
  15554.  
  15555. &lt;p&gt;For the same reason, some commands won’t work or will need some tweaking or
  15556. further wrapping-in-a-function. E.g. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;git commit -m &quot;This will fail&quot;&lt;/code&gt; should be
  15557. wrapped into something like this:&lt;/p&gt;
  15558.  
  15559. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;commit &lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
  15560.    git commit &lt;span class=&quot;nt&quot;&gt;-m&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;This will work&apos;&lt;/span&gt;
  15561. &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  15562.  
  15563. &lt;p&gt;Then you’d use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;commit&lt;/code&gt; in your list of commands rather than &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;git commit -m
  15564. &quot;foo&quot;&lt;/code&gt;.&lt;/p&gt;
  15565.  
  15566. &lt;hr /&gt;
  15567.  
  15568. &lt;p&gt;Beyond that, there’s more code to handle the exit code, stop if necessary, start
  15569. at a specific step, add nice colors, etc. You can check the full script out in
  15570. &lt;a href=&quot;https://gist.github.com/Timothee/71e653a6a8e18ae17fb7&quot;&gt;this gist&lt;/a&gt;.&lt;/p&gt;
  15571.  
  15572. &lt;h1 id=&quot;here-is-how-to-use-command_runner&quot;&gt;Here is how to use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;command_runner&lt;/code&gt;:&lt;/h1&gt;
  15573.  
  15574. &lt;p&gt;Let’s work through it with a simple example. I don’t use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;git rebase&lt;/code&gt; much and
  15575. instead I “merge forward”:&lt;/p&gt;
  15576.  
  15577. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;git checkout master
  15578. git pull
  15579. git checkout -
  15580. git merge -&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  15581.  
  15582. &lt;p&gt;(of course, you can also just do &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;git merge origin/master&lt;/code&gt; from your feature
  15583. branch but let’s use the long version for this example)&lt;/p&gt;
  15584.  
  15585. &lt;p&gt;1. Take the list of commands and wrap it in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cat&lt;/code&gt;, inside a function:&lt;/p&gt;
  15586.  
  15587. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;mergeforward &lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
  15588.    &lt;span class=&quot;nb&quot;&gt;cat&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;EOL&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;
  15589.    git checkout master
  15590.    git pull
  15591.    git checkout -
  15592.    git merge -
  15593. &lt;/span&gt;&lt;span class=&quot;no&quot;&gt;EOL
  15594. &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  15595.  
  15596. &lt;p&gt;2. You can then wrap the call to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run_commands&lt;/code&gt; into your own script (let’s
  15597. call it &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;git-mfw&lt;/code&gt;) like so:&lt;/p&gt;
  15598.  
  15599. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;c&quot;&gt;#!/usr/bin/env bash&lt;/span&gt;
  15600. &lt;span class=&quot;nb&quot;&gt;source &lt;/span&gt;command_runner
  15601.  
  15602. mergeforward &lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
  15603.    &lt;span class=&quot;nb&quot;&gt;cat&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;EOL&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;
  15604.    git checkout master
  15605.    git pull
  15606.    git checkout -
  15607.    git merge -
  15608. &lt;/span&gt;&lt;span class=&quot;no&quot;&gt;EOL
  15609. &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
  15610.  
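  15611. run_commands mergeforward &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$@&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;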
  15612.  
  15613. &lt;p&gt;3. Call &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;git-mfw list&lt;/code&gt; to list the steps that are going to happen:&lt;/p&gt;
  15614.  
  15615. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;./git-mfw list
  15616.  0.git checkout master
  15617.  1.git pull
  15618.  2.git checkout -
  15619.  3.git merge -&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  15620.  
  15621. &lt;p&gt;4. Imagine you have a local change to some file and run the script:&lt;/p&gt;
  15622.  
  15623. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;./git-mfw
  15624. ✓ 0.git checkout master
  15625. ✗ 1.git pull
  15626.  2.git checkout -
  15627.  3.git merge -
  15628.  
  15629. Command git pull failed with &lt;span class=&quot;nb&quot;&gt;exit &lt;/span&gt;code 1
  15630. Output:
  15631. From github.com:adroll/test-branch
  15632. error: Your &lt;span class=&quot;nb&quot;&gt;local &lt;/span&gt;changes to the following files would be overwritten by merge:
  15633. foo/bar/modified.py
  15634. Please, commit your changes or stash them before you can merge.
  15635. Aborting
  15636. Updating 1df3f18..efa27b&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  15637.  
  15638. &lt;p&gt;5. You would then correct the issue and start where you left off by giving the
  15639. step number to start at:&lt;/p&gt;
  15640.  
  15641. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;git reset &lt;span class=&quot;nt&quot;&gt;--hard&lt;/span&gt;
  15642. &lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;./git-mfw 1
  15643. ~ 0.git checkout master
  15644. ✓ 1.git pull
  15645. ✓ 2.git checkout -
  15646. ✓ 3.git merge -&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  15647.  
  15648. &lt;p&gt;6. It’s trivial to extend your script by adding one function per task you need
  15649. to accomplish. In the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;i18n&lt;/code&gt; example, we ended up with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;push&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pull&lt;/code&gt; and
  15650. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;merge&lt;/code&gt;.&lt;/p&gt;
  15651.  
  15652. &lt;h1 id=&quot;run-commands-run&quot;&gt;Run, commands, run!&lt;/h1&gt;
  15653.  
  15654. &lt;p&gt;While working on this, I came across a lot of webpages that warned against
  15655. wanting to both print a command and execute it. And for good reasons! It’s
  15656. obscure, prone to errors and can potentially harm your system if you’re not
  15657. careful. (To be fair, so can a lot of things that happen in your terminal.)&lt;/p&gt;
  15658.  
  15659. &lt;p&gt;Like anything, it has limitations but overall it’s been incredibly useful! You
  15660. can write a nice-looking script that benefits your whole team very &lt;em&gt;very&lt;/em&gt; fast
  15661. and you’ll be a hero.&lt;/p&gt;
  15662.  
  15663. </description>
  15664.    </item>
  15665.    
  15666.    
  15667.    
  15668.    <item>
  15669.      <title>How We Transparently Track NPS</title>
  15670.      <link>https://tech.nextroll.com/blog/adroll/2015/08/05/how-we-nps.html</link>
  15671.      <pubDate>Wed, 05 Aug 2015 00:00:00 -0700</pubDate>
  15672.      <author></author>
  15673.      <guid isPermaLink="false">https://tech.nextroll.com/blog/adroll/2015/08/05/how-we-nps</guid>
  15674.      <description>&lt;p&gt;Collecting feedback and Net Promoter Scores from AdRoll’s tens of thousands of customers has been a long-outstanding to-do item. At scale, it becomes surprisingly challenging to ensure that feedback from users reaches the Product, Engineering and SMB teams.&lt;/p&gt;
  15675.  
  15676. &lt;p&gt;Last month we turned &lt;a href=&quot;https://satismeter.com/&quot;&gt;Satismeter&lt;/a&gt; on via our &lt;a href=&quot;http://segment.com&quot;&gt;Segment&lt;/a&gt; dashboard. Satismeter injects a prompt at the bottom of our Dashboard asking how likely you are to refer AdRoll to a friend - in order to gather what’s more formally known as a Net Promoter Score or NPS. It looks like this:&lt;/p&gt;
  15677.  
  15678. &lt;p&gt;&lt;img src=&quot;/images/post_images/nps-prmpt.png&quot; alt=&quot;Image&quot; /&gt;&lt;/p&gt;
  15679.  
  15680. &lt;p&gt;Instantly we were deluged with feedback from our customers. The issue was that this feedback was now contained within Satismeter, but I wanted it in front of everyone. We did two cool things:&lt;/p&gt;
  15681.  
  15682. &lt;ul&gt;
  15683.  &lt;li&gt;
  15684.    &lt;p&gt;First, we sent the NPS comments &lt;em&gt;back&lt;/em&gt; to Segment. This ensured that the data was then sent to every product we were using via Segment. For example, Customer Success and SMB teams can now see the comments and scores in &lt;a href=&quot;http://intercom.io&quot;&gt;Intercom&lt;/a&gt; and Zendesk. We can now even ensure that those users receive different emails via &lt;a href=&quot;http://customer.io&quot;&gt;customer.io&lt;/a&gt; and different modals via &lt;a href=&quot;http://appcues.com&quot;&gt;Appcues&lt;/a&gt;. Maybe very satisfied customers are the perfect testing cohort for some of our Growth experiments.&lt;/p&gt;
  15685.  &lt;/li&gt;
  15686.  &lt;li&gt;
  15687.    &lt;p&gt;We also sent this data into our #product &lt;a href=&quot;http://slack.com&quot;&gt;Slack&lt;/a&gt; channel. It is one of our most populated channels, with approximately 120 people in it. The virtues and extraordinary execution of Slack are a post for another day, but it is worth highlighting who is in this channel: everyone in Product, the whole executive team, our VP of SMB, our entire customer support team and dozens of other highly relevant teams.&lt;/p&gt;
  15688.  &lt;/li&gt;
  15689. &lt;/ul&gt;
  15690.  
  15691. &lt;p&gt;&lt;img src=&quot;/images/post_images/nps-comment.png&quot; alt=&quot;Image&quot; /&gt;&lt;/p&gt;
  15692.  
  15693. &lt;p&gt;Short and sweet.&lt;/p&gt;
  15694.  
  15695. &lt;p&gt;It was trivial to set up. We created a Zapier account, configured a web-hook zap, and placed that URL in the Satismeter Web-hooks panel:&lt;/p&gt;
  15696.  
  15697. &lt;p&gt;&lt;img src=&quot;/images/post_images/nps-webhook.png&quot; alt=&quot;Image&quot; /&gt;&lt;/p&gt;
  15698.  
  15699. &lt;p&gt;We only wanted scores &lt;em&gt;with comments&lt;/em&gt; to be sent to Slack, a trivial custom filter:&lt;/p&gt;
  15700.  
  15701. &lt;p&gt;&lt;img src=&quot;/images/post_images/nps-filter.png&quot; alt=&quot;Image&quot; /&gt;&lt;/p&gt;
  15702.  
  15703. &lt;p&gt;And this was the output for Zapier:&lt;/p&gt;
  15704.  
  15705. &lt;p&gt;&lt;img src=&quot;/images/post_images/nps-output.png&quot; alt=&quot;Image&quot; /&gt;&lt;/p&gt;
  15706.  
  15707. &lt;p&gt;A few things to note that I think are cool. If you click the email address, it loads that user in Intercom. This allows you to always see if someone else has reached out to the user in a central place - much much better than just opening in your email client. We also made the avatar an image of Grandpa Simpson, because… The Simpsons.&lt;/p&gt;
  15708.  
  15709. &lt;p&gt;To recap:&lt;/p&gt;
  15710.  
  15711. &lt;ul&gt;
  15712.  &lt;li&gt;We collect NPS and feedback&lt;/li&gt;
  15713.  &lt;li&gt;Most importantly, that feedback is posted to a highly visible Slack channel&lt;/li&gt;
  15714.  &lt;li&gt;We’ve empowered users to reach out and fix things via Intercom&lt;/li&gt;
  15715.  &lt;li&gt;Segment has allowed us to propagate this data everywhere relevant&lt;/li&gt;
  15716.  &lt;li&gt;The Simpsons rules.&lt;/li&gt;
  15717. &lt;/ul&gt;
  15718.  
  15719. </description>
  15720.    </item>
  15721.    
  15722.    
  15723.    
  15724.    <item>
  15725.      <title>How We Built the AdRoll Growth Team</title>
  15726.      <link>https://tech.nextroll.com/blog/adroll/2015/07/27/how-we-built-the-adroll-growth-team.html</link>
  15727.      <pubDate>Mon, 27 Jul 2015 00:00:00 -0700</pubDate>
  15728.      <author></author>
  15729.      <guid isPermaLink="false">https://tech.nextroll.com/blog/adroll/2015/07/27/how-we-built-the-adroll-growth-team</guid>
  15730.      <description>&lt;p&gt;Eight years ago, AdRoll was founded on this principle: to make display advertising work for everyone. As we’ve grown to six offices and over 25,000 advertisers worldwide, our challenge has been to make sure that we continue to identify and respond to the needs of businesses large and small.&lt;/p&gt;
  15731.  
  15732. &lt;p&gt;The Growth team was created to ensure that as AdRoll continues to build out more great products for marketers in more regions, we stay true to our founding philosophies. As we ourselves grow, we want the SMB advertisers that make up the core of our business to grow along with us.&lt;/p&gt;
  15733.  
  15734. &lt;p&gt;&lt;b&gt;Peter&lt;/b&gt; - I grew up in England, studied Computer Science, and then went through Y Combinator. I’m the product manager. &lt;br /&gt;
  15735. &lt;b&gt;Etan&lt;/b&gt; - Formerly a Data Scientist at Boost Media, with a love for both theoretical and applied mathematics. Business Analyst Support for Growth. &lt;br /&gt;
  15736. &lt;b&gt;Anil&lt;/b&gt; - I’m a software engineer that was one of the founding employees at Udemy. I’m Turkish. &lt;br /&gt;
  15737. &lt;b&gt;Matt&lt;/b&gt; - AdRoll acquired my mobile analytics company, before that I worked as a software engineer in the finance industry. I’m Australian. &lt;br /&gt;
  15738. &lt;b&gt;Patrick&lt;/b&gt; - VP of Engineering that has scaled a number of fast growing companies since I moved to the US from Ireland. &lt;br /&gt;&lt;/p&gt;
  15739.  
  15740. &lt;p&gt;Here’s a recap of our first 30 days:&lt;/p&gt;
  15741.  
  15742. &lt;p&gt;&lt;b&gt;Q: How did you build the team?&lt;/b&gt;&lt;br /&gt;
  15743. &lt;b&gt;Patrick:&lt;/b&gt; Our first task was to identify the right people. How do you pull together a team that will be effective, efficient, and enthusiastic in meeting this challenge? One that both cares about this problem, and is empowered to solve it?&lt;/p&gt;
  15744.  
  15745. &lt;p&gt;Everyone in our org cares about our customers but we needed to find individuals that also had a desire and drive to produce new solutions for persistent problems, and who get things done quickly. We needed a team that could experiment, test, implement, reflect, and improve at extreme velocity. Because we wanted this team to move fast while working with many organizations in the company, we wanted individuals who could operate under the philosophy: “It’s easier to ask forgiveness, than it is to get permission”.&lt;/p&gt;
  15746.  
  15747. &lt;p&gt;&lt;b&gt;Q: How would you define the problem this team is solving for?&lt;/b&gt;&lt;br /&gt;
  15748. &lt;b&gt;Peter:&lt;/b&gt; Our overarching themes are to reduce friction during initial campaign setup, provide ongoing support, and increase lifetime value. Small to medium size businesses (SMBs) don’t have a lot of leeway for an account that is not set up correctly. In other words, the negative impact of a campaign that is set up sub-optimally can be very significant for an SMB advertiser. We want to make sure that small businesses get the most out of the platform by utilizing the best of what AdRoll has to offer, in the best ways it can offer.&lt;/p&gt;
  15749.  
  15750. &lt;p&gt;&lt;b&gt;Q: What are the KPIs, and what are the things we want to focus on? What does a successful result look like?&lt;/b&gt;&lt;br /&gt;
  15751. &lt;b&gt;Etan:&lt;/b&gt; Our first step to start answering these questions was to go through the exercise of identifying the signs and signals that make a successful customer. We sat down to identify the customers within the SMB segment that were ‘successful’ vs. ones that were not, then had to describe what exactly that success looks like.&lt;/p&gt;
  15752.  
  15753. &lt;p&gt;For example, we found that customers that were using more than one product, had multiple ads uploaded, and had mobile ad sizes saw stronger campaign performance and lower churn. Having asked the question, “What does a successful customer look like?”, we could better measure whether our efforts were being effective down the line, according to these standards.&lt;/p&gt;
  15754.  
  15755. &lt;p&gt;&lt;b&gt;Q: How did you prioritize the different needs to address? How did you decide where to start?&lt;/b&gt;&lt;br /&gt;
  15756. &lt;b&gt;Etan:&lt;/b&gt; Once we had characterized our picture of the successful SMB customer, our next step was to identify the pain points within the product offering that could be impeding advertisers from getting started or achieving outstanding results. These were anything that might be causing friction or problems for clients. For example, the early conversation of measuring return on investment for the user is dependent on specific tracking having been attributed by the customer. Without the ability to understand the value of the AdRoll service, this naturally led to churn and unhappiness.&lt;/p&gt;
  15757.  
  15758. &lt;p&gt;Having identified these pain points, we then tried to map them against retention. Essentially, we plotted a 1:1 relationship between negative events and churn, and used this to direct our product roadmap as an immediate way to prioritize different pain points.&lt;/p&gt;
  15759.  
  15760. &lt;p&gt;&lt;b&gt;Q: How did you go about this?&lt;/b&gt;&lt;br /&gt;
  15761. &lt;b&gt;Anil:&lt;/b&gt; We started by imagining what it would be like to land on the AdRoll dashboard as a first-time advertiser. In doing this, there were a number of things that came to our attention that could be potentially significant challenges for someone who has never set up an advertising campaign. There were certain things that could be made more intuitive, and a lot of opportunity to offer additional guidance and help small business owners set up for success.&lt;/p&gt;
  15762.  
  15763. &lt;p&gt;We worked closely with our SMB team to learn about the most common problems they encounter. Our strategy was data plus conversation - collecting information both quantitatively, and through our own experience in trying to navigate the platform.&lt;/p&gt;
  15764.  
  15765. &lt;p&gt;&lt;b&gt;Q: In your opinion, what is growth trying to do?&lt;/b&gt;&lt;br /&gt;
  15766. &lt;b&gt;Peter:&lt;/b&gt; We want to increase the threshold for which we give dedicated service. Our hope is to reach an ideal state of frictionless campaign setup flow - one that allows people to set up and use the product themselves, to its full potential.&lt;/p&gt;
  15767.  
  15768. &lt;p&gt;&lt;b&gt;Matt:&lt;/b&gt; People equate early-stage startups with their founding philosophies, and what the company is trying to build. Generally, as a company grows in size, it becomes harder to ingrain these philosophies into every new employee, and there’s a lot of day-to-day work that is done without these founding principles in mind.&lt;/p&gt;
  15769.  
  15770. &lt;p&gt;For us, we’re trying to ensure that everything we do is, in fact, aligned with this goal of what we want to build. Velocity is key - we want to address the challenges that advertisers are coming up against, now. The customer success signals we identified help keep us on track and recognize when we’ve gone awry; we use them to adjust and react.&lt;/p&gt;
  15771.  
  15772. &lt;p&gt;&lt;b&gt;Q: How does this fit back into meeting the needs of small businesses?&lt;/b&gt;&lt;br /&gt;
  15773. &lt;b&gt;Peter:&lt;/b&gt; There are a lot of small improvements we can make that can drive significant results for SMB advertisers. For these businesses, a small improvement in campaign setup can mean huge business impact as far as % gains. For small businesses that are just getting started and trying to reach and retain new customers, companies like AdRoll can actually have a significant impact on that business’s success, if utilized correctly.&lt;/p&gt;
  15774.  
  15775. &lt;p&gt;&lt;b&gt;Q: What makes this team unique? What makes it work?&lt;/b&gt;&lt;br /&gt;
  15776. &lt;b&gt;Peter:&lt;/b&gt; We’re operating from a different perspective than other teams in the organization. Rather than focusing on a single product, we’re focused on a single problem. So, we’re able to see the larger picture and push out more updates, more quickly, to solve for this problem. There’s also more latitude to adjust our strategy and what we’re working on, to make sure we’re in line with our vision.&lt;/p&gt;
  15777.  
  15778. &lt;p&gt;&lt;b&gt;What does success for this team look like 1 year from now?&lt;/b&gt;&lt;br /&gt;
  15779. One of the nice things about Growth is that it is so data driven. In fact, if you can’t track something then a Growth team should not really do it in the first place. So when we look at Growth 1 year from now, we’ll be comparing how users are signing up, activating, and sticking around compared with today. One of the big metrics that we track each month is a simple one: how many people that sign up go on to spend? In my opinion this metric alone speaks for the health of a Growth team. It conveniently covers all touch points of onboarding, which is really a key part of Growth.&lt;/p&gt;
  15780.  
  15781. &lt;p&gt;Given how international the team is, I think 3 green cards would be appreciated too. Another easy metric to track.&lt;/p&gt;
  15782. </description>
  15783.    </item>
  15784.    
  15785.    
  15786.    
  15787.    <item>
  15788.      <title>Streaming Petabytes of Data in Realtime with Kinesis</title>
  15789.      <link>https://tech.nextroll.com/blog/data/2015/06/26/kinesis.html</link>
  15790.      <pubDate>Fri, 26 Jun 2015 00:00:00 -0700</pubDate>
  15791.      <author></author>
  15792.      <guid isPermaLink="false">https://tech.nextroll.com/blog/data/2015/06/26/kinesis</guid>
  15793.      <description>&lt;p&gt;AdRoll’s real-time data pipeline drives many systems crucial to our business. These systems inform the decisions made by our real-time bidding infrastructure in response to the tens of billions of daily requests we receive from ad exchanges. They also refresh our predictive models, guard against overspend, and deliver up to the minute campaign metrics through our dashboard. To function properly these systems need to maintain 100% uptime while inhaling about 1.3 petabytes of data per month across 5 regions worldwide.&lt;/p&gt;
  15794.  
  15795. &lt;p&gt;We’ve found that improving real-time data recency directly correlates with improved performance and is an enabler for future features and products. As such, we recently found ourselves motivated to overhaul our real-time pipeline in an effort to reduce end to end latency as much as possible. By leveraging Amazon’s scalable data streaming service &lt;a href=&quot;http://aws.amazon.com/kinesis/&quot;&gt;Kinesis&lt;/a&gt; we were able to drop our real-time pipeline’s processing latency from 10 minutes to just under 3 seconds (while simultaneously cutting costs and improving system stability!).&lt;/p&gt;
  15796.  
  15797. &lt;p&gt;Our pipeline involves feeding event logs generated by our real-time bidders and ad-servers into a set of &lt;a href=&quot;https://storm.apache.org/&quot;&gt;Storm&lt;/a&gt; topologies and a few other proprietary applications which do our real-time processing. Originally, we ingested data by reading log files rotated to S3 every few minutes from each bidder and ad-server. But this mode of ingestion caused a tension between S3 performance and data processing latency - flushing logs more often improves recency but yields a larger number of smaller log files, which degrades list and read performance. We found that in practice we were bounded by a 10 minute average processing delay with this setup.&lt;/p&gt;
  15798.  
  15799. &lt;p&gt;&lt;img src=&quot;/images/post_images/kinesis_architecture_1.png&quot; alt=&quot;Old Pipeline&quot; /&gt;&lt;/p&gt;
  15800.  
  15801. &lt;p&gt;To break the 10 minute recency barrier we needed a continuous data streaming service. But not just any off the shelf solution would suffice. Whatever we chose would have to be able to reliably handle our &lt;a href=&quot;http://tech.adroll.com/blog/adtech/2014/11/24/adroll-puts-the-p-in-big-data.html&quot;&gt;volume&lt;/a&gt; while providing redundancy/fault tolerance. Additionally, we would need this service deployed near each of our geographically distributed data centers. The motivation here is to minimize data loss in the event of a machine failure – we wouldn’t be able to offload logs as frequently or as reliably with transcontinental network calls involved. Presumably, once in our streaming service, data will be more durable, so our real-time processing applications (which live in a single region) would be better suited to stream logs across oceans.&lt;/p&gt;
  15802.  
  15803. &lt;p&gt;&lt;img src=&quot;/images/post_images/kinesis_architecture_2.png&quot; alt=&quot;New Kinesis&quot; /&gt;&lt;/p&gt;
  15804.  
  15805. &lt;p&gt;Initially we explored Kafka but we found it difficult to configure properly and it fell over every time we tried to scale. We kept running into issues with tasks like setting up cross datacenter mirroring and allocating the appropriate amount of disk space per topic. Eventually we started looking for an easier to manage solution.&lt;/p&gt;
  15806.  
  15807. &lt;p&gt;Enter Kinesis. Very similar to Kafka, Kinesis is a partitioned streaming service. In Kinesis, data producers push small data blobs called records into streams, which are broken into partitions called shards. Kinesis has built-in redundancy and is arbitrarily scalable by design, so it can handle our volume. And because it’s an AWS service and all of our infrastructure already lives on AWS, we get streaming local to the regions where our log producers reside for free. Also, it’s a managed service, so we don’t have to worry about configuring and tuning, and we won’t be losing any sleep from ops work!&lt;/p&gt;
  15808.  
  15809. &lt;p&gt;We were able to get up and running on Kinesis fairly quickly with some plug and play libraries. On our log producers, which are written in Erlang, we used &lt;a href=&quot;https://github.com/AdRoll/kinetic&quot;&gt;Kinetic&lt;/a&gt;, an open source Erlang Kinesis client developed here at AdRoll. Kinetic runs on a logging process which lives in each of our real-time bidders and ad-servers and handles about 5,000 logs per second (per machine). We designed this process to flush data in small batches which are formed by concatenating logs until either Kinesis’s 1MB record size limit is reached or 1 second has passed since the last flush. This allows us to minimize put record requests which are expensive due to network calls …and the fact that we’re literally &lt;a href=&quot;http://aws.amazon.com/kinesis/pricing/&quot;&gt;billed by Amazon for them&lt;/a&gt;.&lt;/p&gt;
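&lt;p&gt;The size-or-time flush rule described above can be sketched as follows. This is a minimal illustration in Java rather than the Erlang used by the real producers; the class &lt;code&gt;LogBatcher&lt;/code&gt;, its method names, and the in-memory list standing in for the put record call are all hypothetical, and the clock is passed in explicitly so the behavior is deterministic.&lt;/p&gt;

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical batcher: concatenates logs into one record and flushes when the
// size cap would be exceeded or the flush interval has elapsed.
final class LogBatcher {
    private final int maxBytes;        // record size cap, e.g. 1 MB
    private final long maxAgeMillis;   // flush interval, e.g. 1000 ms
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    // Stands in for the network call that would put the record on the stream.
    private final List<byte[]> flushed = new ArrayList<>();
    private long lastFlushMillis;

    LogBatcher(int maxBytes, long maxAgeMillis, long nowMillis) {
        this.maxBytes = maxBytes;
        this.maxAgeMillis = maxAgeMillis;
        this.lastFlushMillis = nowMillis;
    }

    // Adds one log. A log larger than maxBytes is not handled by this sketch.
    void add(byte[] log, long nowMillis) {
        // Flush first if appending this log would push the batch past the cap.
        if (buffer.size() > 0 && buffer.size() + log.length > maxBytes) {
            flush(nowMillis);
        }
        buffer.write(log, 0, log.length);
        // Flush if the interval since the last flush has elapsed.
        if (nowMillis - lastFlushMillis >= maxAgeMillis) {
            flush(nowMillis);
        }
    }

    void flush(long nowMillis) {
        if (buffer.size() > 0) {
            flushed.add(buffer.toByteArray());
            buffer.reset();
        }
        lastFlushMillis = nowMillis;
    }

    List<byte[]> flushedRecords() {
        return flushed;
    }
}
```

&lt;p&gt;Note that this sketch only checks the clock when a log arrives; a real logging process would presumably also flush from a timer so that a quiet period does not hold data back.&lt;/p&gt;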
  15810.  
  15811. &lt;p&gt;On the Kinesis consumer side, we connected our Storm topologies to Kinesis by forking Amazon’s &lt;a href=&quot;https://github.com/awslabs/kinesis-storm-spout&quot;&gt;Kinesis storm spout&lt;/a&gt;. For those unfamiliar with Storm, spouts are components which ingest data from outside sources and send it to other Storm components to be processed. Out of the box we found the spout streamed data too slowly because polling from Kinesis was tightly coupled with Storm’s requests for the spout to emit data. This is shown in the (simplified) snippet of code from Amazon’s library below:&lt;/p&gt;
  15812.  
  15813. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot; data-lang=&quot;java&quot;&gt;&lt;span class=&quot;c1&quot;&gt;// poll records for a Kinesis stream shard&lt;/span&gt;
  15814. &lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;KinesisShardGetter&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
  15815. &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;
  15816.  &lt;span class=&quot;nc&quot;&gt;AmazonKinesisClient&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kinesisClient&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
  15817. &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;
  15818.  &lt;span class=&quot;c1&quot;&gt;// read and emit Kinesis records&lt;/span&gt;
  15819.  &lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;Record&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;getNext&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;numRecords&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
  15820.    &lt;span class=&quot;nc&quot;&gt;GetRecordsRequest&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;request&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;GetRecordsRequest&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
  15821.    &lt;span class=&quot;n&quot;&gt;request&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;setLimit&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;numRecords&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
  15822.    &lt;span class=&quot;nc&quot;&gt;GetRecordsResult&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kinesisClient&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getRecords&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;request&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
  15823.    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getRecords&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
  15824.  &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
  15825. &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;
  15826. &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  15827.  
  15828. &lt;p&gt;In the vanilla library, Storm requests data from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;KinesisShardGetter&lt;/code&gt; objects via &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;getNext(...)&lt;/code&gt; and each of these requests kicks off a slow, blocking network call to Kinesis made by the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kinesisClient&lt;/code&gt;. This doesn’t scale well and a single spout will quickly reach its network bounded throughput cap even when consuming from a small number of shards. To make the spout more performant we wrote our own &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;KinesisShardGetter&lt;/code&gt; shown (simplified) below.&lt;/p&gt;
  15829.  
  15830. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot; data-lang=&quot;java&quot;&gt;&lt;span class=&quot;c1&quot;&gt;// poll and buffer records for a Kinesis stream shard&lt;/span&gt;
  15831. &lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;KinesisAsyncShardGetter&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;KinesisShardGetter&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
  15832. &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;
  15833.  &lt;span class=&quot;nc&quot;&gt;AmazonKinesisAsyncClient&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kinesisClient&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
  15834.  &lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ConcurrentLinkedQueue&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;Record&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;recordsQueue&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
  15835. &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;
  15836.  &lt;span class=&quot;c1&quot;&gt;// read Kinesis records&lt;/span&gt;
  15837.  &lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;readRecords&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
  15838.    &lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;GetRecordsRequest&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;request&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;GetRecordsRequest&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
  15839.    &lt;span class=&quot;n&quot;&gt;kinesisClient&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getRecordsAsync&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;request&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;AsyncHandler&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;GetRecordsRequest&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;GetRecordsResult&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
  15840.  
  15841.      &lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
  15842.      &lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;onSuccess&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;GetRecordsRequest&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;request&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;GetRecordsResult&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
  15843.        &lt;span class=&quot;n&quot;&gt;recordsQueue&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;addAll&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getRecords&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;
  15844.        &lt;span class=&quot;n&quot;&gt;readRecords&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
  15845.      &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
  15846.      &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;
  15847.    &lt;span class=&quot;o&quot;&gt;});&lt;/span&gt;
  15848.  &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
  15849. &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;
  15850.  &lt;span class=&quot;c1&quot;&gt;// emit Kinesis records&lt;/span&gt;
  15851.  &lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
  15852.  &lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;Record&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;getNext&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;numRecords&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
  15853.    &lt;span class=&quot;nc&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;Record&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;records&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ArrayList&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;();&lt;/span&gt;
  15854.    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;numRecords&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;++)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
  15855.      &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;recordsQueue&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;isEmpty&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
  15856.        &lt;span class=&quot;n&quot;&gt;records&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;recordsQueue&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;poll&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;
  15857.      &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;break&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
  15858.    &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
  15859.    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;records&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
  15860.  &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
  15861. &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;
  15862. &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  15863.  
  15864. &lt;p&gt;To avoid a blocking Kinesis network call with each &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;getNext(...)&lt;/code&gt; we asynchronously poll and buffer data from Kinesis in a background thread via &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;readRecords()&lt;/code&gt;. This way we need only &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;recordsQueue.poll()&lt;/code&gt; to emit data into Storm from memory. This allowed us to increase the spout’s throughput to the point where we were bounded by data preprocessing instead of network latency (which was around a 10x speed boost for us).&lt;/p&gt;
  15865.  
  15866. &lt;p&gt;The final deployment hurdle we had to clear was setting up some Kinesis stream management tooling/automation – namely we wanted to be able to quickly and easily create or scale all of our streams across all our AWS regions. We tried using an existing &lt;a href=&quot;https://github.com/awslabs/amazon-kinesis-scaling-utils&quot;&gt;AWS labs tool&lt;/a&gt; but it seemed that its requests would rarely succeed due to poor error handling. We ended up just writing our own.&lt;/p&gt;
  15867.  
  15868. &lt;p&gt;Programmatically creating streams was a fairly straightforward process but very slow (it takes around 30 minutes to set up all our streams in serial). We made sure to thread create requests such that we were hitting the limit of 5 simultaneous Kinesis stream operations per account.&lt;/p&gt;
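&lt;p&gt;One way to keep a fixed number of create requests in flight is a fixed-size thread pool. The sketch below is an assumption about how such tooling could look, not our actual tool: &lt;code&gt;StreamCreator&lt;/code&gt; and the &lt;code&gt;createStream&lt;/code&gt; callback are hypothetical names, and the callback stands in for whatever client call actually creates a stream.&lt;/p&gt;

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical helper: runs a create call per stream name with a bounded
// number of calls in flight at any time.
final class StreamCreator {
    static void createAll(List<String> streamNames, int maxInFlight, Consumer<String> createStream) {
        // A fixed pool caps concurrency at maxInFlight (e.g. 5, the per-account limit).
        ExecutorService pool = Executors.newFixedThreadPool(maxInFlight);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (String name : streamNames) {
                pending.add(pool.submit(() -> createStream.accept(name)));
            }
            // Wait for every request and surface the first failure, if any.
            for (Future<?> f : pending) {
                try {
                    f.get();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(e);
                } catch (ExecutionException e) {
                    throw new IllegalStateException(e.getCause());
                }
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

&lt;p&gt;Called with a pool size of 5, this keeps the account-level limit saturated without exceeding it; retries and backoff on throttling errors are left out of the sketch.&lt;/p&gt;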
  15869.  
  15870. &lt;p&gt;Scaling streams was a bit of a headache because the current API makes it a very manual process. For some context, each Kinesis stream has a hash key space which is broken into sections, one for each shard. Each put request is given a random value in the stream’s hash key space to determine which shard the request will be sent to. Typically all shards own an equal slice of the hash key space so requests will be distributed evenly across all shards.&lt;/p&gt;
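&lt;p&gt;To make the idea of equal slices concrete, here is a small sketch that divides an inclusive key range evenly among a given number of shards. Kinesis hash keys are 128-bit integers (0 through 2^128 - 1), which is why the sketch uses &lt;code&gt;BigInteger&lt;/code&gt;; the class and method names are hypothetical.&lt;/p&gt;

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for computing evenly sized hash key ranges.
final class HashKeySlices {
    // Splits the inclusive range [lo, hi] into `shards` contiguous slices of
    // (near) equal size and returns each slice as {start, end}, both inclusive.
    static List<BigInteger[]> evenSlices(BigInteger lo, BigInteger hi, int shards) {
        BigInteger size = hi.subtract(lo).add(BigInteger.ONE);
        BigInteger n = BigInteger.valueOf(shards);
        List<BigInteger[]> slices = new ArrayList<>();
        for (int i = 0; i < shards; i++) {
            BigInteger start = lo.add(size.multiply(BigInteger.valueOf(i)).divide(n));
            BigInteger end = lo.add(size.multiply(BigInteger.valueOf(i + 1)).divide(n))
                    .subtract(BigInteger.ONE);
            slices.add(new BigInteger[] {start, end});
        }
        return slices;
    }
}
```

&lt;p&gt;For a toy key space of 1 through 30 and three shards this yields the slices 1 to 10, 11 to 20, and 21 to 30.&lt;/p&gt;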
  15871.  
  15872. &lt;p&gt;Ideally streams would be scaled by a single API call specifying a stream and a target number of shards, and Amazon would handle splitting the hash key space into new, evenly sized sections. However, in the current API you are required to split and merge sections of the hash key space manually, each split and merge operation being a separate request. Moreover, you need to specify the actual point in the hash key space where you want to draw new split boundaries. Properly adding a single shard to a stream with two shards takes three distinct requests! For example, given a stream with two shards owning sections &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[1,15], [16,30]&lt;/code&gt; of hash key space &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[1,30]&lt;/code&gt;, to add a new shard – yielding sections &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[1,10], [11,20], [21,30]&lt;/code&gt; – you would need to:&lt;/p&gt;
  15873.  
  15874. &lt;blockquote&gt;
  15875.  &lt;ol&gt;
  15876.    &lt;li&gt;Split shard &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[1,15]&lt;/code&gt; at 10, into shards &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[1,10]&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[11, 15]&lt;/code&gt;&lt;/li&gt;
  15877.    &lt;li&gt;Split shard &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[16,30]&lt;/code&gt; at 20, into shards &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[16,20]&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[21, 30]&lt;/code&gt;&lt;/li&gt;
  15878.    &lt;li&gt;Merge shards &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[11,15]&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[16,20]&lt;/code&gt; into new shard &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[11,20]&lt;/code&gt;&lt;/li&gt;
  15879.  &lt;/ol&gt;
  15880. &lt;/blockquote&gt;
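&lt;p&gt;The three requests above can be derived mechanically. The following is a minimal sketch, not the code we run: it works on the toy inclusive ranges from the example rather than the real Kinesis hash key space, and the function names and tuple layout are made up for illustration. It only plans the operations; it does not call any AWS API.&lt;/p&gt;

```python
def even_sections(lo, hi, n):
    """Cut the inclusive key range [lo, hi] into n contiguous, near-equal sections."""
    size = hi - lo + 1
    cuts = [lo + (size * i) // n for i in range(n + 1)]
    return [(cuts[i], cuts[i + 1] - 1) for i in range(n)]


def plan_resharding(current, target):
    """List the split and merge requests that turn `current` sections into `target`.

    Both arguments are ordered lists of inclusive (start, end) tuples
    covering the same key space.
    """
    ops = []
    target_ends = {end for _, end in target}

    # Split pass: cut each current section at every target boundary inside it.
    fragments = []
    for start, end in current:
        for cut in sorted(e for e in target_ends if e in range(start, end)):
            ops.append(("split", (start, end), cut))
            fragments.append((start, cut))
            start = cut + 1
        fragments.append((start, end))

    # Merge pass: join adjacent fragments until each one ends on a target boundary.
    pending = None
    for fragment in fragments:
        if pending is None:
            pending = fragment
        else:
            ops.append(("merge", pending, fragment))
            pending = (pending[0], fragment[1])
        if pending[1] in target_ends:
            pending = None
    return ops


if __name__ == "__main__":
    two = even_sections(1, 30, 2)
    three = even_sections(1, 30, 3)
    for op in plan_resharding(two, three):
        print(op)
```

&lt;p&gt;For the example above the plan contains two splits and one merge. Going from two to four shards over a key space such as [1,32] needs only two splits and no merges, because every new boundary falls inside an existing section.&lt;/p&gt;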
  15881.  
  15882. &lt;p&gt;…I suggest avoiding scaling between relatively prime numbers of shards (stick to powers of two).&lt;/p&gt;
  15883.  
  15884. &lt;p&gt;Annoyingly, split and merge requests can’t run at the same time on the same stream, even when they involve unrelated shards. So each operation must be performed serially, and scaling a stream is very slow. Again, we made sure to thread our scaling logic to operate on multiple streams simultaneously in the event that we needed to quickly respond to a global traffic increase.&lt;/p&gt;
  15885.  
  15886. &lt;p&gt;Eventually we got all of the migration details sorted out, and we hit the switch to turn on Kinesis. We immediately saw our event processing latency drop from 10 minutes to just under 3 seconds. Also, continuously streaming data smoothed out a few previously bursty workloads (can you spot the Kinesis pipeline release in the graph below of our DynamoDB writes?). This improved system stability and allowed us to cut our costs nearly in half for managed services that bill based on provisioned throughput, such as DynamoDB. These savings actually more than paid for the cost of using Kinesis!&lt;/p&gt;
  15887.  
  15888. &lt;p&gt;&lt;img src=&quot;/images/post_images/kinesis_dynamo.png&quot; alt=&quot;New Kinesis&quot; /&gt;&lt;/p&gt;
  15889.  
  15890. &lt;p&gt;We ran into one fairly painful Kinesis service constraint, though. The scale of each shard in Kinesis is based on throughput: each shard provides a maximum write capacity of 1MB/s and a maximum read capacity of 2MB/s. Since every consuming application reads the full stream, this means that unless you overprovision or use multiple layers of streams you are limited to having two distinct real-time applications consuming from a Kinesis stream at any given time. Because we could potentially have around 10 real-time applications trying to consume from the same Kinesis stream, we decided not to try to expose our real-time data directly from Kinesis.&lt;/p&gt;
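&lt;p&gt;The arithmetic behind that limit can be sketched as follows. The per-shard limits are the ones quoted above; the function names and the 20MB/s ingest rate are hypothetical, picked only to make the numbers concrete.&lt;/p&gt;

```python
import math

# Per-shard limits quoted in the post.
WRITE_MB_PER_SEC = 1
READ_MB_PER_SEC = 2


def max_full_rate_consumers():
    """Consumers that can each read everything written to a fully loaded shard."""
    return READ_MB_PER_SEC // WRITE_MB_PER_SEC


def shards_needed(ingest_mb_per_sec, consumers):
    """Shards required so writes fit and every consumer can read the whole stream."""
    for_writes = math.ceil(ingest_mb_per_sec / WRITE_MB_PER_SEC)
    for_reads = math.ceil(ingest_mb_per_sec * consumers / READ_MB_PER_SEC)
    return max(for_writes, for_reads)


if __name__ == "__main__":
    print(max_full_rate_consumers())
    # Hypothetical 20MB/s ingest rate, used only for illustration.
    print(shards_needed(20, consumers=2))
    print(shards_needed(20, consumers=10))
```

&lt;p&gt;With ten consumers the read limit dominates: the sketch asks for five times as many shards as the writes alone would need, which is the kind of overprovisioning we wanted to avoid.&lt;/p&gt;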
  15891.  
  15892. &lt;p&gt;In the past, S3 had worked well for serving data to many concurrent consumers, but it had become a performance bottleneck because we were writing out a separate log file for each of our 500+ log producers. Once we had Kinesis, however, we were able to write an application which aggregates logs from all of our data producers in a single location – allowing us to flush all logs to S3 in a single file. By doing this we can flush much more frequently and therefore expose much fresher data to all of our real-time applications without sacrificing S3 performance. With this Kinesis-to-S3 architecture we were able to achieve an end-to-end latency of about 30 seconds for all real-time applications. Also, the aggregator application only takes up one Kinesis consumer application slot, so we were able to have one other application which was particularly sensitive to recency connect directly to Kinesis and enjoy the aforementioned 3 second latency.&lt;/p&gt;
  15893.  
  15894. &lt;p&gt;&lt;img src=&quot;/images/post_images/kinesis_architecture_3.png&quot; alt=&quot;Old Pipeline&quot; /&gt;&lt;/p&gt;
  15895.  
  15896. &lt;p&gt;Overhauling our data pipeline with Kinesis ended up being very fruitful and relatively painless. We were able to expose massive recency gains to our entire real-time data pipeline while at the same time improving system stability and cutting costs.&lt;/p&gt;
  15897.  
  15898. &lt;p&gt;At AdRoll we’re looking for ways to improve our backend systems. If you have some ideas or have been itching to work with huge amounts of data, &lt;a href=&quot;https://www.adroll.com/about/careers/open-positions&quot;&gt;let us know&lt;/a&gt;!&lt;/p&gt;
  15899. </description>
  15900.    </item>
  15901.    
  15902.    
  15903.    
  15904.    <item>
  15905.      <title>Tech Talks: Lee Byron on Immutable.js</title>
  15906.      <link>https://tech.nextroll.com/blog/adroll/2015/05/13/tech-talk-immutable-lee-byron.html</link>
  15907.      <pubDate>Wed, 13 May 2015 00:00:00 -0700</pubDate>
  15908.      <author></author>
  15909.      <guid isPermaLink="false">https://tech.nextroll.com/blog/adroll/2015/05/13/tech-talk-immutable-lee-byron</guid>
  15910.      <description>&lt;p&gt;Since June 2014, the AdRoll Engineering team has been hosting Tech Talks every
  15911. other Tuesday – give or take – where members of our team talk about a
  15912. technology, project, or tool they’re excited about. We’ve been lucky enough to
  15913. have a number of developers from within the larger SF / Bay Area community stop
  15914. by our offices for these Tuesday “Lunch and Learn” hours to share new and
  15915. interesting projects they’re pursuing, and now we’d like to share them with you!&lt;/p&gt;
  15916.  
  15917. &lt;p&gt;Most recently, Lee Byron, author of Immutable.js and other great things at
  15918. Facebook, joined us for a talk on Immutable.js, a library of persistent
  15919. immutable data structures that make Flux and React.js development even more
  15920. awesome. Lee Byron has been making things at Facebook since 2008 including
  15921. React, Immutable.js, and GraphQL.&lt;/p&gt;
  15922.  
  15923. &lt;p&gt;Check out the full video below:&lt;/p&gt;
  15924.  
  15925. &lt;iframe width=&quot;500&quot; height=&quot;281&quot; src=&quot;https://www.youtube.com/embed/kbnUIhsX2ds?feature=oembed&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;
  15926.  
  15927. &lt;p&gt;&lt;strong&gt;Interested in learning more about life in AdRoll Engineering? Check out our &lt;a href=&quot;https://www.adroll.com/about/careers&quot;&gt;Careers page&lt;/a&gt;!&lt;/strong&gt;&lt;/p&gt;
  15928. </description>
  15929.    </item>
  15930.    
  15931.    
  15932.    
  15933.    <item>
  15934.      <title>AWS Summit Keynote in San Francisco</title>
  15935.      <link>https://tech.nextroll.com/blog/adroll/2015/05/04/aws-summit-keynote.html</link>
  15936.      <pubDate>Mon, 04 May 2015 00:00:00 -0700</pubDate>
  15937.      <author></author>
  15938.      <guid isPermaLink="false">https://tech.nextroll.com/blog/adroll/2015/05/04/aws-summit-keynote</guid>
  15939.      <description>&lt;div class=&quot;youtube-wrapper&quot;&gt;&lt;iframe class=&quot;youtube-embed&quot; width=&quot;574&quot; height=&quot;323&quot; src=&quot;http://www.youtube.com/embed/JRODD1_jBww?start=2894&amp;amp;end=3386&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;&lt;/div&gt;
  15940.  
  15941. &lt;p&gt;AdRoll is building products that allow customers of any size, big or
  15942. small, with a lot of marketing experience or none, to run high
  15943. performing marketing campaigns. And with over 20,000 customers in over
  15944. 150 countries, we’ve been fairly successful in this.&lt;/p&gt;
  15945.  
  15946. &lt;p&gt;Our main product has been Retargeting. The basic concept is very simple:
  15947. when one of your customers leaves your site without buying, we enable
  15948. you to reach them with the most appropriate messaging while they browse
  15949. other sites or use their phone. This is great because it’s easily
  15950. measurable and it performs as well as search.&lt;/p&gt;
  15951.  
  15952. &lt;p&gt;The part that is not easy in Retargeting is the complexity of the
  15953. infrastructure that lies behind it, which is called Real Time Bidding or
  15954. RTB. The basic requirement in RTB is that we are latency bound. When we
  15955. talk about real time we mean it in the literal sense of a classic firm
  15956. real time system. Our machines receive about 60B requests each day and
  15957. they need to respond to each one of them in less than 100ms. 1% of
  15958. errors with 60B requests means 600M errors and that’s too much
  15959. opportunity cost.&lt;/p&gt;
  15960.  
  15961. &lt;p&gt;A 100ms maximum latency is a very challenging problem globally, though. We have
  15962. customers from over 150 countries and light in fiber travels at about 120k miles
  15963. per second. That’s a real problem because this ray of light actually
  15964. takes about 60ms to make the round trip between New York and Paris. That’s the
  15965. majority of the time we have available to bid on a single impression.&lt;/p&gt;
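&lt;p&gt;As a back-of-the-envelope check of that figure, here is a small sketch. The fiber speed and the 100ms deadline are the numbers quoted above; the New York to Paris distance is an assumed approximate great-circle value, and real fiber routes are longer.&lt;/p&gt;

```python
FIBER_MILES_PER_SEC = 120_000  # speed of light in fiber, as quoted above
NY_PARIS_MILES = 3_625         # assumed approximate great-circle distance
BID_DEADLINE_MS = 100          # response deadline quoted above


def round_trip_ms(distance_miles, speed=FIBER_MILES_PER_SEC):
    """Best-case round-trip time in milliseconds over a straight fiber path."""
    return 2 * distance_miles / speed * 1000


if __name__ == "__main__":
    rtt = round_trip_ms(NY_PARIS_MILES)
    print(f"round trip: {rtt:.1f} ms")
    print(f"left for bidding: {BID_DEADLINE_MS - rtt:.1f} ms")
```

&lt;p&gt;Under these assumptions the network alone uses roughly 60ms, leaving about 40ms to evaluate and answer a bid request.&lt;/p&gt;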
  15966.  
  15967. &lt;p&gt;If you want to operate in RTB, you need global presence from the
  15968. beginning. It would have been extremely capital intensive for us to
  15969. sign data center and bandwidth contracts, hire engineers and create
  15970. entities in each of the many regions of the world where we needed
  15971. a presence. With our infrastructure in Amazon AWS we never
  15972. have to worry about that. We just opened our business in Japan and it
  15973. only takes us a few minutes to bootstrap our infrastructure and start
  15974. serving that region of the world with the required low latency.&lt;/p&gt;
  15975.  
  15976. &lt;p&gt;But we can’t just have our machines across the world; our data needs to
  15977. be there too and it needs to be queryable with the same low
  15978. latency.  Operating a massively scalable data store that handles over 1M
  15979. requests per second is certainly an interesting task but not that
  15980. differentiating for our customers. We’d rather focus on product and our
  15981. algorithms.  So we built the parts of the infrastructure that we needed
  15982. and started using &lt;a href=&quot;http://aws.amazon.com/dynamodb/&quot;&gt;DynamoDB&lt;/a&gt; to store information. Today we have almost
  15983. 500B items stored in DynamoDB overall.  No engineers work on its
  15984. operation, except all of those inside Amazon, and we have seen
  15985. consistently low latencies of less than 5ms throughout this growth
  15986. period.&lt;/p&gt;
  15987.  
  15988. &lt;p&gt;Just having data sitting statically, however, is not enough: it needs to
  15989. be analyzed and you need to be able to store a lot of it.  Moreover, we
  15990. can only achieve our full potential if all of AdRoll’s teams are able to
  15991. access the data quickly, when they need it. All of our teams
  15992. coordinate using Amazon S3.  We’re now storing over 300TB of new
  15993. compressed data in S3 every month, over 17M new files, and in March 2015
  15994. we stored as much data as we did the first 4 months of 2014 combined.
  15995. This is our core asset, and the more accessible it is the more value we
  15996. can extract from it. This is why we generate over 10B requests per month
  15997. on our buckets across all of our teams.&lt;/p&gt;
  15998.  
  15999. &lt;p&gt;We also like to play around with new services when they come out and
  16000. &lt;a href=&quot;http://aws.amazon.com/lambda/&quot;&gt;Lambda&lt;/a&gt; was no exception. Polling is not a scalable strategy to figure
  16001. out when new files are added to S3, especially when you add 17M of them
  16002. per month. When Lambda was released, a couple of engineers quickly
  16003. prototyped a change in flow and moved Lambda in front of S3.  Very
  16004. quickly we were able to go into production with this change and now every
  16005. team receives notifications for the files they are interested in, as
  16006. soon as they are added, solving a major bottleneck.&lt;/p&gt;
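&lt;p&gt;To give a feel for the approach, here is a minimal sketch of a Lambda-style handler that reacts to S3 object-created events instead of polling. It is not our production code: the prefix-to-team table and the team names are hypothetical, and the function only returns the notifications it would send rather than delivering them.&lt;/p&gt;

```python
from urllib.parse import unquote_plus

# Hypothetical mapping from key prefix to the team that cares about those files.
SUBSCRIPTIONS = {
    "logs/bids/": "bidding-team",
    "logs/clicks/": "analytics-team",
}


def interested_teams(key):
    """Teams whose subscribed prefix matches the given object key."""
    return [team for prefix, team in SUBSCRIPTIONS.items() if key.startswith(prefix)]


def handler(event, context):
    """Collect (team, bucket, key) notifications for each S3 record in the event."""
    notifications = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        for team in interested_teams(key):
            notifications.append((team, bucket, key))
    return notifications
```

&lt;p&gt;Because the handler runs once per batch of new objects, the cost of discovering files scales with the number of files written rather than with how often each team checks the bucket.&lt;/p&gt;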
  16007.  
  16008. &lt;hr /&gt;
  16009. </description>
  16010.    </item>
  16011.    
  16012.    
  16013.    
  16014.    <item>
  16015.      <title>Announcing Hologram, taking EC2 Instance Roles everywhere</title>
  16016.      <link>https://tech.nextroll.com/blog/ops/2014/12/22/announcing-hologram.html</link>
  16017.      <pubDate>Mon, 22 Dec 2014 00:00:00 -0800</pubDate>
  16018.      <author></author>
  16019.      <guid isPermaLink="false">https://tech.nextroll.com/blog/ops/2014/12/22/announcing-hologram</guid>
  16020.      <description>&lt;p&gt;We love Amazon Web Services here at AdRoll. All of our production, test, and staging infrastructure runs on AWS. Really, the only systems we operate outside of EC2 are developers’ laptops. Most of the time this isn’t an issue, as we use platform-agnostic tools like Python, Java and Erlang. When it came to interacting with AWS APIs, though, we had a challenge: how do we manage sensitive AWS keys across dozens of developers’ laptops and many teams?&lt;/p&gt;
  16021.  
  16022. &lt;p&gt;Key management in EC2 is best handled using “IAM Roles” where a special endpoint in the Instance Metadata service (http://169.254.169.254/…) exposes temporary AWS API access credentials that have permissions defined by the instance’s Role, configured at launch time. This way applications can be designed that do not require checking secret keys into their repositories, reducing the chance of malicious key usage. This service only exists in EC2, though, so many developers who use AWS keep long-lived, highly privileged credentials in a file on their local development machine. This isn’t particularly secure, and it creates a difference between development and production that can lead to broken deployments.&lt;/p&gt;
  16023.  
  16024. &lt;p&gt;Several months ago, a couple of us embarked on a project to bridge this gap. Today, we’re proud to announce the release of Hologram, a system for bringing EC2 Role-like key management to non-EC2 hosts. Hologram runs an imitation of the EC2 Instance Metadata service on developer workstations that exposes temporary credentials to your software the same way that EC2 does. It behaves just like EC2, so your code can use the same process in both development and production. The keys that Hologram provisions are temporary. EC2 access is then centrally controlled without direct administrative access to developer workstations.&lt;/p&gt;
  16025.  
  16026. &lt;p&gt;In the past, I’ve worked at companies that give up on local development and require developers to run all of their code, even when developing, on an EC2 instance. We could have gone down that path here at AdRoll, but then we would have lost the richness of local development: powerful IDE tooling, instant access to files, interactive graphical debugging, and no waiting for instances to come up. Instead, we’ve decided to make local development the best experience we can, and Hologram enables us to do that.&lt;/p&gt;
  16027.  
  16028. &lt;p&gt;Hologram is &lt;a href=&quot;https://github.com/AdRoll/hologram&quot;&gt;available on GitHub&lt;/a&gt; under the Apache License, Version 2.0. The current implementation depends on having an LDAP server in your organization and isn’t super easy to set up, but we plan to add support for simpler authentication backends and streamline installation over the coming months. Pull Requests are, of course, welcome!&lt;/p&gt;
  16029. </description>
  16030.    </item>
  16031.    
  16032.    
  16033.    
  16034.    <item>
  16035.      <title>AdRoll Puts the “P” in Big Data: Processing Petabytes</title>
  16036.      <link>https://tech.nextroll.com/blog/adtech/2014/11/24/adroll-puts-the-p-in-big-data.html</link>
  16037.      <pubDate>Mon, 24 Nov 2014 00:00:00 -0800</pubDate>
  16038.      <author></author>
  16039.      <guid isPermaLink="false">https://tech.nextroll.com/blog/adtech/2014/11/24/adroll-puts-the-p-in-big-data</guid>
  16040.      <description>&lt;p&gt;&lt;small class=&quot;muted&quot;&gt;&lt;em&gt;Originally published on the AdRoll Blog on &lt;a href=&quot;http://blog.adroll.com/adroll-puts-the-p-in-big-data-processing-petabytes&quot;&gt;November 12, 2014&lt;/a&gt;.&lt;/em&gt;&lt;/small&gt;&lt;/p&gt;
  16041.  
  16042. &lt;p&gt;The advertising industry has undeniably become a data play, as consumers are generating valuable data with every digital interaction. We hear buzzwords like “big data,” “machine learning” and “real-time algorithms,” but little about how these puzzle pieces fit together to help marketers achieve their business objectives. Over the last few years, the ad tech industry has led the way in turning big data concepts into solutions that solve real business challenges.&lt;/p&gt;
  16043.  
  16044. &lt;p&gt;&lt;img src=&quot;/images/post_images/big-data-feature-img.png&quot; alt=&quot;Big Data Feature Image&quot; /&gt;&lt;/p&gt;
  16045.  
  16046. &lt;p&gt;In its most basic form, data science is the extraction of knowledge from data, and machine learning powers this process, making programmatic buying possible. In performance advertising, the predictive power of your model grows as you increase the amount of data that is flowing through the system. And &lt;a href=&quot;http://venturebeat.com/2014/11/12/adroll-hits-gigantic-130-terabytes-of-ad-data-processed-daily-says-size-matters/&quot;&gt;AdRoll has a lot of data to work with&lt;/a&gt;.&lt;/p&gt;
  16047.  
  16048. &lt;p&gt;Customer intent data is the biggest competitive advantage at a company’s disposal if it can collect, analyze, and use that data in real time. A few months ago our systems generated and ingested 40-50 terabytes of data each day, but recently we’ve reached new heights in data volume at AdRoll, processing 130TB of data, about 30TB compressed, every single day. Basically, we operate at a data volume that is 150 times bigger than that of all of the US stock exchanges combined (two orders of magnitude), with over 10 times the volume of events. In three days we generate as much data as the US stock exchanges generate in one year.&lt;/p&gt;
  16049.  
  16050. &lt;p&gt;In fact, this year we hit the petabyte mark, processing over 10 petabytes (1 PB = 1,000,000,000,000,000 bytes = 1,000 terabytes), a 1,200% increase year over year. To put that in perspective, in order to accommodate the storage and processing capacity of 10PB, we would need a space at least the size of AT&amp;amp;T Park. Given San Francisco real estate prices, it’s a good thing we’ve been able to utilize globally distributed, cloud-based data warehouses thanks to our friends at &lt;a href=&quot;http://aws.amazon.com/&quot;&gt;Amazon AWS&lt;/a&gt;.&lt;/p&gt;
  16051.  
  16052. &lt;div class=&quot;youtube-wrapper&quot;&gt;&lt;iframe class=&quot;youtube-embed&quot; width=&quot;574&quot; height=&quot;323&quot; src=&quot;http://www.youtube.com/embed/FekGOiOGB8Y&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;&lt;/div&gt;
  16053.  
  16054. &lt;p&gt;Our partnership with Amazon allows us to grow our inventory sources and securely store, process, and scale our data volume. We can keep our data science and engineering talent focused on product innovation without the time and management resources required by legacy infrastructure. This has allowed us to serve our customers better while providing a steady revenue stream for publishers, which subsidizes the free web consumers have come to expect.&lt;/p&gt;
  16055.  
  16056. &lt;p&gt;Consumers are spending more time online and on mobile devices, and we are incorporating more and more signals and digital interactions (time on site, quality of ad space, emails opened) into our RTB algorithm for better targeting and optimization. Our machines now process over 60 billion events on a daily basis.&lt;/p&gt;
  16057.  
  16058. &lt;p&gt;So what does this massive increase in data volume indicate for the industry as a whole? It’s simple: predictive analytics are the future of RTB and data will drive the next generation of advertising.&lt;/p&gt;
  16059.  
  16060. &lt;p&gt;Retargeting has become a powerful tool, providing the masses with real-time predictive capabilities that were once reserved for the likes of Google. More inventory sources are popping up, more digital interactions are being captured, and more intelligent algorithms are being developed as a result.&lt;/p&gt;
  16061.  
  16062. &lt;p&gt;Retargeting was one of the first ways marketers could leverage the customer data available for collection on their websites, and it continues to outpace the industry in terms of innovation in RTB. Big data has moved from a buzzword to a staple in advertising, and its importance and profitability will only grow.&lt;/p&gt;
  16063.  
  16064. &lt;p&gt;Who knows how many baseball fields of data we’ll be processing in 2015?&lt;/p&gt;
  16065. </description>
  16066.    </item>
  16067.    
  16068.    
  16069.    
  16070.    <item>
  16071.      <title>D is for Data Science</title>
  16072.      <link>https://tech.nextroll.com/blog/data/2014/11/17/d-is-for-data-science.html</link>
  16073.      <pubDate>Mon, 17 Nov 2014 00:00:00 -0800</pubDate>
  16074.      <author></author>
  16075.      <guid isPermaLink="false">https://tech.nextroll.com/blog/data/2014/11/17/d-is-for-data-science</guid>
  16076.      <description>&lt;p&gt;&lt;a href=&quot;http://dlang.org&quot;&gt;The D programming language&lt;/a&gt; has quickly become our language of choice on the Data Science
  16077. team for any task that requires efficiency, and is now the keystone language for our critical infrastructure.
  16078. Why? Because D has a lot to offer.&lt;/p&gt;
  16079.  
  16080. &lt;hr /&gt;
  16081.  
  16082. &lt;h2 id=&quot;a-brief-introduction&quot;&gt;A Brief Introduction&lt;/h2&gt;
  16083.  
  16084. &lt;p&gt;One of the clearest advantages of using D compared to other typical data science workflows is that it compiles down
  16085. to machine code. Without an interpreter or virtual machine layer, we can rip through data significantly
  16086. faster than other tools like a Java Hadoop framework, R, or Python would allow. But D’s compiler is fast
  16087. enough that in many cases it can be run as if it were a scripting language. Let’s try comparing to Python by
  16088. generating a million uniform random variates, sorting, and finding the deciles:&lt;/p&gt;
  16089.  
  16090. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;random&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;uniform&lt;/span&gt;
  16091.  
  16092. &lt;span class=&quot;n&quot;&gt;variates&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;uniform&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;1.0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1000000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt;
  16093. &lt;span class=&quot;n&quot;&gt;variates&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sort&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  16094. &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1000000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;100000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  16095.    &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;variates&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  16096.  
  16097. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-shell&quot; data-lang=&quot;shell&quot;&gt;%&amp;gt; &lt;span class=&quot;nb&quot;&gt;time &lt;/span&gt;python deciles.py &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; some_deciles
  16098.  
  16099. real 0m0.857s
  16100. user 0m0.825s
  16101. sys 0m0.032s&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  16102.  
  16103. &lt;p&gt;And similarly for D:&lt;/p&gt;
  16104.  
  16105. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-d&quot; data-lang=&quot;d&quot;&gt;&lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;random&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16106. &lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stdio&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16107.  
  16108. &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16109.    &lt;span class=&quot;kt&quot;&gt;double&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;variates&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;double&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1_000_000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;];&lt;/span&gt;
  16110.    &lt;span class=&quot;k&quot;&gt;foreach&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;ulong&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;double&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;variates&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16111.        &lt;span class=&quot;n&quot;&gt;variates&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;uniform&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;1.0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  16112.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16113.    &lt;span class=&quot;n&quot;&gt;variates&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sort&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16114.    &lt;span class=&quot;k&quot;&gt;foreach&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0..10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16115.        &lt;span class=&quot;n&quot;&gt;writeln&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;variates&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;100_000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]);&lt;/span&gt;
  16116.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16117. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  16118.  
  16119. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-shell&quot; data-lang=&quot;shell&quot;&gt;%&amp;gt; &lt;span class=&quot;nb&quot;&gt;time &lt;/span&gt;rdmd deciles.d &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; some_deciles
  16120.  
  16121. real 0m0.929s
  16122. user 0m0.725s
  16123. sys 0m0.177s&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  16124.  
  16125. &lt;p&gt;Wait… what? It took longer, but that’s because I just ran it the once and it includes
  16126. the compilation time. If I don’t change anything and run it again:&lt;/p&gt;
  16127.  
  16128. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-shell&quot; data-lang=&quot;shell&quot;&gt;%&amp;gt; &lt;span class=&quot;nb&quot;&gt;time &lt;/span&gt;rdmd deciles.d &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; some_deciles
  16129.  
  16130. real 0m0.293s
  16131. user 0m0.291s
  16132. sys 0m0.004s&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  16133.  
  16134. &lt;p&gt;That’s better. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rdmd&lt;/code&gt; won’t bother to recompile if there’s no code change. These savings
  16135. can add up quite significantly once your code becomes more computationally complex and you
  16136. need to perform many computations over a substantial amount of data.&lt;/p&gt;
  16137.  
  16138. &lt;p&gt;But of course there’s no real revelation here. If we want hyper-efficient code, we know
  16139. that it’s best to drop into a compiled language. The key thing here that separates D from
  16140. other efficient languages like the oft-suggested C or C++ is that D frees you to program
  16141. in the style you feel most comfortable with at the given time.&lt;/p&gt;
  16142.  
  16143. &lt;p&gt;The code above shows how we can write a quick “script” in D without incurring any additional
  16144. headache over the simpler Python. But D is also just as clean if you want to start writing
  16145. some object-oriented code:&lt;/p&gt;
  16146.  
  16147. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-d&quot; data-lang=&quot;d&quot;&gt;&lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;random&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16148. &lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16149. &lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stdio&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16150.  
  16151. &lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Rectangle&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16152.    &lt;span class=&quot;kt&quot;&gt;double&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;width&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16153.    &lt;span class=&quot;kt&quot;&gt;double&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;height&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16154.  
  16155.    &lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;double&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;width&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;double&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;height&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16156.        &lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;width&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;width&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16157.        &lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;height&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;height&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16158.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16159.  
  16160.    &lt;span class=&quot;kt&quot;&gt;double&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;area&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16161.        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;width&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;height&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16162.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16163. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16164.  
  16165. &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16166.    &lt;span class=&quot;n&quot;&gt;Rectangle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rs&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Rectangle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1_000_000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;];&lt;/span&gt;
  16167.    &lt;span class=&quot;n&quot;&gt;rs&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;!(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Rectangle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;uniform&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;1.0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  16168.                                    &lt;span class=&quot;n&quot;&gt;uniform&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;1.0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))).&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;array&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16169.    &lt;span class=&quot;n&quot;&gt;rs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sort&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;!((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;width&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;width&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  16170.    &lt;span class=&quot;k&quot;&gt;foreach&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0..10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16171.        &lt;span class=&quot;n&quot;&gt;writefln&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;%0.4f\t%0.4f\t%0.4f&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  16172.                 &lt;span class=&quot;n&quot;&gt;rs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;100_000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;width&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  16173.                 &lt;span class=&quot;n&quot;&gt;rs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;100_000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;height&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  16174.                 &lt;span class=&quot;n&quot;&gt;rs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;100_000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;area&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;());&lt;/span&gt;
  16175.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16176. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  16177.  
  16178. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-shell&quot; data-lang=&quot;shell&quot;&gt;%&amp;gt; rdmd rectangle_deciles.d
  16179. 0.0000 0.9382 0.0000
  16180. 0.1000 0.2020 0.0202
  16181. 0.1996 0.3612 0.0721
  16182. 0.2995 0.3947 0.1182
  16183. 0.3994 0.7440 0.2972
  16184. 0.4993 0.8733 0.4360
  16185. 0.5997 0.0221 0.0132
  16186. 0.6997 0.6624 0.4634
  16187. 0.7997 0.6204 0.4961
  16188. 0.9003 0.4640 0.4177&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  16189.  
  16190. &lt;p&gt;And D is as ready as you are for your high-performance needs. For example, if we want to calculate a fast
  16191. &lt;a href=&quot;http://en.wikipedia.org/wiki/Fast_inverse_square_root&quot;&gt;inverse square root&lt;/a&gt;, we can use some pointer
  16192. voodoo (very lightly modified from the linked Wikipedia article):&lt;/p&gt;
  16193.  
  16194. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-d&quot; data-lang=&quot;d&quot;&gt;&lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;conv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16195. &lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stdio&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16196.  
  16197. &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16198.    &lt;span class=&quot;c1&quot;&gt;//computes 1/sqrt(x)&lt;/span&gt;
  16199.    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;float&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]);&lt;/span&gt;
  16200.    &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16201.    &lt;span class=&quot;kt&quot;&gt;float&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16202.    &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;float&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;threehalves&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;1.5f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16203.  
  16204.    &lt;span class=&quot;n&quot;&gt;x2&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.5f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16205.    &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16206.    &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;cast&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;*)&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16207.    &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mh&quot;&gt;0x5f3759df&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  16208.    &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;cast&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;float&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;*)&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16209.    &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;threehalves&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x2&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
  16210.  
  16211.    &lt;span class=&quot;n&quot;&gt;writeln&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  16212. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  16213.  
  16214. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-shell&quot; data-lang=&quot;shell&quot;&gt;%&amp;gt; rdmd invsqrt.d 1
  16215. 0.998307
  16216. %&amp;gt; rdmd invsqrt.d 2
  16217. 0.70693
  16218. %&amp;gt; rdmd invsqrt.d 4
  16219. 0.499154
  16220. %&amp;gt; rdmd invsqrt.d 16
  16221. 0.249577&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
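
&lt;p&gt;To get a rough feel for how close the approximation is, one could wrap the same bit trick in a function and print it next to the exact value. This is only a sketch: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;fastInvSqrt&lt;/code&gt; is a hypothetical name, and checking against &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1 / sqrt(x)&lt;/code&gt; from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;std.math&lt;/code&gt; is an assumption about how you might want to verify it.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-d&quot; data-lang=&quot;d&quot;&gt;import std.math : abs, sqrt;
import std.stdio;

// Hypothetical helper: the same bit trick as in the program above, as a function.
float fastInvSqrt(float x) {
    float y = x;
    int i = *cast(int*)&amp;amp;y;          // reinterpret the bits of the float as an int
    i = 0x5f3759df - (i &amp;gt;&amp;gt; 1);  // magic-constant first guess
    y = *cast(float*)&amp;amp;i;            // reinterpret the bits back as a float
    return y * (1.5f - (0.5f * x * y * y)); // one Newton-Raphson refinement
}

void main() {
    foreach (x; [1.0f, 2.0f, 4.0f, 16.0f]) {
        float approx = fastInvSqrt(x);
        float exact = 1.0f / sqrt(x);
        // columns: x, approximation, exact value, relative error
        writefln(&quot;%s\t%s\t%s\t%.5f&quot;, x, approx, exact, abs(approx - exact) / exact);
    }
}&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;For the four inputs used above, the single refinement step keeps the relative error under roughly 0.2%, which is the gap between the printed results and the exact values 1, 0.7071, 0.5 and 0.25.&lt;/p&gt;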
  16222.  
  16223. &lt;p&gt;D will even let you write &lt;a href=&quot;http://dlang.org/iasm.html&quot;&gt;inline assembly&lt;/a&gt; if you
  16224. really want to squeeze the most performance out of it. But this is all just fun
  16225. and games. How can D help in a real-world scenario?&lt;/p&gt;
  16226.  
  16227. &lt;hr /&gt;
  16228.  
  16229. &lt;h2 id=&quot;ripping-through-data&quot;&gt;Ripping Through Data&lt;/h2&gt;
  16230.  
  16231. &lt;p&gt;During the course of our work at AdRoll, we had some infrastructure in D that was
  16232. running fine for a while, but at a certain point, when our data problems exceeded
  16233. the scope this code was designed for, we had to optimize. And, believe it or not,
  16234. the problem was as banal as pulling some fields out of delimited files. This is
  16235. the gist of what we did.&lt;/p&gt;
  16236.  
  16237. &lt;p&gt;This particular log file contains some ad data delimited by the ASCII record
  16238. separator. Let’s say we want to pull out some timestamps and the country whence
  16239. these data come. As you can probably imagine by now, the naïve D solution is
  16240. quite readable, but is perhaps not as snappy as we’d like:&lt;/p&gt;
  16241.  
  16242. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-d&quot; data-lang=&quot;d&quot;&gt;&lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stdio&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16243. &lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16244.  
  16245. &lt;span class=&quot;k&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;immutable&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;uint&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TIMESTAMP_INDEX&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;19&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16246. &lt;span class=&quot;k&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;immutable&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;uint&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;COUNTRY_INDEX&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;42&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16247. &lt;span class=&quot;k&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;immutable&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DELIMITER&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;cast&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;30&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16248.  
  16249. &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16250.    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;file&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;File&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;r&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  16251.    &lt;span class=&quot;k&quot;&gt;foreach&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;byLine&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16252.        &lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[][]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fields&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;split&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DELIMITER&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  16253.        &lt;span class=&quot;n&quot;&gt;writefln&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;%s\t%s&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  16254.                 &lt;span class=&quot;n&quot;&gt;fields&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;TIMESTAMP_INDEX&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;
  16255.                 &lt;span class=&quot;n&quot;&gt;fields&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;COUNTRY_INDEX&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]);&lt;/span&gt;
  16256.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16257.    &lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;close&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  16258. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  16259.  
  16260. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-shell&quot; data-lang=&quot;shell&quot;&gt;%&amp;gt; &lt;span class=&quot;nb&quot;&gt;time &lt;/span&gt;rdmd parser.d log.txt &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; country_info
  16261.  
  16262. real 0m13.421s
  16263. user 0m13.270s
  16264. sys 0m0.153s&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  16265.  
  16266. &lt;p&gt;&lt;strong&gt;Thirteen seconds?&lt;/strong&gt; That’s an eternity! This one log file is but a mere sample
  16267. of what we’re producing at AdRoll—uncompressed, we generate about &lt;a href=&quot;http://venturebeat.com/2014/11/12/adroll-hits-gigantic-130-terabytes-of-ad-data-processed-daily-says-size-matters/&quot;&gt;130TB of
  16268. log files each day&lt;/a&gt;.
  16269. OK, we could potentially distribute the problem, but spinning up clusters takes
  16270. time, and one of our mottos is to “do more with less.” How can we improve the
  16271. performance here so that we don’t have to scale out?&lt;/p&gt;
  16272.  
  16273. &lt;p&gt;One of the best things we can do is minimize the amount of memory we’re
  16274. allocating. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;byLine&lt;/code&gt; reuses its line buffer where it can, but we still
  16275. read through the line once to copy it into that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;char[]&lt;/code&gt;, and read
  16276. through it again to split it by our record separator. And this splitting
  16277. allocates a new array holding a slice for each and every field in the line,
  16278. even though we only want two of them. From an efficiency standpoint, this clean,
  16279. straightforward code is actually quite a mess.&lt;/p&gt;
  16280.  
  16281. &lt;p&gt;To address our first concern with memory allocation, we can have a buffer
  16282. that’s already allocated to read into. The catch here is that the next line
  16283. will immediately follow the previous line in the same buffer, and the end of
  16284. it may not fit into the buffer. Worse, we may only get a partial bit of a
  16285. field that we care about.&lt;/p&gt;
  16286.  
  16287. &lt;p&gt;The solution is to have two buffers which we swap between. The fields we care
  16288. about may straddle both buffers, so we’ll also construct a double-length buffer
  16289. for simple reconstruction. When we reach the end of the current buffer, we’ll
  16290. make it the “last buffer,” and then load more data into the other buffer,
  16291. promoting it to the current buffer. The caveat is that we need to make sure
  16292. that none of our lines has a length longer than both of these buffers combined.&lt;/p&gt;
  16293.  
  16294. &lt;p&gt;Our second concern had to do with the inefficiency of splitting. Instead of
  16295. breaking apart the line in totality, once it’s in memory, let’s just
  16296. sequentially read through it. We’ll keep track of our progress through both
  16297. the buffer (index of the array) and our current line (number of delimiters
  16298. we’ve seen). Once we hit the right number of delimiters, we just need to find
  16299. the next delimiter, and we know our field is the contents in between. If we
  16300. hit the end of a line by reading a newline, we’ll reset our line progress.&lt;/p&gt;
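
&lt;p&gt;As a minimal sketch of that scan on a single in-memory line, counting delimiters might look something like the following. The helper name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;nthField&lt;/code&gt; and the sample record are made up for illustration, and this deliberately leaves out the buffer swapping described above.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-d&quot; data-lang=&quot;d&quot;&gt;import std.stdio;

immutable char DELIMITER = cast(char)30; // ASCII record separator

// Hypothetical helper: returns the slice of `line` holding field number
// `wanted` (0-based), or null if the line has fewer fields than that.
// Nothing is allocated: the result points into `line`.
const(char)[] nthField(const(char)[] line, uint wanted, char delimiter = DELIMITER) {
    uint seen = 0;     // delimiters passed so far
    size_t start = 0;  // index where the current field begins
    foreach (i, c; line) {
        if (c != delimiter) continue;
        if (seen == wanted) return line[start .. i];
        seen++;
        start = i + 1;
    }
    // The last field on a line has no trailing delimiter.
    return seen == wanted ? line[start .. $] : null;
}

void main() {
    // Made-up sample record with three fields.
    string line = &quot;a&quot; ~ DELIMITER ~ &quot;1418000000&quot; ~ DELIMITER ~ &quot;US&quot;;
    writeln(nthField(line, 1)); // prints 1418000000
    writeln(nthField(line, 2)); // prints US
}&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The scan stops as soon as the wanted field is closed off, so fields after it are never touched, and returning a slice rather than a copy keeps the allocator out of the loop entirely.&lt;/p&gt;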
  16301.  
  16302. &lt;p&gt;Finally, once we’ve collected all our fields, we just need to rip through
  16303. the buffer until we find the next newline.&lt;/p&gt;
  16304.  
  16305. &lt;p&gt;Enough chit-chat; let’s look at some code:&lt;/p&gt;
  16306.  
  16307. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-d&quot; data-lang=&quot;d&quot;&gt;&lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stdio&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16308.  
  16309. &lt;span class=&quot;k&quot;&gt;immutable&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;ulong&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;READ_BUFFER_SIZE&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;32768&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16310. &lt;span class=&quot;k&quot;&gt;immutable&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DELIMITER&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;cast&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;30&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16311. &lt;span class=&quot;k&quot;&gt;immutable&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NEWLINE&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;cast&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16312.  
  16313. &lt;span class=&quot;k&quot;&gt;immutable&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;uint&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TIMESTAMP_INDEX&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;19&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16314. &lt;span class=&quot;k&quot;&gt;immutable&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;uint&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;COUNTRY_INDEX&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;42&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16315.  
  16316. &lt;span class=&quot;k&quot;&gt;immutable&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;UNKNOWN_FIELD&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;unknown&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16317.  
  16318. &lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;FastReader&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16319.    &lt;span class=&quot;n&quot;&gt;File&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16320.  
  16321.    &lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bufferA&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16322.    &lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bufferB&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16323.    &lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;double_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16324.    &lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;current_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16325.    &lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;last_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16326.    &lt;span class=&quot;kt&quot;&gt;ulong&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;num_buf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16327.  
  16328.    &lt;span class=&quot;kt&quot;&gt;ulong&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16329.    &lt;span class=&quot;kt&quot;&gt;uint&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;num_del&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16330.    &lt;span class=&quot;kt&quot;&gt;bool&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;line_end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16331.  
  16332.    &lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;string&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;filename&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16333.        &lt;span class=&quot;n&quot;&gt;file&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;File&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;filename&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;r&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  16334.        &lt;span class=&quot;n&quot;&gt;bufferA&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;READ_BUFFER_SIZE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;];&lt;/span&gt;
  16335.        &lt;span class=&quot;n&quot;&gt;bufferB&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;READ_BUFFER_SIZE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;];&lt;/span&gt;
  16336.        &lt;span class=&quot;n&quot;&gt;double_buffer&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;READ_BUFFER_SIZE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;];&lt;/span&gt;
  16337.        &lt;span class=&quot;n&quot;&gt;current_buffer&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bufferA&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16338.        &lt;span class=&quot;n&quot;&gt;last_buffer&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bufferB&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16339.        &lt;span class=&quot;n&quot;&gt;num_buf&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16340.  
  16341.        &lt;span class=&quot;n&quot;&gt;index&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16342.        &lt;span class=&quot;n&quot;&gt;num_del&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16343.        &lt;span class=&quot;n&quot;&gt;line_end&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16344.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16345.  
  16346.    &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;reset_progress&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16347.        &lt;span class=&quot;n&quot;&gt;num_del&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16348.        &lt;span class=&quot;n&quot;&gt;line_end&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16349.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16350.  
  16351.    &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;swap_and_load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16352.        &lt;span class=&quot;n&quot;&gt;last_buffer&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;current_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16353.        &lt;span class=&quot;n&quot;&gt;num_buf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;++;&lt;/span&gt;
  16354.        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;num_buf&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  16355.            &lt;span class=&quot;n&quot;&gt;current_buffer&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bufferA&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16356.        &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;
  16357.            &lt;span class=&quot;n&quot;&gt;current_buffer&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bufferB&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16358.        &lt;span class=&quot;n&quot;&gt;current_buffer&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rawRead&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  16359.        &lt;span class=&quot;n&quot;&gt;index&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16360.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16361.  
  16362.    &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;get_field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;uint&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;field_id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;ref&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16363.        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;line_end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16364.            &lt;span class=&quot;n&quot;&gt;field&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;UNKNOWN_FIELD&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dup&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16365.            &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16366.        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16367.        &lt;span class=&quot;kt&quot;&gt;ulong&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16368.        &lt;span class=&quot;k&quot;&gt;while&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;num_del&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;field_id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16369.            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;current_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16370.                &lt;span class=&quot;n&quot;&gt;swap_and_load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  16371.                &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16372.            &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16373.            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NEWLINE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16374.                &lt;span class=&quot;n&quot;&gt;line_end&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16375.                &lt;span class=&quot;k&quot;&gt;break&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16376.            &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16377.            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DELIMITER&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  16378.                &lt;span class=&quot;n&quot;&gt;num_del&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;++;&lt;/span&gt;
  16379.            &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;++;&lt;/span&gt;
  16380.        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16381.        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;current_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16382.            &lt;span class=&quot;n&quot;&gt;swap_and_load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  16383.            &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16384.        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16385.        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;line_end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16386.            &lt;span class=&quot;n&quot;&gt;field&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;UNKNOWN_FIELD&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dup&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16387.            &lt;span class=&quot;n&quot;&gt;index&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16388.            &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16389.        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16390.        &lt;span class=&quot;kt&quot;&gt;ulong&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16391.        &lt;span class=&quot;kt&quot;&gt;bool&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;swapped&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16392.        &lt;span class=&quot;k&quot;&gt;while&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DELIMITER&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16393.            &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;++;&lt;/span&gt;
  16394.            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;end&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;current_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16395.                &lt;span class=&quot;n&quot;&gt;swap_and_load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  16396.                &lt;span class=&quot;n&quot;&gt;swapped&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16397.                &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16398.            &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16399.            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NEWLINE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16400.                &lt;span class=&quot;n&quot;&gt;line_end&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16401.                &lt;span class=&quot;n&quot;&gt;num_del&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;--;&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;//Don&apos;t count as delimiter&lt;/span&gt;
  16402.                &lt;span class=&quot;k&quot;&gt;break&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16403.            &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16404.        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16405.        &lt;span class=&quot;n&quot;&gt;num_del&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;++;&lt;/span&gt;
  16406.  
  16407.        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;swapped&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  16408.            &lt;span class=&quot;n&quot;&gt;field&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;current_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;..&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;];&lt;/span&gt;
  16409.        &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16410.            &lt;span class=&quot;kt&quot;&gt;ulong&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;size_end&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;last_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;length&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16411.            &lt;span class=&quot;n&quot;&gt;double_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;..&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;size_end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;last_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;..&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;$];&lt;/span&gt;
  16412.            &lt;span class=&quot;n&quot;&gt;double_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size_end&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;..&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size_end&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt;
  16413.                &lt;span class=&quot;n&quot;&gt;current_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;..&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;];&lt;/span&gt;
  16414.            &lt;span class=&quot;n&quot;&gt;field&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;double_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;..&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size_end&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)];&lt;/span&gt;
  16415.        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16416.        &lt;span class=&quot;n&quot;&gt;index&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16417.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16418.  
  16419.    &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;advance_to_next&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16420.        &lt;span class=&quot;kt&quot;&gt;ulong&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16421.        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;current_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16422.            &lt;span class=&quot;n&quot;&gt;swap_and_load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  16423.            &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16424.        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16425.        &lt;span class=&quot;k&quot;&gt;while&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NEWLINE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16426.            &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;++;&lt;/span&gt;
  16427.            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;current_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16428.                &lt;span class=&quot;n&quot;&gt;swap_and_load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  16429.                &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16430.            &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16431.        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16432.        &lt;span class=&quot;n&quot;&gt;index&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16433.        &lt;span class=&quot;n&quot;&gt;reset_progress&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  16434.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16435.  
  16436.    &lt;span class=&quot;kt&quot;&gt;bool&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;eof&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16437.        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;eof&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;index&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;current_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  16438.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16439.  
  16440.    &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;close&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16441.        &lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;close&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  16442.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16443. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16444.  
  16445. &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;process_file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;string&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16446.    &lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;timestamp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16447.    &lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;country&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16448.    &lt;span class=&quot;n&quot;&gt;FastReader&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;frd&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;FastReader&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  16449.    &lt;span class=&quot;k&quot;&gt;while&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;frd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;eof&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16450.        &lt;span class=&quot;n&quot;&gt;frd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;TIMESTAMP_INDEX&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;timestamp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  16451.        &lt;span class=&quot;n&quot;&gt;frd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;COUNTRY_INDEX&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;country&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  16452.        &lt;span class=&quot;n&quot;&gt;frd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;advance_to_next&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  16453.        &lt;span class=&quot;n&quot;&gt;writefln&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;%s\t%s&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;timestamp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;country&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  16454.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16455.    &lt;span class=&quot;n&quot;&gt;frd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;close&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  16456. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16457.  
  16458. &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16459.    &lt;span class=&quot;n&quot;&gt;process_file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]);&lt;/span&gt;
  16460. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  16461.  
  16462. &lt;p&gt;This really beefs up our simple program from before to accomplish the same task.
  16463. Let’s break it down a bit. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bufferA&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bufferB&lt;/code&gt; are the two backing buffers,
  16464. and they will be pointed to by either &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;current_buffer&lt;/code&gt; or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;last_buffer&lt;/code&gt;,
  16465. depending on our state, which we track with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;num_buf&lt;/code&gt;. To keep track of our progress
  16466. through &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;current_buffer&lt;/code&gt; we use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;index&lt;/code&gt;, and to keep track of our progress through
  16467. a line we use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;num_del&lt;/code&gt; to represent the number of delimiters we’ve seen thus far.
  16468. Finally, we want to know if we’ve hit the end of a line with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;line_end&lt;/code&gt;.&lt;/p&gt;
  16469.  
  16470. &lt;p&gt;The instantiation of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;FastReader&lt;/code&gt; is straightforward: we start with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bufferA&lt;/code&gt;.
  16471. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;reset_progress()&lt;/code&gt; gets called when we hit the start of a new line and it just
  16472. updates the state to reflect that. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;swap_and_load()&lt;/code&gt; is our toggling method. Note
  16473. that the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(num_buf &amp;amp; 1) == 0&lt;/code&gt; check that selects our &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;current_buffer&lt;/code&gt; is equivalent to
  16474. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;num_buf % 2 == 0&lt;/code&gt;.&lt;/p&gt;
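
&lt;p&gt;As a minimal sketch of that parity check (the helper &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pick_buffer&lt;/code&gt; is a hypothetical name used for illustration, not part of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;FastReader&lt;/code&gt;), the low bit of the counter selects the buffer, and for unsigned integers it agrees with the modulo form:&lt;/p&gt;

```d
import std.stdio : writeln;

// Hypothetical helper (not part of FastReader): returns 0 for bufferA and
// 1 for bufferB, mirroring the parity test in swap_and_load().
uint pick_buffer(uint num_buf) {
    return num_buf & 1; // low bit: 0 when even, 1 when odd
}

void main() {
    foreach (uint n; 0 .. 8) {
        // For unsigned integers the bitwise and modulo forms always agree.
        assert(pick_buffer(n) == n % 2);
        writeln(n, " -> ", pick_buffer(n) == 0 ? "bufferA" : "bufferB");
    }
}
```

&lt;p&gt;Even counts map to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bufferA&lt;/code&gt; and odd counts to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bufferB&lt;/code&gt;, so each call to the toggling method flips to the other backing buffer.&lt;/p&gt;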
  16475.  
  16476. &lt;p&gt;There’s also a catch here with the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;file.rawRead()&lt;/code&gt; method: though it
  16477. takes a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;char[]&lt;/code&gt; and populates it from the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;File&lt;/code&gt;, if we hit the end of the file before the buffer is full,
  16478. the rest of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;char[]&lt;/code&gt; keeps its previous contents. For example:&lt;/p&gt;
  16479.  
  16480. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-d&quot; data-lang=&quot;d&quot;&gt;&lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stdio&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16481.  
  16482. &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16483.    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;file&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;File&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;test.txt&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;r&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  16484.    &lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;buffer&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;0123456789&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dup&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16485.  
  16486.    &lt;span class=&quot;n&quot;&gt;writeln&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;buffer length:\t&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  16487.    &lt;span class=&quot;n&quot;&gt;writeln&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;buffer content:\t&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  16488.  
  16489.    &lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rawRead&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  16490.  
  16491.    &lt;span class=&quot;n&quot;&gt;writeln&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;buffer length:\t&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  16492.    &lt;span class=&quot;n&quot;&gt;writeln&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;buffer content:\t&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  16493. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  16494.  
  16495. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-shell&quot; data-lang=&quot;shell&quot;&gt;%&amp;gt; &lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-n&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;test&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; test.txt
  16496. %&amp;gt; rdmd test.d
  16497. buffer length: 10
  16498. buffer content: 0123456789
  16499. buffer length: 10
  16500. buffer content: test456789&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  16501.  
  16502. &lt;p&gt;For our purposes, this behavior is undesirable because we want to know
  16503. when we’ve actually hit the end of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;File&lt;/code&gt; and the end of the last line.
  16504. It turns out that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;file.rawRead()&lt;/code&gt; also returns a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;char[]&lt;/code&gt;, which is the slice
  16505. of the passed-in array that holds the new content. Just changing the line to
  16506. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;buffer = file.rawRead(buffer)&lt;/code&gt; fixes the issue:&lt;/p&gt;
  16507.  
  16508. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-shell&quot; data-lang=&quot;shell&quot;&gt;%&amp;gt; rdmd test.d
  16509. buffer length: 10
  16510. buffer content: 0123456789
  16511. buffer length: 4
  16512. buffer content: &lt;span class=&quot;nb&quot;&gt;test&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  16513.  
  16514. &lt;p&gt;We use the same construction in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;swap_and_load()&lt;/code&gt;, finally resetting our
  16515. progress through &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;current_buffer&lt;/code&gt; to 0.&lt;/p&gt;
  16516.  
  16517. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;get_field()&lt;/code&gt; is the trickiest method, but still intelligible. If
  16518. we’ve already hit the end of the line, there’s no field to find, so we
  16519. write out that it’s unknown. Starting from where we are in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;current_buffer&lt;/code&gt;,
  16520. we just start counting delimiters. If we hit the end of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;current_buffer&lt;/code&gt;, it’s
  16521. time to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;swap_and_load()&lt;/code&gt;. Once we’ve hit the correct number of delimiters,
  16522. we need to find the next one. This is essentially the same code, but we need
  16523. to know if we swap buffers during this process. If we hit a newline, we count
  16524. that as the end of the field.&lt;/p&gt;
  16525.  
  16526. &lt;p&gt;Constructing the field is simple: if we didn’t swap, it’s just the slice of
  16527. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;current_buffer&lt;/code&gt; from our start to end indices. Otherwise, we piece it
  16528. together from both &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;last_buffer&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;current_buffer&lt;/code&gt;.&lt;/p&gt;
  16529.  
  16530. &lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;advance_to_next()&lt;/code&gt; method also works in a similar way to searching out
  16531. delimiters, but instead we look for newline characters, move into the next
  16532. line, and then &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;reset_progress()&lt;/code&gt;.&lt;/p&gt;
  16533.  
  16534. &lt;p&gt;There are a couple of catches at this point. First, there’s our &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;eof()&lt;/code&gt; method.
  16535. It’s possible that the file has hit EOF while the buffer still holds lines to read. We
  16536. check for this by ensuring that we’re only truly done when our &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;index&lt;/code&gt; is
  16537. the same as our &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;current_buffer&lt;/code&gt; length. Finally, in our &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;process_file()&lt;/code&gt; call,
  16538. we make sure that we move on to the next line after getting all the fields we
  16539. require. Critically, we search for the fields &lt;em&gt;in their numerical order&lt;/em&gt;.
  16540. We need to do this because we run through our lines in sequence and never
  16541. look back.&lt;/p&gt;
  16542.  
  16543. &lt;p&gt;Ok, so our code gained some weight. But fortunately, it gained weight in
  16544. terms of muscle mass instead of bloat. Check it out:&lt;/p&gt;
  16545.  
  16546. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-shell&quot; data-lang=&quot;shell&quot;&gt;%&amp;gt; &lt;span class=&quot;nb&quot;&gt;time &lt;/span&gt;rdmd parser_fast.d log.txt &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; country_info_fast
  16547.  
  16548. real   0m2.159s
  16549. user   0m2.093s
  16550. sys    0m0.068s&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  16551.  
  16552. &lt;p&gt;Hey, that’s better than a 6x improvement! We can now run over way more
  16553. data in the same amount of time. Just to make sure nothing fishy is going
  16554. on:&lt;/p&gt;
  16555.  
  16556. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-shell&quot; data-lang=&quot;shell&quot;&gt;%&amp;gt; diff country_info country_info_fast
  16557. %&amp;gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  16558.  
  16559. &lt;p&gt;Awesome. But hey, y’know what? It would be really cool if we could exploit
  16560. the fact that I’m running on a multi-core machine and read in multiple files
  16561. at once. How would we do that?&lt;/p&gt;
  16562.  
  16563. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-d&quot; data-lang=&quot;d&quot;&gt;&lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;parallelism&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16564. &lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stdio&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16565. &lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  16566.  
  16567. &lt;span class=&quot;cm&quot;&gt;/* Then it&apos;s all the same until main()... */&lt;/span&gt;
  16568.  
  16569. &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16570.    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;files&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;split&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;,&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  16571.    &lt;span class=&quot;k&quot;&gt;foreach&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;string&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;parallel&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;files&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  16572.        &lt;span class=&quot;n&quot;&gt;process_file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  16573.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  16574. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  16575.  
  16576. &lt;p&gt;Well, that was pretty easy. What type of performance do we get?&lt;/p&gt;
  16577.  
  16578. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-shell&quot; data-lang=&quot;shell&quot;&gt;%&amp;gt; &lt;span class=&quot;k&quot;&gt;for &lt;/span&gt;i &lt;span class=&quot;k&quot;&gt;in &lt;/span&gt;2 3 4&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;cp &lt;/span&gt;log.txt log&lt;span class=&quot;nv&quot;&gt;$i&lt;/span&gt;.txt&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;done&lt;/span&gt;
  16579. %&amp;gt; &lt;span class=&quot;nb&quot;&gt;time &lt;/span&gt;rdmd parser_parallel.d log.txt &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; more_country_info
  16580.  
  16581. real 0m2.219s
  16582. user 0m2.153s
  16583. sys 0m0.070s
  16584. %&amp;gt; &lt;span class=&quot;nb&quot;&gt;time &lt;/span&gt;rdmd parser_parallel.d log.txt,log2.txt &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; more_country_info
  16585.  
  16586. real 0m2.308s
  16587. user 0m4.340s
  16588. sys 0m0.228s
  16589. %&amp;gt; &lt;span class=&quot;nb&quot;&gt;time &lt;/span&gt;rdmd parser_parallel.d log.txt,log2.txt,log3.txt &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; more_country_info
  16590.  
  16591. real 0m2.458s
  16592. user 0m6.821s
  16593. sys 0m0.354s
  16594. %&amp;gt; &lt;span class=&quot;nb&quot;&gt;time &lt;/span&gt;rdmd parser_parallel.d log.txt,log2.txt,log3.txt,log4.txt &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; more_country_info
  16595.  
  16596. real 0m2.634s
  16597. user 0m9.213s
  16598. sys 0m0.781s&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  16599.  
  16600. &lt;p&gt;Wow! That’s over five times faster than our original solution while running over
  16601. four times the data, for a 20x performance boost! And we’re not done yet…&lt;/p&gt;
  16602.  
  16603. &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rdmd&lt;/code&gt; doesn’t perform as many optimizations as it could. Once our code is in
  16604. the state that we want it, it makes sense to compile with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dmd&lt;/code&gt; rather than
  16605. running it under a scripting idiom. To be fair, we’ll also compile down our
  16606. original naïve solution and our non-parallelized one:&lt;/p&gt;
  16607.  
  16608. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-shell&quot; data-lang=&quot;shell&quot;&gt;%&amp;gt; dmd &lt;span class=&quot;nt&quot;&gt;-O&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-release&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-inline&lt;/span&gt; parser.d
  16609. %&amp;gt; &lt;span class=&quot;nb&quot;&gt;time&lt;/span&gt; ./parser log.txt &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; country_info
  16610.  
  16611. real 0m8.352s
  16612. user 0m8.224s
  16613. sys 0m0.128s
  16614. %&amp;gt; dmd &lt;span class=&quot;nt&quot;&gt;-O&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-release&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-inline&lt;/span&gt; parser_fast.d
  16615. %&amp;gt; &lt;span class=&quot;nb&quot;&gt;time&lt;/span&gt; ./parser_fast log.txt &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; country_info_fast
  16616.  
  16617. real 0m0.697s
  16618. user 0m0.637s
  16619. sys 0m0.060s
  16620. %&amp;gt; dmd &lt;span class=&quot;nt&quot;&gt;-O&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-release&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-inline&lt;/span&gt; parser_parallel.d
  16621. %&amp;gt; &lt;span class=&quot;nb&quot;&gt;time&lt;/span&gt; ./parser_parallel log.txt &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; more_country_info
  16622. real 0m0.950s
  16623. user 0m0.759s
  16624. sys 0m0.181s
  16625. %&amp;gt; &lt;span class=&quot;nb&quot;&gt;time&lt;/span&gt; ./parser_parallel log.txt,log2.txt &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; more_country_info
  16626.  
  16627. real 0m0.857s
  16628. user 0m1.467s
  16629. sys 0m0.196s
  16630. %&amp;gt; &lt;span class=&quot;nb&quot;&gt;time&lt;/span&gt; ./parser_parallel log.txt,log2.txt,log3.txt &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; more_country_info
  16631.  
  16632. real 0m1.127s
  16633. user 0m2.469s
  16634. sys 0m0.617s
  16635. %&amp;gt; &lt;span class=&quot;nb&quot;&gt;time&lt;/span&gt; ./parser_parallel log.txt,log2.txt,log3.txt,log4.txt &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; more_country_info
  16636.  
  16637. real 0m1.527s
  16638. user 0m3.792s
  16639. sys 0m1.264s&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  16640.  
  16641. &lt;p&gt;After compiling everything down, our fast solution has nearly a 12x performance
  16642. boost over the naïve one, and our parallelized solution, when run over four times
  16643. the data, has nearly a 22x boost.&lt;/p&gt;
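
&lt;p&gt;As a rough check on those ratios, the arithmetic can be redone from the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;real&lt;/code&gt; times printed above. This is only a sketch: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;speedup&lt;/code&gt; is a hypothetical helper name, and the four-file comparison assumes the naïve parser’s run time would scale linearly with the amount of data, which was not measured here.&lt;/p&gt;

```bash
#!/usr/bin/env bash

# Hypothetical helper: speedup BASELINE_SECONDS NEW_SECONDS [DATA_MULTIPLIER]
# Prints how many times faster the new run is, optionally scaling the
# baseline by the amount of data (assumes linear scaling of the baseline).
speedup() {
    awk -v base="$1" -v new="$2" -v mult="${3:-1}" \
        'BEGIN { printf "%.2f\n", (base * mult) / new }'
}

speedup 8.352 0.697      # naive vs. fast, one file    -> prints 11.98
speedup 8.352 1.527 4    # naive vs. parallel, 4 files -> prints 21.88
```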
  16644.  
  16645. &lt;p&gt;At AdRoll Data Science, we’ve become big fans of D, and it’s easy to see why.
  16646. We can rapidly prototype new infrastructure and analysis tasks, and when
  16647. efficiency becomes a core concern, we have the ability to refactor that same
  16648. code base to squeeze as much performance out as possible. If you’re interested
  16649. in tackling big data problems with an eye for lean, mean code, you should
  16650. &lt;a href=&quot;http://www.adroll.com/about/careers&quot;&gt;let us know&lt;/a&gt;.&lt;/p&gt;
  16651. </description>
  16652.    </item>
  16653.    
  16654.    
  16655.    
  16656.    <item>
  16657.      <title>Shiny at AdRoll</title>
  16658.      <link>https://tech.nextroll.com/blog/external/2014/10/29/shiny.html</link>
  16659.      <pubDate>Wed, 29 Oct 2014 00:00:00 -0700</pubDate>
  16660.      <author></author>
  16661.      <guid isPermaLink="false">https://tech.nextroll.com/blog/external/2014/10/29/shiny</guid>
  16662.      <description>&lt;p&gt;Read the post here: &lt;a href=&quot;http://glossy.adroll.com/shiny_blogpost/&quot;&gt;http://glossy.adroll.com/shiny_blogpost/&lt;/a&gt;&lt;/p&gt;
  16663. </description>
  16664.    </item>
  16665.    
  16666.    
  16667.    
  16668.    <item>
  16669.      <title>Introducing AirControl: AirPlay Mirroring from your terminal</title>
  16670.      <link>https://tech.nextroll.com/blog/terminal/2014/09/26/introducing-aircontrol-control-airplay-through-terminal.html</link>
  16671.      <pubDate>Fri, 26 Sep 2014 00:00:00 -0700</pubDate>
  16672.      <author></author>
  16673.      <guid isPermaLink="false">https://tech.nextroll.com/blog/terminal/2014/09/26/introducing-aircontrol-control-airplay-through-terminal</guid>
  16674.      <description>&lt;p&gt;AirControl is a tool to control AirPlay Mirroring from the command line.&lt;/p&gt;
  16675.  
  16676. &lt;p&gt;TL;DR: it’s open source and available in our &lt;a href=&quot;https://github.com/AdRoll/AirControl&quot;&gt;GitHub
  16677. repo&lt;/a&gt;.&lt;/p&gt;
  16678.  
  16679. &lt;p&gt;If you’re interested in the details of how we got it working, read on.&lt;/p&gt;
  16680.  
  16681. &lt;hr /&gt;
  16682.  
  16683. &lt;p&gt;Once a month, the Engineering team has a Hack Day. Anybody can work on anything
  16684. they like, be it discovering a new programming language, speeding up our test
  16685. suite, or, in the case at hand, building a terminal tool to control AirPlay
  16686. Mirroring.&lt;/p&gt;
  16687.  
  16688. &lt;p&gt;The only constraint we put on these Hack Days is to work with other people on
  16689. your project.&lt;/p&gt;
  16690.  
  16691. &lt;hr /&gt;
  16692.  
  16693. &lt;p&gt;I use vim, Vimium, have custom mappings on my keyboard, wrote a Chrome extension
  16694. to log you out of websites with a keyboard shortcut: in short, I like typing
  16695. more than moving my mouse and clicking around. That’s why a couple of months
  16696. ago I realized that having to use the mouse to activate AirPlay Mirroring was
  16697. inconvenient… nay! Barbaric! Something needed to be done.&lt;/p&gt;
  16698.  
  16699. &lt;p&gt;I recruited &lt;a href=&quot;http://hackerengineer.net/&quot;&gt;Eric&lt;/a&gt; and
  16700. &lt;a href=&quot;http://www.math.missouri.edu/~evanslc/&quot;&gt;Chris&lt;/a&gt; to work on this last Friday.&lt;/p&gt;
  16701.  
  16702. &lt;p&gt;We worked through the problem in stages:&lt;/p&gt;
  16703.  
  16704. &lt;h1 id=&quot;activate-airplay-mirroring-programmatically&quot;&gt;Activate AirPlay Mirroring programmatically&lt;/h1&gt;
  16705.  
  16706. &lt;p&gt;First, we got a short AppleScript to activate the AirPlay menu in the menu bar:&lt;/p&gt;
  16707.  
  16708. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-applescript&quot; data-lang=&quot;applescript&quot;&gt;&lt;span class=&quot;k&quot;&gt;tell&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;application&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;System Events&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  16709.    &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;tell&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;process&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;SystemUIServer&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  16710.        &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;click&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;menu&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;bar&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;item&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;of&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;na&quot;&gt;menu&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;bar&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;whose&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;description&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;ow&quot;&gt;contains&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Displays&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  16711.        &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;click&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;na&quot;&gt;menu&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;item&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;of&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;na&quot;&gt;menu&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;of&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  16712.    &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;tell&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  16713. &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;tell&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  16714.  
  16715. &lt;p&gt;You can open AppleScript Editor, paste this in, and run it. If you have an AirPlay
  16716. device on your network, it should connect you.&lt;/p&gt;
  16717.  
  16718. &lt;p&gt;Changing &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;item 4&lt;/code&gt; to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;item &quot;Living-room&quot;&lt;/code&gt; works too.&lt;/p&gt;
  16719.  
  16720. &lt;p&gt;Now we needed to have this AppleScript run from the terminal and take an
  16721. argument.&lt;/p&gt;
  16722.  
  16723. &lt;h1 id=&quot;to-the-terminal&quot;&gt;To the terminal!&lt;/h1&gt;
  16724.  
  16725. &lt;p&gt;The easiest way to do that is to use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;osascript&lt;/code&gt;. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;osascript&lt;/code&gt; lets you run any
  16726. &lt;a href=&quot;https://developer.apple.com/library/mac/documentation/applescript/conceptual/applescriptx/concepts/osa.html&quot;&gt;Open Scripting
  16727. Architecture&lt;/a&gt;
  16728. script, AppleScript in particular.&lt;/p&gt;
  16729.  
  16730. &lt;p&gt;So, instead of using AppleScript directly, we let &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bash&lt;/code&gt; invoke &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;osascript&lt;/code&gt; with
  16731. the above script filled in with the first argument passed in for AirPlay device
  16732. name:&lt;/p&gt;
  16733.  
  16734. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;c&quot;&gt;#!/usr/bin/env bash&lt;/span&gt;
  16735.  
  16736. &lt;span class=&quot;nv&quot;&gt;tvname&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$1&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\&quot;&lt;/span&gt;
  16737.  
  16738. &lt;span class=&quot;nb&quot;&gt;read&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-d&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;&apos;&lt;/span&gt; APPLESCRIPT &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;EOF&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;
  16739. tell application &quot;System Events&quot;
  16740.    tell process &quot;SystemUIServer&quot;
  16741.        click (menu bar item 1 of menu bar 1 whose description contains &quot;Displays&quot;)
  16742.        click menu item &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$tvname&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt; of menu 1 of result
  16743.    end tell
  16744. end tell
  16745. &lt;/span&gt;&lt;span class=&quot;no&quot;&gt;EOF
  16746.  
  16747. &lt;/span&gt;osascript &lt;span class=&quot;nt&quot;&gt;-e&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$APPLESCRIPT&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; /dev/null&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  16748.  
  16749. &lt;p&gt;You can run the above as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;aircontrol 2nd\ Floor\ Lounge&lt;/code&gt; and you’ll connect to
  16750. the AppleTV with that name.&lt;/p&gt;
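
&lt;p&gt;The escaped quotes around &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;$1&lt;/code&gt; are what turn the argument into an AppleScript string literal. Here is a minimal sketch of just that step, with no call to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;osascript&lt;/code&gt;; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;build_click_line&lt;/code&gt; is a hypothetical helper for illustration, not part of AirControl.&lt;/p&gt;

```bash
#!/usr/bin/env bash

# Hypothetical helper: render only the line of the AppleScript that
# depends on the device name, so the quoting can be inspected.
build_click_line() {
    # Wrap the first argument in literal double quotes, as the script does.
    local tvname="\"$1\""
    printf 'click menu item %s of menu 1 of result\n' "$tvname"
}

build_click_line "2nd Floor Lounge"
# prints: click menu item "2nd Floor Lounge" of menu 1 of result
```

&lt;p&gt;One caveat of this approach: a device name that itself contains a double quote would end the string literal early, since nothing here escapes it.&lt;/p&gt;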
  16751.  
  16752. &lt;p&gt;NB: you’ll need to enable your terminal in the Accessibility settings, as shown below:
  16753. &lt;img src=&quot;/images/post_images/aircontrol-accessibility.png&quot; alt=&quot;accessibility settings&quot; /&gt;&lt;/p&gt;
  16754.  
  16755. &lt;h1 id=&quot;picking-up-the-tab&quot;&gt;Picking up the tab&lt;/h1&gt;
  16756.  
  16757. &lt;p&gt;At this point, we have something that does what we wanted: starting AirPlay
  16758. Mirroring from the terminal. But we want more! We &lt;s&gt;want&lt;/s&gt; need to
  16759. tab-complete this thing!&lt;/p&gt;
  16760.  
  16761. &lt;h2 id=&quot;dns-discovery&quot;&gt;DNS Discovery&lt;/h2&gt;
  16762. &lt;p&gt;The first thing is to find the available device names. I had previously gathered
  16763. &lt;a href=&quot;https://github.com/pstadler/non-terminating-bash-processes&quot;&gt;some&lt;/a&gt;
  16764. &lt;a href=&quot;http://nto.github.io/AirPlay.html&quot;&gt;references&lt;/a&gt; that led us towards &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dns-sd&lt;/code&gt; to
  16765. find out what AirPlay devices are available.&lt;/p&gt;
  16766.  
  16767. &lt;p&gt;Namely, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dns-sd -B _airplay._tcp&lt;/code&gt; will print out all the AirPlay-enabled devices
  16768. on the network:&lt;/p&gt;
  16769.  
  16770. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;dns-sd &lt;span class=&quot;nt&quot;&gt;-B&lt;/span&gt; _airplay._tcp
  16771. Browsing &lt;span class=&quot;k&quot;&gt;for &lt;/span&gt;_airplay._tcp
  16772. DATE: &lt;span class=&quot;nt&quot;&gt;---Fri&lt;/span&gt; 26 Sep 2014---
  16773. 11:09:47.758  ...STARTING...
  16774. Timestamp     A/R  Flags  &lt;span class=&quot;k&quot;&gt;if &lt;/span&gt;Domain   Service Type    Instance Name
  16775. 11:09:47.759  Add      3   4 local.   _airplay._tcp.  Honor Room &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;right&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
  16776. 11:09:47.759  Add      3   4 local.   _airplay._tcp.  Good Times Room
  16777. 11:09:47.759  Add      3   4 local.   _airplay._tcp.  Dev
  16778. 11:09:47.759  Add      3   4 local.   _airplay._tcp.  Unagi Room
  16779. ...&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  16780.  
  16781. &lt;p&gt;Using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cut&lt;/code&gt; would trim that down to the useful info.&lt;/p&gt;
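
&lt;p&gt;A minimal sketch of that trimming, run on one line copied from the output above rather than on live &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dns-sd&lt;/code&gt; output. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;instance_name&lt;/code&gt; helper is a hypothetical name used only for illustration.&lt;/p&gt;

```bash
#!/usr/bin/env bash

# Hypothetical helper: keep only the Instance Name column of one
# "Add" line from dns-sd -B output.
instance_name() {
    # $1 is left unquoted on purpose so runs of spaces collapse to one;
    # the name then starts at field 7 and may itself contain spaces.
    echo $1 | cut -d ' ' -f 7-
}

instance_name '11:09:47.759  Add      3   4 local.   _airplay._tcp.  Good Times Room'
# prints: Good Times Room
```

&lt;p&gt;The unquoted expansion is what makes the columns line up for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cut&lt;/code&gt;, but it also subjects the line to glob expansion, so a device name containing a character such as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;*&lt;/code&gt; could come out wrong.&lt;/p&gt;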
  16782.  
  16783. &lt;p&gt;(and make sure you check out &lt;a href=&quot;https://github.com/pstadler/non-terminating-bash-processes&quot;&gt;this
  16784. post&lt;/a&gt; on
  16785. non-terminating bash processes)&lt;/p&gt;
  16786.  
  16787. &lt;h2 id=&quot;you-complete-me&quot;&gt;You complete me&lt;/h2&gt;
  16788. &lt;p&gt;With that list in hand, we needed to tell &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bash&lt;/code&gt; how to tab-complete the
  16789. command.&lt;/p&gt;
  16790.  
  16791. &lt;p&gt;The short story is that you run &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;complete -o nospace -F _aircontrol aircontrol&lt;/code&gt;
  16792. where &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;_aircontrol&lt;/code&gt; is a function that builds an array called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;COMPREPLY&lt;/code&gt; with
  16793. the completion candidates.&lt;/p&gt;
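
&lt;p&gt;To see that mechanism in isolation, here is a sketch with a hard-coded candidate list standing in for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dns-sd&lt;/code&gt; lookup. The command name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;demo&lt;/code&gt;, the function &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;_demo_rooms&lt;/code&gt;, and the room list are placeholders, and the last lines only simulate what &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bash&lt;/code&gt; would set up on a Tab press.&lt;/p&gt;

```bash
#!/usr/bin/env bash

# Placeholder completion function: offers every hard-coded room whose
# name starts with the word currently being typed.
_demo_rooms() {
    local cur=${COMP_WORDS[COMP_CWORD]}
    local room
    COMPREPLY=()
    for room in "Dev" "Good Times Room" "Unagi Room"; do
        case "$room" in
            # %q escapes spaces so the candidate stays a single word.
            "$cur"*) COMPREPLY+=("$(printf '%q' "$room")") ;;
        esac
    done
}

# Register the function as the completer for the placeholder command.
complete -o nospace -F _demo_rooms demo

# Simulate the state bash provides when the user types "demo G" + Tab.
COMP_WORDS=(demo G)
COMP_CWORD=1
_demo_rooms
printf '%s\n' "${COMPREPLY[@]}"
# prints: Good\ Times\ Room
```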
  16794.  
  16795. &lt;p&gt;Here’s our whole script to activate the tab-completion:&lt;/p&gt;
  16796.  
  16797. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;_aircontrol&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt;
  16798. &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
  16799.    &lt;span class=&quot;nb&quot;&gt;local &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;cur&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;COMP_WORDS&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[COMP_CWORD]&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;
  16800.    &lt;span class=&quot;nv&quot;&gt;COMPREPLY&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=()&lt;/span&gt;
  16801.  
  16802.    &lt;span class=&quot;nv&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;0
  16803.    &lt;span class=&quot;k&quot;&gt;while &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;read&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-r&lt;/span&gt; line&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do
  16804.        &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;sb&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;expr&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$i&lt;/span&gt; + 1&lt;span class=&quot;sb&quot;&gt;`&lt;/span&gt;
  16805.        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$i&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-lt&lt;/span&gt; 5 &lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;then continue&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;fi&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;# skip the header lines&lt;/span&gt;
  16806.            &lt;span class=&quot;nv&quot;&gt;room&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$line&lt;/span&gt; | &lt;span class=&quot;nb&quot;&gt;cut&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-d&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos; &apos;&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-f&lt;/span&gt; 7-100 &lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;
  16807.            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;sb&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$room&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; | &lt;span class=&quot;nb&quot;&gt;grep&lt;/span&gt; ^&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;cur&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;.&lt;span class=&quot;k&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; /dev/null&lt;span class=&quot;sb&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;then
  16808.                &lt;/span&gt;COMPREPLY+&lt;span class=&quot;o&quot;&gt;=(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;printf&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;%q&quot;&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$room&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
  16809.            &lt;span class=&quot;k&quot;&gt;fi&lt;/span&gt;
  16810.  
  16811.        &lt;span class=&quot;c&quot;&gt;# break if no more items will follow (e.g. Flags != 3)&lt;/span&gt;
  16812.        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$line&lt;/span&gt; | &lt;span class=&quot;nb&quot;&gt;cut&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-d&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos; &apos;&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-f&lt;/span&gt; 3&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-ne&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;3&apos;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;then
  16813.            &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;break
  16814.        &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;fi
  16815.    done&lt;/span&gt; &amp;lt; &amp;lt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;dns-sd &lt;span class=&quot;nt&quot;&gt;-B&lt;/span&gt; _airplay._tcp&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
  16816. &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
  16817.  
  16818. &lt;span class=&quot;nb&quot;&gt;complete&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-o&lt;/span&gt; nospace &lt;span class=&quot;nt&quot;&gt;-F&lt;/span&gt; _aircontrol aircontrol&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  16819.  
  16820. &lt;p&gt;To wrap it up, we added an option to stop the mirroring: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-k&lt;/code&gt;.&lt;/p&gt;
  16821.  
  16822. &lt;p&gt;Here it is in action:&lt;/p&gt;
  16823.  
  16824. &lt;p&gt;&lt;img src=&quot;/images/post_images/aircontrol.gif&quot; alt=&quot;aircontrol in action&quot; /&gt;&lt;/p&gt;
  16825.  
  16826. &lt;p&gt;The final result is available here:
  16827. &lt;a href=&quot;https://github.com/AdRoll/AirControl&quot;&gt;https://github.com/AdRoll/AirControl&lt;/a&gt;&lt;/p&gt;
  16828.  
  16829. &lt;p&gt;As you can see, not that much code at all.&lt;/p&gt;
  16830.  
  16831. &lt;p&gt;Starting from this, one could build a workflow for
  16832. &lt;a href=&quot;http://www.alfredapp.com/&quot;&gt;Alfred&lt;/a&gt; or a &lt;a href=&quot;https://developer.apple.com/library/mac/documentation/Carbon/Conceptual/MDImporters/MDImporters.html#//apple_ref/doc/uid/TP40001279-BBCFBCAG&quot;&gt;plugin for
  16833. Spotlight&lt;/a&gt;.
  16834. A Spotlight extension would be an interesting hack since it’s designed for
  16835. files, but that will have to wait for another Hack Day…&lt;/p&gt;
  16836.  
  16837. </description>
  16838.    </item>
  16839.    
  16840.    
  16841.    
  16842.    <item>
  16843.      <title>Bulk Loading Multiple Tables</title>
  16844.      <link>https://tech.nextroll.com/blog/data/2014/07/15/multi-table-bulk-import.html</link>
  16845.      <pubDate>Tue, 15 Jul 2014 00:00:00 -0700</pubDate>
  16846.      <author></author>
  16847.      <guid isPermaLink="false">https://tech.nextroll.com/blog/data/2014/07/15/multi-table-bulk-import</guid>
  16848.      <description>&lt;p&gt;AdRoll’s customer dashboard is powered by our &lt;a href=&quot;http://hbase.apache.org/&quot;&gt;HBase&lt;/a&gt; cluster, populated using both &lt;a href=&quot;http://storm.incubator.apache.org/&quot;&gt;Storm&lt;/a&gt; and &lt;a href=&quot;http://hadoop.apache.org/&quot;&gt;Hadoop MapReduce&lt;/a&gt;. Using Storm allows us to provide real-time statistics to our users, while Hadoop gives us the accuracy guarantees needed for billing. Throughout the day, Storm emits a steady stream of writes to HBase. Our MapReduce jobs, however, run once a day over the previous day’s data. This generates a huge spike in write traffic that can drastically slow down the entire cluster, if not render it unresponsive. To counteract this, we switched to having our MapReduce jobs &lt;a href=&quot;http://hbase.apache.org/book/arch.bulk.load.html&quot;&gt;bulk load&lt;/a&gt; their output, skipping the write path entirely and saving both CPU and network IO. Unfortunately, we could not use the built-in bulk loading tools because of our non-standard use case.&lt;/p&gt;
  16849.  
  16850. &lt;hr /&gt;
  16851.  
  16852. &lt;p&gt;Each type of event we see is demultiplexed into writes to one or more tables. For example, a click event might lead to incrementing a counter based on which ad and campaign generated the click. For bidding purposes, we may also want to keep track of which bidding strategy was used and which site the ad was placed on.&lt;/p&gt;
  16853.  
  16854. &lt;p&gt;&lt;img src=&quot;/images/post_images/hbase_demultiplex.png&quot; alt=&quot;Multiplex&quot; /&gt;&lt;/p&gt;
  16855.  
  16856. &lt;p&gt;Unfortunately, with the way &lt;a href=&quot;http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hbase/hbase/0.94.17/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java&quot;&gt;HFileOutputFormat&lt;/a&gt; is written, there was no way to do what we wanted without running over the same input multiple times. This is because HFileOutputFormat can only generate output for one table at a time. The amount of new data AdRoll collects and generates (on the order of 25 TB/day) makes this a very unattractive, if not untenable, option. Luckily, we have access to the source code, so it was fairly painless to subclass the output format and add the ability to generate HFiles for multiple tables in one pass.&lt;/p&gt;
  16857.  
  16858. &lt;p&gt;Before we dive into the technical details, we need to understand at a high level how HFileOutputFormat.&lt;a href=&quot;http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hbase/hbase/0.94.17/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java#HFileOutputFormat.configureIncrementalLoad%28org.apache.hadoop.hbase.mapreduce.Job%2Corg.apache.hadoop.hbase.client.HTable%29&quot;&gt;configureIncrementalLoad()&lt;/a&gt; works. Cloudera has a &lt;a href=&quot;http://blog.cloudera.com/blog/2013/09/how-to-use-hbase-bulk-loading-and-why/&quot;&gt;great overview of bulk loading&lt;/a&gt; – we mostly care about section 2 copied below.&lt;/p&gt;
  16859.  
  16860. &lt;blockquote&gt;
  16861.  &lt;p&gt;The job [Mapper] will need to emit the row key as the Key, and either a KeyValue, a Put, or a Delete as the Value. The Reducer is handled by HBase; you configure it using HFileOutputFormat.configureIncrementalLoad() and it does the following:&lt;/p&gt;
  16862.  
  16863.  &lt;ol&gt;
  16864.    &lt;li&gt;Inspects the table to configure a total order partitioner&lt;/li&gt;
  16865.    &lt;li&gt;Uploads the partitions file to the cluster and adds it to the DistributedCache&lt;/li&gt;
  16866.    &lt;li&gt;Sets the number of reduce tasks to match the current number of regions&lt;/li&gt;
  16867.    &lt;li&gt;Sets the output key/value class to match HFileOutputFormat’s requirements&lt;/li&gt;
  16868.    &lt;li&gt;Sets the reducer up to perform the appropriate sorting (either KeyValueSortReducer or PutSortReducer)&lt;/li&gt;
  16869.  &lt;/ol&gt;
  16870.  
  16871.  &lt;p&gt;At this stage, one HFile will be created per region in the output folder.&lt;/p&gt;
  16872. &lt;/blockquote&gt;
  16873.  
  16874. &lt;p&gt;Our multi-table HFileOutputFormat (&lt;a href=&quot;https://gist.github.com/wesleychowadroll/d3b0ee1e85c445379a32&quot;&gt;source&lt;/a&gt;) is going to do exactly this, but demultiplex each key into the appropriate HFile depending on destination table. The convention we have adopted is for each row key to be prepended with the table name. Colons are not legal in HBase table names, so they are safe to be used as separators. Step 1 from above then changes a little bit, but the general idea is to get each region start key and prepend the table name to it.&lt;/p&gt;
  16875.  
  16876. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-ruby&quot; data-lang=&quot;ruby&quot;&gt;&lt;span class=&quot;n&quot;&gt;partition_keys&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
  16877. &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;table_name&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;tables:
  16878.  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start_keys&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;get_region_start_key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;table_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  16879.  &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start_key&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;start_keys:
  16880.    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;partition_keys&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;table_name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;:&apos;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start_key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  16881.  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  16882. &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  16883. &lt;span class=&quot;n&quot;&gt;configure_total_order_partitioner&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;partition_keys&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  16884.  
  16885. &lt;p&gt;Step 3 will set the number of reduce tasks to the total number of regions across all tables. The actual HFile demultiplexing happens in the output format RecordWriter (lines &lt;a href=&quot;https://gist.github.com/wesleychowadroll/d3b0ee1e85c445379a32#file-multitablehfileoutputformat-java-L292-L304&quot;&gt;292-304&lt;/a&gt; and &lt;a href=&quot;https://gist.github.com/wesleychowadroll/d3b0ee1e85c445379a32#file-multitablehfileoutputformat-java-L322&quot;&gt;322&lt;/a&gt;). Instead of just writing to the output path, we separate out the table name and the row key, then write each table’s HFiles into its own folder. To use our new output format, all we have to do is change the mapper to emit &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;lt;table&amp;gt;:&amp;lt;row key&amp;gt;&lt;/code&gt; instead of just &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;lt;row key&amp;gt;&lt;/code&gt;.&lt;/p&gt;
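
&lt;p&gt;To make the key convention concrete, here is a minimal Python sketch of the two halves described above: prefixing region start keys with the table name, and splitting the prefix back off when routing records. It is not the code from the gist. The function names (build_partition_keys, split_prefixed_key, demultiplex) are hypothetical, keys are plain strings rather than byte arrays, and a dictionary stands in for the per-table output folders.&lt;/p&gt;

```python
# Illustrative sketch only: hypothetical helper names, string keys instead of bytes.
SEPARATOR = ":"  # the post's convention: colons cannot appear in table names


def build_partition_keys(region_start_keys):
    """Prefix every region start key with its table name and return them sorted.

    region_start_keys maps each table name to the list of its region start keys.
    A total order partitioner needs its split points in sorted order.
    """
    keys = []
    for table_name, start_keys in region_start_keys.items():
        for start_key in start_keys:
            keys.append(table_name + SEPARATOR + start_key)
    return sorted(keys)


def split_prefixed_key(prefixed_key):
    """Split 'table:row key' at the first separator only."""
    table_name, _, row_key = prefixed_key.partition(SEPARATOR)
    return table_name, row_key


def demultiplex(prefixed_keys):
    """Group row keys by destination table (a stand-in for one folder per table)."""
    folders = {}
    for prefixed_key in prefixed_keys:
        table_name, row_key = split_prefixed_key(prefixed_key)
        folders.setdefault(table_name, []).append(row_key)
    return folders
```

&lt;p&gt;Splitting at the first colon only means a row key may itself contain colons without being misrouted, since under the convention above the table name never does.&lt;/p&gt;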
  16886.  
  16887. &lt;hr /&gt;
  16888.  
  16889. &lt;p&gt;The standard usage of HFileOutputFormat doesn’t allow any work to be done in the reduce step, as configureIncrementalLoad will set the reducer to be either KeyValueSortReducer or PutSortReducer. If we need to do work in the reduce step, we have two options. The first option is to have a two-step MapReduce job. The first reducer writes to a sequence file, which the second map reads and emits into a sorting reducer configured by HFileOutputFormat. The second option is a bit more efficient as it does all of the necessary work in one reducer. If we take a look at what the sorting reducers are doing, we see that all they do is take the KeyValues and emit them in sorted order. It’s easy, then, to modify our reducer to do some useful work combining KeyValues, then sort and emit all in one step.&lt;/p&gt;
  16890.  
  16891. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot; data-lang=&quot;java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;reduce&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;ImmutableBytesWritable&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kw&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Iterable&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;Writable&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;values&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Context&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;IOException&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;InterruptedException&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
  16892.  
  16893.  &lt;span class=&quot;nc&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;KeyValue&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kvs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;someUsefulWork&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kw&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;values&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
  16894.  
  16895.  &lt;span class=&quot;nc&quot;&gt;TreeSet&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;KeyValue&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;map&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;TreeSet&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;KeyValue&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;(&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;KeyValue&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;COMPARATOR&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
  16896.  &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;KeyValue&lt;/span&gt; &lt;span class=&quot;nl&quot;&gt;kv:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kvs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
  16897.    &lt;span class=&quot;n&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kv&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;clone&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;
  16898.  &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
  16899.  
  16900.  &lt;span class=&quot;c1&quot;&gt;// Write out the values in order&lt;/span&gt;
  16901.  &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;KeyValue&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kv&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
  16902.    &lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;write&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kw&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kv&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
  16903.  &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
  16904. &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  16905.  
  16906. &lt;p&gt;That’s all there is to it! With these simple modifications, AdRoll is able to efficiently process and load large amounts of data into our HBase cluster. We are free to denormalize our data as needed, without worrying about many of the associated costs.&lt;/p&gt;
  16907.  
  16908. &lt;hr /&gt;
  16909.  
  16910. </description>
  16911.    </item>
  16912.    
  16913.    
  16914.    
  16915.    <item>
  16916.      <title>Hadoop at AdRoll</title>
  16917.      <link>https://tech.nextroll.com/blog/events/2014/06/11/june-sf-hadoop-meetup.html</link>
  16918.      <pubDate>Wed, 11 Jun 2014 00:00:00 -0700</pubDate>
  16919.      <author></author>
  16920.      <guid isPermaLink="false">https://tech.nextroll.com/blog/events/2014/06/11/june-sf-hadoop-meetup</guid>
  16921.      <description>&lt;p&gt;Earlier this week, AdRoll hosted a group of Hadoop users and developers for their &lt;a href=&quot;http://www.meetup.com/hadoopsf/&quot;&gt;monthly meetup&lt;/a&gt;. A member of our data team presented AdRoll’s infrastructure and described how we combine Hadoop with Storm and HBase to achieve both high accuracy and low latency. Here are the slides from that talk.&lt;/p&gt;
  16922.  
  16923. &lt;script async=&quot;&quot; class=&quot;speakerdeck-embed&quot; data-id=&quot;30863080d49101316bc66ac2ad2297d3&quot; data-ratio=&quot;1.33333333333333&quot; src=&quot;//speakerdeck.com/assets/embed.js&quot;&gt;&lt;/script&gt;
  16924.  
  16925. </description>
  16926.    </item>
  16927.    
  16928.    
  16929.    
  16930.    <item>
  16931.      <title>Valentino Volonghi Presents AdRoll&apos;s RTB Infrastructure at AWS Summit 2014</title>
  16932.      <link>https://tech.nextroll.com/blog/web/2014/04/29/valentino-presents-adrolls-rtb-infrastructure.html</link>
  16933.      <pubDate>Tue, 29 Apr 2014 00:00:00 -0700</pubDate>
  16934.      <author></author>
  16935.      <guid isPermaLink="false">https://tech.nextroll.com/blog/web/2014/04/29/valentino-presents-adrolls-rtb-infrastructure</guid>
  16936.      <description>&lt;p&gt;Continuing with our video posts, here is Valentino once again, this time presenting at the 2014 AWS Summit in San Francisco. Valentino discusses how to handle 50 billion requests per day without breaking the bank by optimizing our AWS operations.&lt;/p&gt;
  16937.  
  16938. &lt;p&gt;In addition to low cost, AWS allows AdRoll’s RTB infrastructure to achieve nearly 100% uptime with sub-100ms latency and less than 0.15% timeouts. Learn how we built out a globally scaled, highly available architecture that powers our advertising operations and pushes the limits of cloud computing.&lt;/p&gt;
  16939.  
  16940. &lt;div class=&quot;youtube-wrapper&quot;&gt;&lt;iframe class=&quot;youtube-embed&quot; width=&quot;574&quot; height=&quot;323&quot; src=&quot;http://www.youtube.com/embed/0TsnCR6JMIM?start=1159&amp;amp;end=2750&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;&lt;/div&gt;
  16941.  
  16942. &lt;hr /&gt;
  16943. </description>
  16944.    </item>
  16945.    
  16946.    
  16947.    
  16948.    <item>
  16949.      <title>Werner Vogels&apos; Fireside Chat with Valentino</title>
  16950.      <link>https://tech.nextroll.com/blog/web/2014/04/22/werner-vogels-fireside-chat-with-valentino.html</link>
  16951.      <pubDate>Tue, 22 Apr 2014 00:00:00 -0700</pubDate>
  16952.      <author></author>
  16953.      <guid isPermaLink="false">https://tech.nextroll.com/blog/web/2014/04/22/werner-vogels-fireside-chat-with-valentino</guid>
  16954.      <description>&lt;p&gt;Better late than never, but we wanted to share a video from last fall’s AWS re:Invent conference where our own chief architect Valentino Volonghi shared the stage with Amazon CTO &lt;a href=&quot;http://en.wikipedia.org/wiki/Werner_Vogels&quot;&gt;Werner Vogels&lt;/a&gt; for a fireside chat. Valentino discusses AdRoll’s use of &lt;a href=&quot;http://aws.amazon.com/dynamodb/&quot;&gt;DynamoDB&lt;/a&gt;, one of the core technologies we leverage to build our real-time bidding system that handles more than 50 billion bid requests per day.&lt;/p&gt;
  16955.  
  16956. &lt;p&gt;The resulting system is a highly-reliable, low-latency, seamlessly scalable framework that allows AdRoll to crunch data, determine a bid-price, and select an ad to serve within a 40ms auction window. So pull up a seat with Valentino and Werner and learn how AdRoll does all this while spending less on DynamoDB than we do on snacks.&lt;/p&gt;
  16957.  
  16958. &lt;div class=&quot;youtube-wrapper&quot;&gt;&lt;iframe class=&quot;youtube-embed&quot; width=&quot;574&quot; height=&quot;323&quot; src=&quot;http://www.youtube.com/embed/ltMoyUADr7g?start=2244&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;&lt;/div&gt;
  16959.  
  16960. &lt;hr /&gt;
  16961. </description>
  16962.    </item>
  16963.    
  16964.    
  16965.    
  16966.    <item>
  16967.      <title>How to add JSCS to your pre-commit hook</title>
  16968.      <link>https://tech.nextroll.com/blog/web/2014/03/05/adding-jscs-to-your-commit-hook.html</link>
  16969.      <pubDate>Wed, 05 Mar 2014 00:00:00 -0800</pubDate>
  16970.      <author></author>
  16971.      <guid isPermaLink="false">https://tech.nextroll.com/blog/web/2014/03/05/adding-jscs-to-your-commit-hook</guid>
  16972.      <description>&lt;p&gt;I recently came across &lt;a href=&quot;https://github.com/mdevils/node-jscs&quot;&gt;JSCS&lt;/a&gt; and thought
  16973. it would be a great addition to our git flow to enforce consistent code style
  16974. for all of our JavaScript code.&lt;/p&gt;
  16975.  
  16976. &lt;p&gt;JSCS is a code style checker for JavaScript. It lets you decide if you prefer
  16977. your &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;if&lt;/code&gt;’s with or without a space after them; if you’re OK with declaring
  16978. multiple variables with one &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;var&lt;/code&gt; keyword; if your lines shouldn’t go over 80
  16979. characters; etc. A lot of it is a matter of taste of course, but the main goal
  16980. is consistency.&lt;/p&gt;
  16981.  
  16982. &lt;p&gt;So far we’ve been pretty good at having a fairly homogeneous style in our
  16983. JavaScript files, but as our team grows, new people need to adapt to the implied
  16984. preferences of our code base. Adding JSCS means that our style guide is codified
  16985. in JSCS’ config file and less prone to interpretation.&lt;/p&gt;
  16986.  
  16987. &lt;p&gt;Having it as part of our pre-commit hook results in less time spent in code
  16988. reviews saying things like “if you don’t mind, you missed a space here”. Some
  16989. might think it’s nitpicky, but we’re looking at it with the &lt;a href=&quot;http://en.wikipedia.org/wiki/Broken_windows_theory&quot;&gt;broken windows
  16990. theory&lt;/a&gt; in mind.&lt;/p&gt;
  16991.  
  16992. &lt;hr /&gt;
  16993.  
  16994. &lt;p&gt;So, here is the addition we made to our precommit hook to run JSCS on any
  16995. JavaScript files that are staged:&lt;/p&gt;
  16996.  
  16997. &lt;noscript&gt;&lt;pre&gt;# Extract staged files to a temp directory
  16998. TMPDIR=
  16999. TMPFILE=
  17000. if [ &amp;quot;$IS_LINUX&amp;quot; == &amp;quot;true&amp;quot; ]; then
  17001.    TMPFILE=`mktemp jscs_tmp_XXXXXX`
  17002.    TMPDIR=`mktemp -d jscs_tmp_XXXXXX`
  17003. else
  17004.    TMPFILE=`mktemp -t tmp/jscs_tmp_XXXXXX`
  17005.    TMPDIR=`mktemp -t tmp/jscs_tmp -d`
  17006. fi
  17007. git diff --cached --name-only --diff-filter=ACMR | xargs git checkout-index --prefix=$TMPDIR/ --
  17008.  
  17009. # Check JavaScript code style
  17010. RUN_JSCS=1
  17011. TOTAL_ERRORS=0
  17012.  
  17013. JSFILES=$(git diff-index --name-status --cached HEAD | grep -v ^D | egrep &amp;#39;\.js$&amp;#39; | cut -c3-)
  17014. if [ -z &amp;quot;$JSFILES&amp;quot; ]; then
  17015.    # No JavaScript file changed for this commit
  17016.    RUN_JSCS=0
  17017. elif [ -z &amp;quot;$JSCS_PATH&amp;quot; ]; then
  17018.    echo &amp;quot;Warning: You can&amp;#39;t check the JS coding style.&amp;quot;
  17019.    echo &amp;quot;You need to download and install jscs and set JSCS_PATH to its path.&amp;quot;
  17020.    RUN_JSCS=0
  17021. fi
  17022.  
  17023. # Ensuring proper coding style
  17024. if [ $RUN_JSCS -ne 0 ]; then
  17025.    echo -n &amp;quot;Checking JS style errors...&amp;quot;
  17026.    OUT=`$JSCS_PATH -r text $TMPDIR`
  17027.    CODE=$?
  17028.    # Erase last output line
  17029.    echo -ne &amp;#39;\r\033[K&amp;#39;
  17030.    if [ $CODE -ne 0 ]; then
  17031.        # Replace temp file name with real filename with color
  17032.        # in sed we use commas as separators for clarity, and execute echo
  17033.        # in the replacement part to get colors
  17034.        # Probably cleaner ways exist but I don&amp;#39;t know them at the moment
  17035.        OUT=`echo -e &amp;quot;$OUT&amp;quot; | sed &amp;quot;s,$TMPDIR/\([^ ]*\),\`echo -e \&amp;quot;\033[1;32m\1\033[0m\&amp;quot;\`,&amp;quot;`
  17036.  
  17037.        # grab the number of errors for that file
  17038.        # (keeps only numbers and takes the last line)
  17039.        TOTAL_ERRORS=`echo -e &amp;quot;$OUT&amp;quot; | sed &amp;#39;s/[^0-9]//g&amp;#39; | tail -1`
  17040.  
  17041.        # echo output minus last line
  17042.        echo -e &amp;quot;\033[1;37mJavaScript style errors found:\033[0m&amp;quot;
  17043.        echo -e &amp;quot;$OUT&amp;quot; | sed &amp;#39;$ d&amp;#39;
  17044.  
  17045.        echo &amp;quot;$TOTAL_ERRORS code style errors found.&amp;quot;
  17046.        echo &amp;quot;Please fix and stage the files before committing again.&amp;quot;
  17047.        rm -Rf $TMPDIR $TMPFILE
  17048.        exit $CODE
  17049.    else
  17050.        echo &amp;quot;No JS code style errors found.&amp;quot;
  17051.    fi
  17052. fi
  17053.  
  17054. # Clean up
  17055. rm -Rf $TMPDIR $TMPFILE
  17056. &lt;/pre&gt;&lt;/noscript&gt;
  17057. &lt;script src=&quot;https://gist.github.com/Timothee/8326f5544e7645f605b5.js?file=pre-commit.sh&quot;&gt; &lt;/script&gt;
  17058.  
  17059. &lt;p&gt;It’s fairly simple: we use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;git checkout-index&lt;/code&gt; to output the staged files to a
  17060. temporary folder, run JSCS against it and print out any errors, and stop the
  17061. commit if there are any errors.&lt;/p&gt;
  17062.  
  17063. &lt;p&gt;The main drawback is that you can only really run JSCS against a full file and
  17064. not just the diff. So even if you change only one line in a file, you’ll have to
  17065. fix all the existing errors in that file, and the related pull request will be
  17066. more cluttered than necessary. But that should only be a transitional period:
  17067. once all your files have been cleaned up, you will get errors only on the
  17068. changes you’ve made.&lt;/p&gt;
  17069.  
  17070. &lt;p&gt;Here are some tips to keep the initial cleanup manageable:&lt;/p&gt;
  17071.  
  17072. &lt;ul&gt;
  17073.  &lt;li&gt;Clean up your whole codebase in one big pull request containing only style
  17074. fixes. We did that mostly but split the cleanup by modules: models first, then
  17075. collections, etc.&lt;/li&gt;
  17076.  &lt;li&gt;On GitHub, on a pull request or commit page, you can add &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;?w=1&lt;/code&gt; to the URL to
  17077. hide all whitespace-related diffs (from &lt;a href=&quot;https://github.com/blog/967-github-secrets&quot;&gt;GitHub’s
  17078. blog&lt;/a&gt;). This won’t cover all the
  17079. changes caused by JSCS, but probably most of them.&lt;/li&gt;
  17080.  &lt;li&gt;Improve the hook to only keep errors for the staged lines. This is a bit
  17081. tricky to write and probably not worth the hassle since eventually we’d like
  17082. to have the whole project properly styled.&lt;/li&gt;
  17083. &lt;/ul&gt;
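
&lt;p&gt;For the curious, the last tip could be approached roughly as follows. This is only a sketch in Python under several assumptions: the diff for a single file comes from git diff --cached -U0 (so each hunk covers changed lines only), and the style errors have already been parsed into (line number, message) pairs. The helper names staged_lines and keep_staged_errors are hypothetical, and parsing JSCS’s actual output is left out.&lt;/p&gt;

```python
import re

# Matches a unified diff hunk header such as "@@ -10,2 +11,3 @@" and captures
# the start line and optional line count on the new-file side.
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@")


def staged_lines(diff_text):
    """Return the set of new-file line numbers covered by the hunks of one file's diff."""
    lines = set()
    for line in diff_text.splitlines():
        match = HUNK_HEADER.match(line)
        if match:
            start = int(match.group(1))
            # An omitted count means exactly one line; a count of 0 is a pure deletion.
            count = int(match.group(2)) if match.group(2) is not None else 1
            lines.update(range(start, start + count))
    return lines


def keep_staged_errors(errors, diff_text):
    """Keep only (line_number, message) pairs that fall on a staged line."""
    touched = staged_lines(diff_text)
    return [(number, message) for number, message in errors if number in touched]
```

&lt;p&gt;Even with such a filter, errors on untouched lines are merely hidden rather than fixed, which is why cleaning up whole files remains the simpler long-term route.&lt;/p&gt;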
  17084.  
  17085. &lt;h2 id=&quot;how-to-add-this-to-your-pre-commit-hook&quot;&gt;How to add this to your pre-commit hook&lt;/h2&gt;
  17086.  
  17087. &lt;ol&gt;
  17088.  &lt;li&gt;Install JSCS (see
  17089. &lt;a href=&quot;https://github.com/mdevils/node-jscs#installation&quot;&gt;instructions&lt;/a&gt;).&lt;/li&gt;
  17090.  &lt;li&gt;Create a file named &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.jscs.json&lt;/code&gt; with your &lt;a href=&quot;https://github.com/mdevils/node-jscs#configuration&quot;&gt;preferred
  17091. options&lt;/a&gt;.  As a starting
  17092. point, you can check &lt;a href=&quot;https://gist.github.com/Timothee/96e944f99f2b97fc162f&quot;&gt;our current
  17093. configuration&lt;/a&gt;.&lt;/li&gt;
  17094.  &lt;li&gt;Paste the script above into your pre-commit hook in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.git/hooks/pre-commit&lt;/code&gt; (see the
  17095. &lt;a href=&quot;http://git-scm.com/book/en/Customizing-Git-Git-Hooks&quot;&gt;git documentation&lt;/a&gt; for
  17096. more details). You’ll need to add &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;JSCS_PATH&lt;/code&gt; to your environment, pointing to
  17097. where &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;jscs&lt;/code&gt; lives.&lt;/li&gt;
  17098.  &lt;li&gt;Proceed to commit perfectly styled code!&lt;/li&gt;
  17099. &lt;/ol&gt;
  17100.  
  17101. </description>
  17102.    </item>
  17103.    
  17104.    
  17105.    
  17106.    <item>
  17107.      <title>Speaking of Erlang and AdRoll...</title>
  17108.      <link>https://tech.nextroll.com/blog/erlang/2014/01/27/speaking-of-erlang-and-adroll.html</link>
  17109.      <pubDate>Mon, 27 Jan 2014 00:00:00 -0800</pubDate>
  17110.      <author></author>
  17111.      <guid isPermaLink="false">https://tech.nextroll.com/blog/erlang/2014/01/27/speaking-of-erlang-and-adroll</guid>
  17112.      <description>&lt;p&gt;Speaking of &lt;a href=&quot;http://tech.adroll.com/blog/erlang/2014/01/22/monitoring-with-exometer-at-adroll.html&quot;&gt;Erlang and AdRoll&lt;/a&gt;… now is probably a good time to let you know AdRoll will be acting as a sponsor at the Erlang Factory SF Bay Area Conference this coming March.  Our very own Brian Troutwine (&lt;a href=&quot;https://twitter.com/bltroutwine&quot;&gt;@bltroutwine&lt;/a&gt;) will be giving a talk on the complex problems involved in live monitoring our real-time bidding system.  And just to put that in perspective a bit, we’re talking about live monitoring everything that could possibly go wrong on a system that receives 500K bid requests per second; requests which have the potential to spend a non-trivial amount of actual real-life company dollars in the blink of an eye.&lt;/p&gt;
  17113.  
  17114. &lt;p&gt;Be sure to get your ticket to the conference now, so you can catch Brian’s talk and find out how we try to get as close to bulletproof monitoring as possible, not to mention all the other great talks on the schedule.&lt;/p&gt;
  17115.  
  17116. &lt;hr /&gt;
  17117.  
  17118. &lt;p&gt;&lt;img src=&quot;/images/post_images/erlfactory-logo.png&quot; alt=&quot;ErlFactory Logo&quot; /&gt;&lt;/p&gt;
  17119.  
  17120. &lt;hr /&gt;
  17121.  
  17122. &lt;p&gt;The Erlang Factory SF Bay Area is the largest event in the US dedicated to the Erlang programming language, and will feature over 50 speakers on 8 tracks. Talk topics will cover areas such as Reactive Programming, Analytics, Big Data, DevOps, Interoperability, Infrastructure, Multi-core and Architectures.&lt;/p&gt;
  17123.  
  17124. &lt;p&gt;Speakers include:&lt;/p&gt;
  17125.  
  17126. &lt;ul&gt;
  17127.  &lt;li&gt;Bruce Tate - author of ‘Seven Languages in Seven Weeks’&lt;/li&gt;
  17128.  &lt;li&gt;Dave Thomas - author of The Pragmatic Programmer&lt;/li&gt;
  17129.  &lt;li&gt;Bob Ippolito - founder of Mochi Media&lt;/li&gt;
  17130.  &lt;li&gt;Rick Reed - software engineer at WhatsApp&lt;/li&gt;
  17131.  &lt;li&gt;Brett Cameron - senior software architect with HP’s corporate Cloud System&lt;/li&gt;
  17132.  &lt;li&gt;Stuart Bailey - CTO of Infoblox&lt;/li&gt;
  17133.  &lt;li&gt;Erik Stenman - chief scientist at Klarna&lt;/li&gt;
  17134.  &lt;li&gt;Duncan McGregor - senior manager at Rackspace&lt;/li&gt;
  17135.  &lt;li&gt;and many, many more&lt;/li&gt;
  17136. &lt;/ul&gt;
  17137.  
  17138. &lt;p&gt;Elixir’s inventor, José Valim, who is also a Ruby on Rails Core Team member, will give a joint keynote with Dave Thomas, author of The Pragmatic Programmer. Mike Williams, co-inventor of the Erlang programming language, will deliver the second keynote.&lt;/p&gt;
  17139.  
  17140. &lt;p&gt;Besides the already traditional Erlang Express and Erlang OTP courses, this year participants are also offered courses on Elixir, Cowboy, Riak and Kazoo.&lt;/p&gt;
  17141.  
  17142. &lt;p&gt;Use the code &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ADROLL10&lt;/code&gt; to get a 10% discount on the conference price. The discount applies to Early Bird rates until 11 February and to regular rates until 6 March.&lt;/p&gt;
  17143.  
  17144. &lt;p&gt;You can read more and get your tickets at the &lt;a href=&quot;http://www.erlang-factory.com/conference/show/conference-6/home/#home&quot;&gt;Erlang Factory Conference Site&lt;/a&gt;.&lt;/p&gt;
  17145. </description>
  17146.    </item>
  17147.    
  17148.    
  17149.    
  17150.    <item>
  17151.      <title>Monitoring with Exometer: An Ongoing Love Story</title>
  17152.      <link>https://tech.nextroll.com/blog/erlang/2014/01/22/monitoring-with-exometer-at-adroll.html</link>
  17153.      <pubDate>Wed, 22 Jan 2014 00:00:00 -0800</pubDate>
  17154.      <author></author>
  17155.      <guid isPermaLink="false">https://tech.nextroll.com/blog/erlang/2014/01/22/monitoring-with-exometer-at-adroll</guid>
  17156.      <description>&lt;p&gt;Welcome, friends, to the first AdRoll tech blog of 2014! Last night AdRoll
  17157. hosted &lt;a href=&quot;http://www.meetup.com/ErlangSF/events/159719882/&quot;&gt;Erlounge&lt;/a&gt; here in our
  17158. Mission Street offices; seventeen Erlangers showed up and three talks were
  17159. given. Juan Puig of Linden Labs gave a lightning tutorial on the use of
  17160. quickcheck in Erlang, the slides of which you can find
  17161. &lt;a href=&quot;https://docs.google.com/presentation/d/1QQUwyJwBvv6yPt_FLOgo4ZoOXKkHil6JK1_O_-IPX90/edit?usp=sharing&quot;&gt;here&lt;/a&gt;.
  17162. Marc Sugiyama of Erlang Solutions–boy, they have nice business cards–spoke
  17163. about debugging techniques in long-lived Erlang systems. To my knowledge the
  17164. slides aren’t yet online, but if you can corner Marc at a conference do so.&lt;/p&gt;
  17165.  
  17166. &lt;p&gt;As I completely spaced on getting these talks recorded–sorry, next time–I
  17167. present my slides with comments interspersed.&lt;/p&gt;
  17168.  
  17169. &lt;hr /&gt;
  17170.  
  17171. &lt;p&gt;&lt;img src=&quot;/images/post_images/mon_exo/mon_exo.001.jpg&quot; alt=&quot;Slide the 001&quot; /&gt;&lt;/p&gt;
  17172.  
  17173. &lt;p&gt;For my part, I spoke about the Real-Time Bidding (RTB) monitoring effort I’ve
  17174. been leading here at AdRoll and, in particular, the role of Ulf Wiger and Magnus
  17175. Feuer’s &lt;a href=&quot;https://github.com/Feuerlabs/exometer&quot;&gt;exometer&lt;/a&gt; library in this work.&lt;/p&gt;
  17176.  
  17177. &lt;hr /&gt;
  17178.  
  17179. &lt;p&gt;&lt;img src=&quot;/images/post_images/mon_exo/mon_exo.002.jpg&quot; alt=&quot;Slide the 002&quot; /&gt;&lt;/p&gt;
  17180.  
  17181. &lt;p&gt;The dirty thing about monitoring is that it’s tangential to value-added work. It
  17182. provides no direct value to customers and is, well, &lt;em&gt;expensive&lt;/em&gt;. Now, that’s not
  17183. to say monitoring isn’t important. It is. What we’re looking to do here is
  17184. safeguard our investments: investments in engineering time, customer
  17185. acquisition, in computing hardware and human life. (Happily, AdRoll doesn’t have
  17186. the responsibility of life and death, but we can engineer our systems to waste
  17187. people’s time, which they’ll never get back.) Monitoring complex systems is
  17188. playing the game of keeping ahead of work-stopping disaster. (Incidentally, I
  17189. highly recommend Charles Perrow’s
  17190. &lt;a href=&quot;http://www.amazon.com/Normal-Accidents-Living-High-Risk-Technologies/dp/0691004129&quot;&gt;Normal Accidents: Living with High-Risk Technologies&lt;/a&gt;
  17191. and Eric Schlosser’s more recent
  17192. &lt;a href=&quot;http://www.amazon.com/Command-Control-Damascus-Accident-Illusion/dp/1594202273&quot;&gt;Command and Control: Nuclear Weapons, the Damascus Accident, and the Illusion of Safety&lt;/a&gt;
  17193. if you’d like to read more in-depth on this subject.)&lt;/p&gt;
  17194.  
  17195. &lt;hr /&gt;
  17196.  
  17197. &lt;p&gt;&lt;img src=&quot;/images/post_images/mon_exo/mon_exo.003.jpg&quot; alt=&quot;Slide the 003&quot; /&gt;&lt;/p&gt;
  17198.  
  17199. &lt;p&gt;In our Erlang systems we’re on the look-out for two things primarily: red-line VM
  17200. statistics and metrics that say our applications have gone sideways. Any system
  17201. that’s going to help us monitor these things &lt;em&gt;has&lt;/em&gt; to report to multiple
  17202. backends–meaning, I need to pump data through our sophisticated Big Data
  17203. pipeline which I barely understand, out to on-disk logs and to our metrics
  17204. aggregation partner, &lt;a href=&quot;http://www.datadoghq.com/&quot;&gt;DataDog&lt;/a&gt;, via their statsd
  17205. client–and cannot introduce excessive overhead into the systems.&lt;/p&gt;
  17206.  
  17207. &lt;p&gt;The RTB system, for instance, handles approximately 500,000 bid events per
  17208. second and must respond to each one, without fail, in 100 milliseconds. Let that
  17209. sink in for a bit. A few days ago we peaked at 30 billion bid events in a single
  17210. day. Let &lt;em&gt;that&lt;/em&gt; sink in. Any monitoring library we use is going to have to keep
  17211. up with that load and, as few people have that load to play with during
  17212. development, needs to be hackable enough for us to make changes. This implies
  17213. the authors have to be friendly and open to patches and blind “uh, I dunno, it
  17214. just doesn’t work” issue tickets.&lt;/p&gt;
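&lt;p&gt;As a rough sanity check on those figures, here is a small JavaScript sketch. It uses only the two numbers quoted above; the variable names are made up for illustration and nothing here comes from the RTB codebase.&lt;/p&gt;

```javascript
// Back-of-the-envelope figures from the paragraph above.
var BIDS_PER_SECOND = 500000;        // approximate rate quoted in the text
var SECONDS_PER_DAY = 24 * 60 * 60;  // 86,400
var PEAK_DAY_EVENTS = 30e9;          // peak day quoted in the text

// Events in a day if the quoted rate were sustained around the clock.
var sustainedPerDay = BIDS_PER_SECOND * SECONDS_PER_DAY;

// Average per-second rate implied by the 30-billion-event day.
var peakDayAverage = Math.round(PEAK_DAY_EVENTS / SECONDS_PER_DAY);

console.log(sustainedPerDay); // 43200000000
console.log(peakDayAverage);  // 347222
```

&lt;p&gt;In other words, the 30 billion event day averages out to roughly 70% of the quoted per-second rate, sustained for 24 hours.&lt;/p&gt;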
  17215.  
  17216. &lt;hr /&gt;
  17217.  
  17218. &lt;p&gt;&lt;img src=&quot;/images/post_images/mon_exo/mon_exo.004.jpg&quot; alt=&quot;Slide the 004&quot; /&gt;&lt;/p&gt;
  17219.  
  17220. &lt;p&gt;We know what we need from a system–the ability to track two kinds of
  17221. statistics, configurable reporting and low-overhead–but what is it we’re
  17222. spending all this time building to avoid? What are the disasters inherent in the
  17223. system?&lt;/p&gt;
  17224.  
  17225. &lt;hr /&gt;
  17226.  
  17227. &lt;p&gt;&lt;img src=&quot;/images/post_images/mon_exo/mon_exo.005.jpg&quot; alt=&quot;Slide 005&quot; /&gt;&lt;/p&gt;
  17228.  
  17229. &lt;p&gt;There are the usual suspects of VM killers: atom tables that hit the max, hitting
  17230. max processes or ETS tables. Monitoring these is a matter of going through the
  17231. &lt;a href=&quot;http://www.erlang.org/doc/efficiency_guide/advanced.html#id69282&quot;&gt;System Limits&lt;/a&gt;
  17232. documentation a few times so you know what you’re up against. More difficult, if
  17233. only because it requires the intrinsic knowledge of the system engineers, is
  17234. identifying performance regressions–what’s the benchmark, anyway–and
  17235. ‘abnormal’ behavior. What’s abnormal? Such behaviors are usually identified after the fact
  17236. but smack you right in the face once they show up. For instance, it’s &lt;em&gt;possible&lt;/em&gt;
  17237. to bid millions of dollars for individual ad slots. Exchanges don’t provide much
  17238. validation of bids except to assert that they’re well-formed. Bidding without a
  17239. safety net means opening yourself up to the possibility of blowing through your
  17240. company’s entire budget in a few minutes.&lt;/p&gt;
  17241.  
  17242. &lt;p&gt;When something goes wrong at 500,000 bids a second it goes wrong &lt;em&gt;real&lt;/em&gt; fast.&lt;/p&gt;
  17243.  
  17244. &lt;p&gt;That’s where the general class of “Surprises” comes in. Insight into such a
  17245. monstrous system is hard-won and safety-nets are specific to the class of
  17246. problems you were able to anticipate or have survived. Effective monitoring
  17247. gives engineers a chance to examine the live system and discover behavior they
  17248. didn’t expect to create. Some of the unexpected behaviors are good and you
  17249. codify them, others are disasters waiting to happen and you dutifully fix them
  17250. before it comes to tears.&lt;/p&gt;
  17251.  
  17252. &lt;hr /&gt;
  17253. &lt;p&gt;&lt;img src=&quot;/images/post_images/mon_exo/mon_exo.006.jpg&quot; alt=&quot;Slide 006&quot; /&gt;&lt;/p&gt;
  17254.  
  17255. &lt;p&gt;&lt;img src=&quot;/images/post_images/mon_exo/mon_exo.007.jpg&quot; alt=&quot;Slide 007&quot; /&gt;&lt;/p&gt;
  17256.  
  17257. &lt;p&gt;Like I said, we’re using exometer and it’s very easy to get started yourself.
  17258. First, it’s important to understand exometer’s terminology about itself as it’s
  17259. a bit unlike any of the other monitoring libraries available for Erlang.&lt;/p&gt;
  17260.  
  17261. &lt;p&gt;An “entry”, as the slides say, is a “receiver and aggregator of metrics”.
  17262. Consider the histogram entry. You stream number metrics into it–maybe Erlang
  17263. messages per second–and the histogram entry will build a histogram of this
  17264. data, discarding the original numbers as needed. A “reporter” is an exometer
  17265. entity capable of taking values from entries and shipping them elsewhere, across
  17266. a network in the case of the statsd reporter or, say, to your TTY. The
  17267. “subscription” defines how often a reporter–or reporters–will take values from
  17268. which entries and ship them off.&lt;/p&gt;
  17269.  
  17270. &lt;hr /&gt;
  17271.  
  17272. &lt;p&gt;&lt;img src=&quot;/images/post_images/mon_exo/mon_exo.008.jpg&quot; alt=&quot;Slide 008&quot; /&gt;&lt;/p&gt;
  17273.  
  17274. &lt;p&gt;As of this writing, entries are created dynamically. Here we’re creating the
  17275. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[rtb, bodhi, metrics_srv, packets_in]&lt;/code&gt; metrics–which is a histogram whose
  17276. values expire after 60s–and a function entry. The function entry exometer
  17277. supplies is &lt;em&gt;brilliant&lt;/em&gt;. Let’s break it down:&lt;/p&gt;
  17278.  
  17279. &lt;ul&gt;
  17280.  &lt;li&gt;The exometer application will create an entry which calls &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;erlang:system_info/1&lt;/code&gt; and&lt;/li&gt;
  17281.  &lt;li&gt;this entry will have named ‘values’, among them &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;port_count&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;process_count&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;thread_pool_size&lt;/code&gt;.&lt;/li&gt;
  17282.  &lt;li&gt;These values will be fed, by name, into the entry function so the metric
  17283. of the entry will be the result of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;erlang:system_info(Value)&lt;/code&gt; at
  17284. inspection time.&lt;/li&gt;
  17285. &lt;/ul&gt;
  17286.  
  17287. &lt;p&gt;For instance, on one of our systems so instrumented, I find that:&lt;/p&gt;
  17288.  
  17289. &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&amp;gt; exometer:get_value([erlang, system_info], process_count).
  17290.  
  17291. 1065
  17292. &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
  17293.  
  17294. &lt;p&gt;Function entries mean that exometer natively supports application specific &lt;em&gt;and&lt;/em&gt;
  17295. VM specific metric gathering; it’s all just a matter of finding the right
  17296. functions to call. Even more brilliant, exometer supports a mini-language to
  17297. slice and dice function returns as needed because, remember, exometer entries
  17298. are number based. You’ll have to inspect the
  17299. &lt;a href=&quot;https://github.com/Feuerlabs/exometer/blob/master/src/exometer_function.erl#L48&quot;&gt;inline source documentation&lt;/a&gt;
  17300. to learn more.&lt;/p&gt;
  17301.  
  17302. &lt;hr /&gt;
  17303.  
  17304. &lt;p&gt;&lt;img src=&quot;/images/post_images/mon_exo/mon_exo.009.jpg&quot; alt=&quot;Slide 009&quot; /&gt;&lt;/p&gt;
  17305.  
  17306. &lt;p&gt;Subscriptions are statically configured in exometer’s application configuration.
  17307. Here we’re creating four subscriptions: three for the
  17308. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[rtb, bodhi, metrics_srv, packets_in]&lt;/code&gt; histogram and one for the
  17309. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[erlang, system_info]&lt;/code&gt; function entry. The values of the tuple, by index, are:&lt;/p&gt;
  17310.  
  17311. &lt;ol&gt;
  17312.  &lt;li&gt;The reporter we’re subscribing to an entry, here &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;exometer_report_statsd&lt;/code&gt;,&lt;/li&gt;
  17313.  &lt;li&gt;the “entry name” which matches that supplied in the entry’s creation,&lt;/li&gt;
  17314.  &lt;li&gt;the “entry value” of which &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;max&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;median&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mean&lt;/code&gt; are stock for
  17315. histograms while we created the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;port_count&lt;/code&gt; value for the function entry,&lt;/li&gt;
  17316.  &lt;li&gt;the interval at which the reporter will poll the entry for values, in
  17317. milliseconds, and&lt;/li&gt;
  17318.  &lt;li&gt;a “retry” flag which is sort of a workaround for races between entry and
  17319. subscription creation. (Consult the docs but set this to true.)&lt;/li&gt;
  17320. &lt;/ol&gt;
  17321.  
  17322. &lt;p&gt;What we’ve got here now are entries collecting metrics and subscriptions to get
  17323. them shipped off to other–possibly multiple, depending on how many reporters
  17324. you configure–locations. All that’s left is configuring the reporter.&lt;/p&gt;
  17325.  
  17326. &lt;hr /&gt;
  17327.  
  17328. &lt;p&gt;&lt;img src=&quot;/images/post_images/mon_exo/mon_exo.010.jpg&quot; alt=&quot;Slide 010&quot; /&gt;&lt;/p&gt;
  17329.  
  17330. &lt;p&gt;Reporter configuration is also, itself, statically set in the exometer
  17331. application configuration. This is terrifically nice in the face of exometer
  17332. application restarts, made somewhat less effective by the purely dynamic creation
  17333. of entries. (Static all the way through is under discussion.) Above we’re just
  17334. defining the network location of our friendly statsd daemon and none of that
  17335. should be terribly surprising. What &lt;em&gt;is&lt;/em&gt; worth pointing out is the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;type_map&lt;/code&gt;.
  17336. Exometer entries have no knowledge of the systems they’ll eventually be reported
  17337. to and, as such, carry no information about their ultimate coercion. The
  17338. reporter is the point of integration for this information.&lt;/p&gt;
  17339.  
  17340. &lt;p&gt;Statsd has the following native aggregation types (which exometer would call an
  17341. “entry”):&lt;/p&gt;
  17342.  
  17343. &lt;ul&gt;
  17344.  &lt;li&gt;gauges&lt;/li&gt;
  17345.  &lt;li&gt;counters&lt;/li&gt;
  17346.  &lt;li&gt;timers&lt;/li&gt;
  17347.  &lt;li&gt;histograms&lt;/li&gt;
  17348.  &lt;li&gt;meters&lt;/li&gt;
  17349. &lt;/ul&gt;
  17350.  
  17351. &lt;p&gt;You can read more &lt;a href=&quot;https://github.com/b/statsd_spec&quot;&gt;here&lt;/a&gt;. You can see in the
  17352. above that I’m shipping everything over as a ‘gauge’, including entries which
  17353. exometer is storing itself as a histogram. I’ll come back to this shortly.&lt;/p&gt;
  17354.  
  17355. &lt;hr /&gt;
  17356.  
  17357. &lt;p&gt;&lt;img src=&quot;/images/post_images/mon_exo/mon_exo.011.jpg&quot; alt=&quot;Slide 011&quot; /&gt;&lt;/p&gt;
  17358.  
  17359. &lt;p&gt;A side benefit, which you may have noticed, about exometer is that it’s
  17360. extremely configurable even to the point of totally avoiding its stock reporters
  17361. and entries. When I developed &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;exometer_report_statsd&lt;/code&gt; I did so by ensuring that
  17362. the module was loaded by the code-server ahead of exometer being started and
  17363. configured exometer to report into that module. Non-default entries can be
  17364. slipped in similarly via configuration to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;exometer_admin&lt;/code&gt; module. I haven’t
  17365. done this yet–it’s on the schedule for two sprints from now–but it &lt;em&gt;looks&lt;/em&gt;
  17366. straightforward enough.&lt;/p&gt;
  17367.  
  17368. &lt;p&gt;Exometer is configurable to the point of supporting your goofy, one-off
  17369. reporters without having to fork the library yourself. That is remarkably handy
  17370. when you need it.&lt;/p&gt;
  17371.  
  17372. &lt;p&gt;I would also be remiss if I didn’t point out that Ulf and Magnus have been
  17373. &lt;em&gt;extremely&lt;/em&gt; helpful and responsive to my needs: long email questions have
  17374. received detailed answers, non-trivial issues have been worked promptly and
  17375. they’ve been receptive to the (admittedly meager) code I’ve contributed upstream
  17376. so far. I can’t thank them enough.&lt;/p&gt;
  17377.  
  17378. &lt;hr /&gt;
  17379.  
  17380. &lt;p&gt;&lt;img src=&quot;/images/post_images/mon_exo/mon_exo.012.jpg&quot; alt=&quot;Slide 012&quot; /&gt;&lt;/p&gt;
  17381.  
  17382. &lt;p&gt;There are other monitoring libraries, better known and more venerable. Why did
  17383. we not choose to use them?&lt;/p&gt;
  17384.  
  17385. &lt;ul&gt;
  17386.  &lt;li&gt;A lack of built-in reporters, to use the exometer term, is killer. Folsom
  17387. deployments tend to carry around bespoke code to pull metrics out and route
  17388. it around, with various degrees of reliability.&lt;/li&gt;
  17389.  &lt;li&gt;A lack of multi-reporting, in addition, unless you build that.&lt;/li&gt;
  17390.  &lt;li&gt;A relative lack of flexibility in adding new entry types.&lt;/li&gt;
  17391.  &lt;li&gt;A lack of ‘function’ entries, which are super handy for dealing with VM
  17392. statistics.&lt;/li&gt;
  17393.  &lt;li&gt;A lack of configurable polling periods for those projects that do ship with
  17394. limited reporting.&lt;/li&gt;
  17395. &lt;/ul&gt;
  17396.  
  17397. &lt;p&gt;None of the other projects are totally lacking in all of the above, but each lacks
  17398. just enough to be problematic. Those that are venerable tend to be slower to
  17399. change, naturally.&lt;/p&gt;
  17400.  
  17401. &lt;hr /&gt;
  17402.  
  17403. &lt;p&gt;&lt;img src=&quot;/images/post_images/mon_exo/mon_exo.013.jpg&quot; alt=&quot;Slide 013&quot; /&gt;&lt;/p&gt;
  17404.  
  17405. &lt;p&gt;If exometer is so great–and it is, go use it–then what are the downsides?&lt;/p&gt;
  17406.  
  17407. &lt;p&gt;Remember earlier when we created a histogram entry but then configured our
  17408. reporter to pump out select values as statsd gauges? Why is this? Well, statsd
  17409. assumes that &lt;em&gt;it&lt;/em&gt; will receive the stream of numbers along with a bit of
  17410. metadata that assigns an aggregation type to it. Exometer assumes that &lt;em&gt;it&lt;/em&gt; will
  17411. have this responsibility. As there’s no way to just push the state of a
  17412. histogram into statsd wholesale we fake it by pushing key points over. The
  17413. statsd ‘gauge’ is the only statsd type that assumes it’s “calculated at the
  17414. client rather than the server”. Honestly, I’m not sure how to resolve this. I
  17415. did &lt;a href=&quot;https://github.com/Feuerlabs/exometer/issues/8&quot;&gt;make an issue&lt;/a&gt; for it.&lt;/p&gt;
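&lt;p&gt;To make “pushing key points over” concrete, here is a toy JavaScript sketch. It is not exometer code and not how exometer computes its statistics. It reduces a window of samples to max, median and mean, then formats each as a statsd gauge line of the form &lt;code&gt;name:value|g&lt;/code&gt;. The dotted metric name, the function names and the sample values are assumptions for illustration.&lt;/p&gt;

```javascript
// Summarize a non-empty window of raw samples into the few points we ship.
function summarize(samples) {
    var sorted = samples.slice().sort(function (a, b) { return a - b; });
    var sum = sorted.reduce(function (acc, x) { return acc + x; }, 0);
    var mid = Math.floor(sorted.length / 2);
    var median = sorted.length % 2 === 1
        ? sorted[mid]
        : (sorted[mid - 1] + sorted[mid]) / 2;
    return {
        max: sorted[sorted.length - 1],
        median: median,
        mean: sum / sorted.length
    };
}

// Format each summary point as a statsd gauge line: "name:value|g".
function toGaugeLines(prefix, summary) {
    return Object.keys(summary).map(function (key) {
        return prefix + "." + key + ":" + summary[key] + "|g";
    });
}

// Hypothetical metric name and sample window.
var lines = toGaugeLines(
    "rtb.bodhi.metrics_srv.packets_in",
    summarize([12, 7, 30, 9, 22])
);
console.log(lines.join("\n"));
```

&lt;p&gt;The server only ever sees three precomputed numbers per interval, which is why the gauge type is the one that fits: statsd has nothing left to aggregate.&lt;/p&gt;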
  17416.  
  17417. &lt;p&gt;As of this writing, there’s a serious space-leak in the histogram entry. This is
  17418. being &lt;a href=&quot;https://github.com/Feuerlabs/exometer/issues/7&quot;&gt;actively worked on&lt;/a&gt;.
  17419. (Update: looks like this is resolved. Fun!)&lt;/p&gt;
  17420.  
  17421. &lt;p&gt;I pointed out above that both subscriptions and reporters can be statically
  17422. configured. Entries, however, cannot be. There’s an
  17423. &lt;a href=&quot;https://github.com/Feuerlabs/exometer/issues/5&quot;&gt;issue for that&lt;/a&gt;.&lt;/p&gt;
  17424.  
  17425. &lt;p&gt;There are some reports that exometer won’t compile under R16B03 due to an
  17426. erl_syntax bug, but I believe this
  17427. &lt;a href=&quot;https://github.com/Feuerlabs/exometer/commit/ec18c9ae2d7733c18971cf5e3bd57ed62b1432e4&quot;&gt;has been resolved&lt;/a&gt;.&lt;/p&gt;
  17428.  
  17429. &lt;hr /&gt;
  17430.  
  17431. &lt;p&gt;&lt;img src=&quot;/images/post_images/mon_exo/mon_exo.014.jpg&quot; alt=&quot;Slide 014&quot; /&gt;&lt;/p&gt;
  17432.  
  17433. &lt;hr /&gt;
  17434.  
  17435. &lt;p&gt;&lt;img src=&quot;/images/post_images/mon_exo/mon_exo.015.jpg&quot; alt=&quot;Slide 015&quot; /&gt;&lt;/p&gt;
  17436.  
  17437. &lt;hr /&gt;
  17438.  
  17439. &lt;p&gt;&lt;img src=&quot;/images/post_images/mon_exo/mon_exo.016.jpg&quot; alt=&quot;Slide 016&quot; /&gt;&lt;/p&gt;
  17440.  
  17441. &lt;hr /&gt;
  17442.  
  17443. &lt;p&gt;&lt;img src=&quot;/images/post_images/mon_exo/mon_exo.017.jpg&quot; alt=&quot;Slide 017&quot; /&gt;&lt;/p&gt;
  17444.  
  17445. &lt;p&gt;&lt;a href=&quot;http://www.erlang-factory.com/conference/show/conference-6/home/#brian-troutwine&quot;&gt;
  17446.  &lt;img src=&quot;http://www.erlang-factory.com/static/upload/media/1390389464268463300x250speaker.png&quot; alt=&quot;speaker badge&quot; /&gt;
  17447. &lt;/a&gt;&lt;/p&gt;
  17448. </description>
  17449.    </item>
  17450.    
  17451.    
  17452.    
  17453.    <item>
  17454.      <title>Lazy loading Backbone collections with Promises</title>
  17455.      <link>https://tech.nextroll.com/blog/web/2013/11/12/lazyloading-backbone-collection-with-promises.html</link>
  17456.      <pubDate>Tue, 12 Nov 2013 00:00:00 -0800</pubDate>
  17457.      <author></author>
  17458.      <guid isPermaLink="false">https://tech.nextroll.com/blog/web/2013/11/12/lazyloading-backbone-collection-with-promises</guid>
  17459.      <description>&lt;p&gt;Until recently, when our customers created a campaign, they could choose to target the whole world, the U.S. or create a list of specific countries and U.S. metros (e.g. the San Francisco Bay Area, the Atlanta metro, etc.).&lt;/p&gt;
  17460.  
  17461. &lt;p&gt;Last week, we added the ability to target &lt;em&gt;or&lt;/em&gt; exclude countries, metros, regions (e.g. states in the U.S., provinces in Canada,…), cities and postal codes. That meant doing a significant refactor of our geolocation code, which led me to (finally) use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Deferred&lt;/code&gt; objects.&lt;/p&gt;
  17462.  
  17463. &lt;p&gt;We have under 200 countries and 250 metros in our database, so it was not a problem to bootstrap the whole list in our Backbone app on page load. However, regions are in the thousands and cities in the hundreds of thousands. We couldn’t just simply load everything anymore: we turned to lazy loading.&lt;/p&gt;
  17464.  
  17465. &lt;hr /&gt;
  17466.  
  17467. &lt;p&gt;With perfect timing, &lt;a href=&quot;https://github.com/jashkenas&quot;&gt;Jeremy Ashkenas&lt;/a&gt;’ keynote presentation for &lt;a href=&quot;http://backboneconf.com/&quot;&gt;BackboneConf&lt;/a&gt; was put online. He talks about a few Backbone patterns that he came across, including one to lazily load models in a collection. Here is the segment: (duration: ~1’30”)&lt;/p&gt;
  17468.  
  17469. &lt;div class=&quot;youtube-wrapper&quot;&gt;&lt;iframe class=&quot;youtube-embed&quot; width=&quot;574&quot; height=&quot;323&quot; src=&quot;http://www.youtube.com/embed/P0YIdsJqKV4?start=1742&amp;amp;end=1844&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;&lt;/div&gt;
  17470.  
  17471. &lt;p&gt;The idea is that the collection may or may not have loaded the model you’re interested in at this moment. If it’s loaded, the collection can give it to you right away. But if it’s not, it needs to get it &lt;em&gt;asynchronously&lt;/em&gt; from the server. One way would be to have your view manage this:&lt;/p&gt;
  17472.  
  17473. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-javascript&quot; data-lang=&quot;javascript&quot;&gt;&lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;user&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;users&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;kd&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  17474. &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;user&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  17475.    &lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;renderUser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;user&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  17476. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  17477.    &lt;span class=&quot;nx&quot;&gt;user&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;User&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;
  17478.    &lt;span class=&quot;nx&quot;&gt;user&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;fetch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;
  17479.        &lt;span class=&quot;na&quot;&gt;success&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;user&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;response&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  17480.            &lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;renderUser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;user&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  17481.        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  17482.    &lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;
  17483. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  17484.  
  17485. &lt;p&gt;This works, but it’s hard to read, it’s messy and it doesn’t separate concerns well: the view shouldn’t need to check if a model is loaded or not. On top of that, you would need the same thing in any views that want to get a model.&lt;/p&gt;
  17486.  
  17487. &lt;hr /&gt;
  17488.  
  17489. &lt;p&gt;The solution of course is to move this logic into the collection itself, and Jeremy discusses using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Deferred&lt;/code&gt; objects instead of passing a callback to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lookup&lt;/code&gt; method.  Here is the code from his slides:&lt;/p&gt;
  17490.  
  17491. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-javascript&quot; data-lang=&quot;javascript&quot;&gt;&lt;span class=&quot;c1&quot;&gt;// View code&lt;/span&gt;
  17492. &lt;span class=&quot;nx&quot;&gt;users&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;lookup&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;then&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kd&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  17493.    &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
  17494. &lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;
  17495.  
  17496. &lt;span class=&quot;c1&quot;&gt;// Collection code&lt;/span&gt;
  17497. &lt;span class=&quot;nl&quot;&gt;lookup&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  17498.    &lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;model&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  17499.    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;model&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;kd&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  17500.        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Deferred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;resolveWith&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;model&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  17501.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  17502.        &lt;span class=&quot;nx&quot;&gt;model&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;Model&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;
  17503.        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;model&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;fetch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  17504.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  17505. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  17506.  
  17507. &lt;p&gt;The interesting part is in this line: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;return $.Deferred().resolveWith(this, model);&lt;/code&gt;. That is, even if we have the model loaded locally, the collection returns a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Deferred&lt;/code&gt; object, but it resolves right away with the local model. This way, the interface is the same in both cases and the view doesn’t need to worry about this at all. Very neat solution.&lt;/p&gt;
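&lt;p&gt;The same idea can be sketched without jQuery or Backbone. The version below is only an illustration that uses native &lt;code&gt;Promise&lt;/code&gt; objects in place of &lt;code&gt;$.Deferred&lt;/code&gt;; the &lt;code&gt;cache&lt;/code&gt; object, &lt;code&gt;fetchFromServer&lt;/code&gt; and the sample data are hypothetical stand-ins, not part of Jeremy’s slides or our code.&lt;/p&gt;

```javascript
// Hypothetical in-memory cache of already-loaded models.
var cache = { 1: { id: 1, name: "Ada" } };

// Stand-in for the server round trip: resolves asynchronously with a model.
function fetchFromServer(id) {
    return new Promise(function (resolve) {
        setTimeout(function () {
            resolve({ id: id, name: "user-" + id });
        }, 10);
    });
}

// Always returns a promise, whether or not the model is already loaded.
function lookup(id) {
    if (cache[id]) {
        // Already loaded: hand back a promise that is resolved right away.
        return Promise.resolve(cache[id]);
    }
    return fetchFromServer(id).then(function (model) {
        cache[id] = model; // remember it for the next lookup
        return model;
    });
}

// The caller uses the same code path in both cases.
lookup(1).then(function (user) { console.log("cached:", user.name); });
lookup(2).then(function (user) { console.log("fetched:", user.name); });
```

&lt;p&gt;Because both branches return a promise that resolves with a model, the calling code never needs to know which branch ran.&lt;/p&gt;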
  17508.  
  17509. &lt;hr /&gt;
  17510.  
  17511. &lt;p&gt;However, the code above doesn’t work as-is, so I wanted to write down how I got it to work. (Admittedly, I might have missed something, or Jeremy may have meant the slide as an illustration, not to be taken literally.)&lt;/p&gt;
  17512.  
  17513. &lt;p&gt;The issue is that in one case the method returns a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Deferred&lt;/code&gt; object resolved with a Backbone model, while in the second case we return what &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;model.fetch()&lt;/code&gt; returns, which is a &lt;a href=&quot;http://api.jquery.com/jQuery.ajax/#jqXHR&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;jqXHR&lt;/code&gt;&lt;/a&gt;. For our purposes, we just need to know that a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;jqXHR&lt;/code&gt; is an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;XMLHttpRequest&lt;/code&gt; that implements the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Promise&lt;/code&gt; interface. When that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;jqXHR&lt;/code&gt; finally resolves (when the AJAX request finishes), the function passed to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;then()&lt;/code&gt; gets &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;data, textStatus, jqXHR&lt;/code&gt; as arguments.&lt;/p&gt;
  17514.  
  17515. &lt;p&gt;What we want to return however is a Backbone model to be consistent with the first case. After some slight refactoring, here is what I have:&lt;/p&gt;
  17516.  
  17517. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-javascript&quot; data-lang=&quot;javascript&quot;&gt;&lt;span class=&quot;nx&quot;&gt;lookup&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  17518.    &lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;model&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  17519.    &lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;deferred&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Deferred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  17520.  
  17521.    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;model&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;kd&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  17522.        &lt;span class=&quot;nx&quot;&gt;deferred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;resolveWith&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;model&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]);&lt;/span&gt;
  17523.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  17524.        &lt;span class=&quot;nx&quot;&gt;model&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;Model&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;
  17525.        &lt;span class=&quot;nx&quot;&gt;model&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;fetch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;
  17526.            &lt;span class=&quot;na&quot;&gt;success&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;model&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;response&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  17527.                &lt;span class=&quot;nx&quot;&gt;deferred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;resolveWith&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;model&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]);&lt;/span&gt;
  17528.            &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  17529.        &lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;
  17530.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  17531.  
  17532.    &lt;span class=&quot;c1&quot;&gt;// Returning a Promise so that only this function can modify&lt;/span&gt;
  17533.    &lt;span class=&quot;c1&quot;&gt;// the Deferred object&lt;/span&gt;
  17534.    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;deferred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;promise&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  17535. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  17536.  
  17537. &lt;p&gt;One of the things to note is that I don’t return a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Deferred&lt;/code&gt; object anymore but a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Promise&lt;/code&gt;. The difference is that the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Promise&lt;/code&gt; lets you add callbacks to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Deferred&lt;/code&gt; but doesn’t let you resolve it. It’s best practice in most cases to let whatever code created the object resolve it as well.&lt;/p&gt;
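&lt;p&gt;To make the distinction concrete, here is a minimal sketch that runs without jQuery. It uses the native &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Promise&lt;/code&gt; as a stand-in for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;$.Deferred()&lt;/code&gt;; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;makeDeferred&lt;/code&gt;, the plain-object store, and the sample data are hypothetical names made up for this illustration, not part of jQuery or Backbone.&lt;/p&gt;

```javascript
// Hypothetical stand-in for $.Deferred(), built on the native Promise.
function makeDeferred() {
    var deferred = {};
    var promiseObject = new Promise(function (resolve, reject) {
        // The executor runs synchronously, so these are set right away.
        deferred.resolve = resolve;
        deferred.reject = reject;
    });
    // Mirrors the role of deferred.promise(): hands out the read-only side.
    deferred.promise = function () {
        return promiseObject;
    };
    return deferred;
}

// Hypothetical lookup over a plain object used as the local store.
function lookup(localModels, id) {
    var deferred = makeDeferred();
    if (localModels[id]) {
        deferred.resolve(localModels[id]);
    }
    // An unknown id stays pending here; a real version would fetch it.
    return deferred.promise();
}

var result = lookup({ 7: { id: 7, name: 'Paris' } }, 7);
result.then(function (model) {
    console.log(model.name);
});
console.log(typeof result.then);    // callers can attach callbacks
console.log(typeof result.resolve); // but they get no resolve method
```

&lt;p&gt;Only the code inside &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lookup&lt;/code&gt; holds the object that can settle the promise, which is the same separation that returning &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;deferred.promise()&lt;/code&gt; gives you in jQuery.&lt;/p&gt;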
  17538.  
  17539. &lt;p&gt;I like this approach a lot. It’s elegant and easy to read and adapts well to different cases.&lt;/p&gt;
  17540.  
  17541. &lt;hr /&gt;
  17542.  
  17543. &lt;p&gt;For example, in my geolocation case, I don’t want only one model at a time, but instead I want to do auto-completion and get a list of geolocations that match a specific string. In some cases, the geolocation type will be completely pre-loaded (for countries and metros); in others, it will be fetched on-demand.&lt;/p&gt;
  17544.  
  17545. &lt;p&gt;The difference with the code above is that the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;success&lt;/code&gt; callback option for &lt;a href=&quot;http://backbonejs.org/#Collection-fetch&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Backbone.Collection.fetch&lt;/code&gt;&lt;/a&gt; doesn’t return the list of models that were fetched. Instead the callback arguments are &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;collection, response, options&lt;/code&gt;. So I need a bit more code.&lt;/p&gt;
  17546.  
  17547. &lt;p&gt;Here is a modified version for when you want to call &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;collection.fetch&lt;/code&gt;:&lt;/p&gt;
  17548.  
  17549. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-javascript&quot; data-lang=&quot;javascript&quot;&gt;&lt;span class=&quot;c1&quot;&gt;// `type`: type of geolocation&lt;/span&gt;
  17550. &lt;span class=&quot;c1&quot;&gt;// `query`: query string you want to match your models against&lt;/span&gt;
  17551. &lt;span class=&quot;c1&quot;&gt;// `this.prefetched`: object keyed by type, truthy for types that are fully prefetched (set in initialize())&lt;/span&gt;
  17552. &lt;span class=&quot;nx&quot;&gt;search&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;query&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  17553.    &lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;matchingModels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  17554.    &lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;deferred&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Deferred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  17555.  
  17556.    &lt;span class=&quot;c1&quot;&gt;// Here we check if the requested type is fully prefetched&lt;/span&gt;
  17557.    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;prefetched&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  17558.        &lt;span class=&quot;c1&quot;&gt;// match() would be implemented however you need it to&lt;/span&gt;
  17559.        &lt;span class=&quot;nx&quot;&gt;matchingModels&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;match&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;query&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  17560.        &lt;span class=&quot;nx&quot;&gt;deferred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;resolveWith&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;matchingModels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]);&lt;/span&gt;
  17561.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  17562.        &lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;queryData&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  17563.            &lt;span class=&quot;na&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  17564.            &lt;span class=&quot;na&quot;&gt;query&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;query&lt;/span&gt;
  17565.        &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  17566.        &lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;fetch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;
  17567.            &lt;span class=&quot;na&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;queryData&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  17568.            &lt;span class=&quot;na&quot;&gt;success&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;collection&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;response&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  17569.                &lt;span class=&quot;c1&quot;&gt;// At this point, the new models will have been added to the&lt;/span&gt;
  17570.                &lt;span class=&quot;c1&quot;&gt;// collection&lt;/span&gt;
  17571.                &lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;matchingModels&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;collection&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;match&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;query&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  17572.                &lt;span class=&quot;nx&quot;&gt;deferred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;resolveWith&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;matchingModels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]);&lt;/span&gt;
  17573.            &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  17574.        &lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;
  17575.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  17576.  
  17577.    &lt;span class=&quot;c1&quot;&gt;// Returning a Promise so that only this function can modify&lt;/span&gt;
  17578.    &lt;span class=&quot;c1&quot;&gt;// the Deferred object&lt;/span&gt;
  17579.    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;deferred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;promise&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  17580. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  17581.  
  17582. &lt;p&gt;I can now create my collection with some fully bootstrapped geolocation types (countries and metros) and some that need to be fetched on-demand and the views don’t need to know which is which.&lt;/p&gt;
  17583.  
  17584. &lt;p&gt;One drawback of having a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;match&lt;/code&gt; method in the collection is that I’m duplicating some server logic on the front end, and the two could clash. For example, if my server sends back all cities whose name &lt;em&gt;starts&lt;/em&gt; with the query string while the collection matches a model whenever the query string is &lt;em&gt;in&lt;/em&gt; the name, the displayed results won’t be consistent and may vary based on the order in which I make various queries.&lt;/p&gt;
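&lt;p&gt;A small sketch of how the two rules can drift apart. The city names and both matching functions are made up for the illustration; they only stand in for whatever the server and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;match&lt;/code&gt; actually do.&lt;/p&gt;

```javascript
// Hypothetical city names; real data would come from the collection.
var cities = ['San Diego', 'San Jose', 'Santa Fe', 'Thousand Oaks'];

// Assumed server-side rule: the name starts with the query string.
function startsWithMatch(names, query) {
    var q = query.toLowerCase();
    return names.filter(function (name) {
        return name.toLowerCase().indexOf(q) === 0;
    });
}

// Assumed client-side rule: the query string appears anywhere in the name.
function containsMatch(names, query) {
    var q = query.toLowerCase();
    return names.filter(function (name) {
        return name.toLowerCase().indexOf(q) !== -1;
    });
}

var fromServer = startsWithMatch(cities, 'san');
var fromClient = containsMatch(cities, 'san');
console.log(fromServer);
console.log(fromClient);
```

&lt;p&gt;Here the contains rule also returns Thousand Oaks, which the starts-with rule never would. Whether that city shows up for the query therefore depends on whether an earlier, unrelated request happened to load it into the collection.&lt;/p&gt;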
  17585.  
  17586. &lt;p&gt;I haven’t come up with a good solution to that. I could cache the list of IDs sent back by the server for each query string, but a different problem appears for the prefetched types: I would also need to create a list of IDs for every substring of the prefetched models.&lt;/p&gt;
  17587.  
  17588. &lt;p&gt;For now, since I’ll be keeping a simple matching logic, duplicating the logic will do.&lt;/p&gt;
  17589.  
  17590. &lt;hr /&gt;
  17591.  
  17592. &lt;p&gt;Finally, I need to add a cache to my collection to keep track of the models I have already fetched from the server, because with the code above you could be making the same request multiple times.&lt;/p&gt;
  17593.  
  17594. &lt;p&gt;When &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;search&lt;/code&gt; is called, I just need to check if the requested type is fully prefetched. If it isn’t, I need to check if I have already queried that string for that type:&lt;/p&gt;
  17595.  
  17596. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-javascript&quot; data-lang=&quot;javascript&quot;&gt;&lt;span class=&quot;c1&quot;&gt;// `type`: type of geolocation&lt;/span&gt;
  17597. &lt;span class=&quot;c1&quot;&gt;// `query`: query string you want to match your models against&lt;/span&gt;
  17598. &lt;span class=&quot;c1&quot;&gt;// `this.prefetched`: object keyed by type, truthy for types that are fully prefetched (set in initialize())&lt;/span&gt;
  17599. &lt;span class=&quot;c1&quot;&gt;// `this.requestCache`: object that keeps a cache of the strings that have already&lt;/span&gt;
  17600. &lt;span class=&quot;c1&quot;&gt;//      been queried from the server for each geo type&lt;/span&gt;
  17601. &lt;span class=&quot;nx&quot;&gt;search&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;query&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;limit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  17602.    &lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;matchingModels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  17603.    &lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;cache&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;requestCache&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;];&lt;/span&gt;
  17604.    &lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;deferred&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Deferred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  17605.  
  17606.    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;prefetched&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;||&lt;/span&gt;
  17607.        &lt;span class=&quot;nx&quot;&gt;cache&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;cache&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;query&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;cache&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;query&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;limit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  17608.        &lt;span class=&quot;c1&quot;&gt;// match() would be implemented however you need it to&lt;/span&gt;
  17609.        &lt;span class=&quot;nx&quot;&gt;matchingModels&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;match&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;query&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;limit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  17610.        &lt;span class=&quot;nx&quot;&gt;deferred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;resolveWith&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;matchingModels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]);&lt;/span&gt;
  17611.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  17612.        &lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;queryData&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  17613.            &lt;span class=&quot;na&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  17614.            &lt;span class=&quot;na&quot;&gt;query&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;query&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  17615.            &lt;span class=&quot;na&quot;&gt;limit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;limit&lt;/span&gt;
  17616.        &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  17617.  
  17618.        &lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;requestCache&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;requestCache&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{};&lt;/span&gt;
  17619.        &lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;requestCache&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;query&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;limit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  17620.  
  17621.        &lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;fetch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;
  17622.            &lt;span class=&quot;na&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;queryData&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  17623.            &lt;span class=&quot;na&quot;&gt;success&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;collection&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;response&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  17624.                &lt;span class=&quot;c1&quot;&gt;// At this point, the new models will have been added to the&lt;/span&gt;
  17625.                &lt;span class=&quot;c1&quot;&gt;// collection&lt;/span&gt;
  17626.                &lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;matchingModels&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;collection&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;match&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;query&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;limit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  17627.                &lt;span class=&quot;nx&quot;&gt;deferred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;resolveWith&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;matchingModels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]);&lt;/span&gt;
  17628.            &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  17629.        &lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;
  17630.    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  17631.  
  17632.    &lt;span class=&quot;c1&quot;&gt;// Returning a Promise so that only this function can modify&lt;/span&gt;
  17633.    &lt;span class=&quot;c1&quot;&gt;// the Deferred object&lt;/span&gt;
  17634.    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;deferred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;promise&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  17635. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  17636.  
  17637. &lt;p&gt;I also added a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;limit&lt;/code&gt; argument to be able to request a certain number of objects. I save that value in a dictionary keyed on the query string, so that I know how many of the already-cached objects will match this query for that geolocation type. For example, I could have some parts of the application that need only 20 items and some where I want to show 100 items. If I already searched for 100 items for a specific query string, I don’t need to make a new query for the part of the app where I only need 20 items. This saves some additional roundtrips to the server and makes the application feel snappier.&lt;/p&gt;
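&lt;p&gt;The cache decision on its own can be sketched as two small functions. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;needsFetch&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;recordFetch&lt;/code&gt; are hypothetical helpers written for this illustration, and they take the two lookup objects as plain arguments instead of reading them from the collection.&lt;/p&gt;

```javascript
// `prefetched` maps a type to true when that type is fully loaded.
// `requestCache` maps a type, then a query string, to the largest limit
// already requested from the server.

// Returns true when a server request is still needed.
function needsFetch(prefetched, requestCache, type, query, limit) {
    if (prefetched[type]) {
        return false;
    }
    var cache = requestCache[type] || {};
    var cachedLimit = cache[query] || 0;
    // A cached request for 100 items already covers a request for 20.
    return limit > cachedLimit;
}

// Remembers that `limit` items were requested for this type and query.
function recordFetch(requestCache, type, query, limit) {
    requestCache[type] = requestCache[type] || {};
    var previous = requestCache[type][query] || 0;
    requestCache[type][query] = Math.max(previous, limit);
}

var prefetched = { country: true };
var requestCache = {};
recordFetch(requestCache, 'city', 'san', 100);

console.log(needsFetch(prefetched, requestCache, 'city', 'san', 20));    // false
console.log(needsFetch(prefetched, requestCache, 'city', 'san', 200));   // true
console.log(needsFetch(prefetched, requestCache, 'country', 'fr', 50));  // false
```

&lt;p&gt;Keeping the decision in a function with no side effects makes it easy to check the 20-versus-100 case without touching the network.&lt;/p&gt;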
  17638.  
  17639. &lt;hr /&gt;
  17640.  
  17641. &lt;p&gt;I could still improve this further by delaying the call to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;fetch&lt;/code&gt; so that I can gather up queries on multiple types at once. So if I were to call:&lt;/p&gt;
  17642.  
  17643. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-javascript&quot; data-lang=&quot;javascript&quot;&gt;&lt;span class=&quot;nx&quot;&gt;collection&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;search&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;city&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;san&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;then&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;printGeos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  17644. &lt;span class=&quot;nx&quot;&gt;collection&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;search&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;region&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;san&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;then&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;printGeos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  17645.  
  17646. &lt;p&gt;only one server call could be made to get results for both types. But I’ll leave that as an exercise for now.&lt;/p&gt;
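&lt;p&gt;One possible shape for that exercise, as a rough sketch only. It uses plain callbacks instead of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Deferred&lt;/code&gt; objects to stay short, and an explicit &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;flush&lt;/code&gt; call where a real collection would more likely schedule the flush with a short timer. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;createBatcher&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sendBatch&lt;/code&gt;, and the response format keyed by type are all assumptions, not an existing API.&lt;/p&gt;

```javascript
// Hypothetical batcher: collects searches, then sends them in one request.
// `sendBatch(queries, done)` stands in for the single server call and is
// expected to call done(resultsByType).
function createBatcher(sendBatch) {
    var pending = [];
    return {
        search: function (type, query, callback) {
            pending.push({ type: type, query: query, callback: callback });
        },
        flush: function () {
            var batch = pending;
            pending = [];
            if (batch.length === 0) {
                return;
            }
            var queries = batch.map(function (item) {
                return { type: item.type, query: item.query };
            });
            sendBatch(queries, function (resultsByType) {
                // Hand each caller only the results for its own type.
                batch.forEach(function (item) {
                    item.callback(resultsByType[item.type] || []);
                });
            });
        }
    };
}

// Fake server call that counts how often it is used.
var serverCalls = 0;
var received = {};
var batcher = createBatcher(function (queries, done) {
    serverCalls += 1;
    done({ city: ['San Jose'], region: ['San Luis Potosi'] });
});

batcher.search('city', 'san', function (geos) { received.city = geos; });
batcher.search('region', 'san', function (geos) { received.region = geos; });
batcher.flush();
console.log(serverCalls, received);
```

&lt;p&gt;Both searches are answered from the same fake server call, so the counter ends at one.&lt;/p&gt;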
  17647.  
  17648. &lt;p&gt;Another area of improvement would be to add ways to manage the cache and the size of the collection. It’s probably a good idea to be able to clear out the cache over time and to make sure the collection doesn’t grow too big. But this, too, will have to wait.&lt;/p&gt;
  17649.  
  17650. &lt;hr /&gt;
  17651.  
  17652. &lt;p&gt;To conclude, I now have a lazy-loaded collection that I can partially prefetch as needed, and I limit the number of server queries by caching the query terms. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Deferred&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Promise&lt;/code&gt; objects let me present a single, elegant interface to the views.&lt;/p&gt;
  17653. </description>
  17654.    </item>
  17655.    
  17656.    
  17657.    
  17658.    <item>
  17659.      <title>Why we joined AdRoll Engineering</title>
  17660.      <link>https://tech.nextroll.com/blog/adroll/2013/10/18/why-we-joined-adroll.html</link>
  17661.      <pubDate>Fri, 18 Oct 2013 00:00:00 -0700</pubDate>
  17662.      <author></author>
  17663.      <guid isPermaLink="false">https://tech.nextroll.com/blog/adroll/2013/10/18/why-we-joined-adroll</guid>
  17664.      <description>&lt;p&gt;AdRoll is going gangbusters when it comes to hiring engineers. We’re adding headcount in a very competitive market on an ongoing basis, and we continue to differentiate ourselves with the depth and complexity of our engineering problems. We take a very collaborative approach to engineering but give individuals a high degree of project ownership. While we bounce ideas off one another and learn together, each engineer is tackling large problems directly tied to the company’s bottom line.&lt;/p&gt;
  17665.  
  17666. &lt;p&gt;Below you’ll meet a few of our college hires and learn why they chose to join AdRoll Engineering. Wesley Chow joins our data engineering team as a recent graduate of UC Berkeley EECS. Marianne “Mars” Jullian joins us from Princeton and is currently working on the front-end team.&lt;/p&gt;
  17667.  
  17668. &lt;p&gt;&lt;em&gt;Intro written by &lt;a href=&quot;mailto:jesse@adroll.com&quot;&gt;Jesse Lauro&lt;/a&gt;, an engineer with AdRoll’s Real-time Bidding team.&lt;/em&gt;&lt;/p&gt;
  17669.  
  17670. &lt;hr /&gt;
  17671.  
  17672. &lt;h1 id=&quot;wesley-chow&quot;&gt;&lt;a href=&quot;mailto:wesley.chow@adroll.com&quot;&gt;Wesley Chow&lt;/a&gt;&lt;/h1&gt;
  17673.  
  17674. &lt;p&gt;The CS job market for new graduates is very hot right now. While I don’t have any numbers to back this up, I would guess that many companies have trouble finding enough candidates to fill their open positions. Knowing this, many college seniors are picky (some would say spoiled) in choosing where they will work after graduation.  Keeping this in mind, I wanted to share why I chose specifically to work at AdRoll over my other offers.&lt;/p&gt;
  17675.  
  17676. &lt;p&gt;I had two major criteria during my job hunt. The first was that I really wanted a chance to grow into a large role within the company. My hope was that this would involve building core projects right from the start. Ideally I would have the opportunity to work on projects that would have a measurable effect on the success of the company. Clearly this is much easier to accomplish at a startup than at a more established company. With only twelve engineers at the time I interviewed (AdRoll engineering has since grown to about twenty), I felt that AdRoll fit the bill perfectly. It also helped that AdRoll had a successful and well-defined business model, which contributed to an overall sense of job security.&lt;/p&gt;
  17677.  
  17678. &lt;p&gt;My second criterion was to continue to pursue my specific interests within the field. During my last year at school, I had developed an interest in working with distributed systems. At the risk of being cliche, I wanted to work with “Big Data”. With around 500TB of data stored in Amazon S3 and approximately 10TB of data being handled per day, AdRoll certainly qualified.&lt;/p&gt;
  17679.  
  17680. &lt;p&gt;Two months in, I’ve already had the chance to touch the typical tools for storing and handling large amounts of data (Hadoop, HBase, S3, and DynamoDB), in addition to getting to play with some more cutting-edge real-time processing tools such as &lt;a href=&quot;http://storm-project.net/&quot;&gt;Storm&lt;/a&gt; and &lt;a href=&quot;http://kafka.apache.org/&quot;&gt;Kafka&lt;/a&gt;. My next project promises to be a system I will get to build from the ground up that involves AdRoll’s push to go global. How’s that for an exciting challenge to a new engineer? I’m ecstatic. AdRoll has already proven to be the right choice, with a lot of long-term potential.&lt;/p&gt;
  17681.  
  17682. &lt;hr /&gt;
  17683.  
  17684. &lt;h1 id=&quot;mars-jullian&quot;&gt;&lt;a href=&quot;mailto:mars.jullian@adroll.com&quot;&gt;Mars Jullian&lt;/a&gt;&lt;/h1&gt;
  17685.  
  17686. &lt;p&gt;When I first started looking for jobs in October of last year I was a little nervous about how the process would go, but by the end of it I had received four comparable offers all located in the Bay Area. Everyone’s work and mission interested me; so why did I choose AdRoll over the other three companies? Between the four offers, it eventually came down to the culture of each company and opportunities for learning.&lt;/p&gt;
  17687.  
  17688. &lt;p&gt;From a previous summer internship with a startup, I already knew I wanted to work in a small company. In smaller startups, fewer people tackle problems of the same magnitude that larger companies face. I found that there were more opportunities to learn and a greater chance of making an impact. On the other hand, I was also interested in joining a larger company with some social support, as I was facing a move not only to a new city but also to a completely different coast.&lt;/p&gt;
  17689.  
  17690. &lt;p&gt;AdRoll was able to offer me both: a small team where I would have an opportunity to learn and grow as a developer, being one of only a handful of engineers working on big problems, and also the support of a larger company. AdRoll has about 250 employees spread across three offices in SF, NY, and Dublin, with an engineering team of only about twenty people. The team I work on, the front-end engineering team, is even smaller than that: only four people (including me). It was the startup feel I was looking for in the context of a larger, more social company.&lt;/p&gt;
  17691.  
  17692. &lt;p&gt;Now that I am here at AdRoll, six weeks in, I am so glad I made the decision that I did. Not only am I glad to be working on a small team that is part of a larger company, but the people are great, and no one could have conveyed just how interesting the engineering team’s problems are until I arrived. It’s going to be quite an adventure working here…&lt;/p&gt;
  17693. </description>
  17694.    </item>
  17695.    
  17696.    
  17697.    
  17698.    <item>
  17699.      <title>Multi-Region DynamoDB</title>
  17700.      <link>https://tech.nextroll.com/blog/ops/2013/10/02/dynamodb-replication.html</link>
  17701.      <pubDate>Wed, 02 Oct 2013 00:00:00 -0700</pubDate>
  17702.      <author></author>
  17703.      <guid isPermaLink="false">https://tech.nextroll.com/blog/ops/2013/10/02/dynamodb-replication</guid>
  17704.      <description>&lt;p&gt;In rolling out our real time bidding infrastructure, we were faced with
  17705. the task of syncing data for every user we could possibly target across
  17706. four regions.  We have on the order of hundreds of millions of users,
  17707. and tens of thousands of writes per second.  Not only do we have to deal
  17708. with the daunting task of writing this data out in real time, the
  17709. bidding system has a hard cap of 100ms for every bid request, so we need
  17710. strong guarantees on read performance.&lt;/p&gt;
  17711.  
  17712. &lt;p&gt;&lt;img src=&quot;/images/post_images/dynamodb-replication-cloudwatch.png&quot; alt=&quot;Cloudwatch&quot; /&gt;&lt;/p&gt;
  17713.  
  17714. &lt;p&gt;With DynamoDB we are able to have consistent, fast performance without
  17715. having to worry about scaling out our own database infrastructure.
  17716. However, DynamoDB does not support any concept of replication so we
  17717. needed to build a system to write data to multiple data centers.  Most
  17718. replication solutions generate a log of every write, and then send that
  17719. log to all slaves.  In our case, our application servers generate a list
  17720. of writes that act as our source of truth, and then we can treat each
  17721. region’s DynamoDB table as a slave reading from those logs.&lt;/p&gt;
  17722.  
  17723. &lt;p&gt;The data we are storing is a mapping of user ids to segments, along with
  17724. a timestamp.  We can then look up which ad to serve from the list of
  17725. valid segments.&lt;/p&gt;
  17726.  
  17727. &lt;p&gt;&lt;img src=&quot;/images/post_images/dynamodb-replication-userlist.png&quot; alt=&quot;User Lists&quot; /&gt;&lt;/p&gt;
  17728.  
  17729. &lt;p&gt;The first replication solution was implemented using map reduce.
  17730. Periodically, a job would gather all the (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;userID&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;segmentID&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;timestamp&lt;/code&gt;)
  17731. tuples from logfiles, with an output format that wrote to every region
  17732. that needed this data.  As we scaled up to more and more users, this job
  17733. would take many hours to run.  Also, due to the nature of batch jobs,
  17734. there would be long idle periods, followed by large spikes in writes to
  17735. DynamoDB.  Since billing was based on throughput, we were interested in
  17736. a much more consistent write load.  We began searching for solutions to
  17737. speed this system up.&lt;/p&gt;
  17738.  
  17739. &lt;p&gt;We have been moving away from batch jobs towards more realtime
  17740. processing systems using &lt;a href=&quot;http://storm-project.net&quot;&gt;Storm&lt;/a&gt;.  The process for converting the batch
  17741. job to Storm was very straightforward.  We already had a system for
  17742. ingesting our data into Storm; we just had to translate the data into
  17743. partial updates to update the time for a particular segment.&lt;/p&gt;
  17744.  
  17745. &lt;p&gt;This scheme ran into some performance issues.  In the map reduce job, we
  17746. were able to combine all the updates per user over a period of a couple
  17747. of hours into a single update.  However, with Storm we were issuing them
  17748. one at a time, and suffering from the large overhead of each HTTP
  17749. request.  Since most of the time we were idle waiting for requests, the
  17750. first performance enhancement was to switch to the asynchronous Java
  17751. API.  We were then able to issue more requests in parallel, but we still
  17752. were not able to get the throughput we expected.&lt;/p&gt;
  17753.  
  17754. &lt;p&gt;We really wanted to use batching, but the API does not support batch
  17755. updates, only batch puts.  This required a slight change to our schema.&lt;/p&gt;
  17756.  
  17757. &lt;p&gt;&lt;img src=&quot;/images/post_images/dynamodb-replication-userprofile.png&quot; alt=&quot;User Profiles&quot; /&gt;&lt;/p&gt;
  17758.  
  17759. &lt;p&gt;By using range keys each row is independent of the others.  Updates can
  17760. be done using the batch API, which lets us send 25 updates at a time,
  17761. effectively multiplying the throughput by that amount.  Reads then just
  17762. have to query by user id with no range key, which will fetch all the
  17763. same data as the old schema.&lt;/p&gt;
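&lt;p&gt;As a rough illustration of the batching described above, here is a minimal Python sketch that splits independent (user, segment) rows into groups of 25. It makes no DynamoDB calls; the names make_row, chunk_writes and BATCH_LIMIT, and the row layout, are assumptions made for this example and not the code used in production.&lt;/p&gt;

```python
# Hypothetical illustration: the names and the row layout are assumptions.
BATCH_LIMIT = 25  # the post says the batch API accepts 25 updates per request


def make_row(user_id, segment_id, timestamp):
    """One independent row per (user, segment): hash key plus range key."""
    return {"user_id": user_id, "segment_id": segment_id, "timestamp": timestamp}


def chunk_writes(rows, limit=BATCH_LIMIT):
    """Split a list of rows into consecutive batches of at most `limit` rows."""
    return [rows[i:i + limit] for i in range(0, len(rows), limit)]


rows = [make_row("user-1", "segment-%d" % n, 1380700000 + n) for n in range(60)]
batches = chunk_writes(rows)
print([len(b) for b in batches])  # 60 rows become batches of 25, 25 and 10
```

&lt;p&gt;Because every row carries its own range key, each batch can be sent as plain puts, with no read-modify-write of a shared per-user item.&lt;/p&gt;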
  17764.  
  17765. &lt;p&gt;One important thing to keep in mind is that the batch API can be
  17766. partially successful, and it returns the subset of write requests that
  17767. failed.  And since each region has its own request, a given write may
  17768. succeed in one region and not the other.  Our writes can safely be
  17769. repeated, so we err on the side of replaying a given write, which fits
  17770. well with Storm’s promise of at-least-once processing.  We maintain a
  17771. list of all writes that are in flight, along with the number of regions
  17772. each has successfully been written to.  When a request returns, that count
  17773. is incremented for all successful values.  Once that count equals the
  17774. number of expected regions written to, it is acknowledged in Storm as
  17775. successful.  Any failed values get retried a couple of times within the
  17776. same process.  After the retries are finished, if there are any failed
  17777. writes left, we mark them as failed in Storm, which will cause them to be
  17778. put at the back of the write queue to be retried later.  As most
  17779. failures are transient, generally due to throughput limit throttling,
  17780. this process can repeat until all writes complete successfully.&lt;/p&gt;
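&lt;p&gt;The bookkeeping described above could be sketched roughly as follows. This is a simplified Python model, not the Storm bolt itself: the InFlightTracker class and its method names are hypothetical, and it keeps a set of confirming regions per write instead of a bare counter, so that a replayed write is not counted twice.&lt;/p&gt;

```python
# Hypothetical sketch: class and method names are illustrative assumptions.
class InFlightTracker:
    """Tracks which regions have confirmed each in-flight write."""

    def __init__(self, regions):
        self.regions = list(regions)
        self.confirmed = {}  # write id mapped to the set of regions that confirmed it

    def start(self, write_id):
        self.confirmed[write_id] = set()

    def record_response(self, region, sent_ids, failed_ids):
        """Record one region's batch response; return ids now confirmed everywhere."""
        failed = set(failed_ids)
        done = []
        for write_id in sent_ids:
            if write_id in failed or write_id not in self.confirmed:
                continue
            self.confirmed[write_id].add(region)
            if len(self.confirmed[write_id]) == len(self.regions):
                del self.confirmed[write_id]
                done.append(write_id)  # safe to acknowledge upstream
        return done

    def pending(self, region):
        """Writes that still need to be sent, or re-sent, to the given region."""
        return sorted(w for w, seen in self.confirmed.items() if region not in seen)


tracker = InFlightTracker(["us-west-2", "eu-west-1"])
for write_id in ("w1", "w2"):
    tracker.start(write_id)

acked = tracker.record_response("us-west-2", ["w1", "w2"], failed_ids=[])
acked += tracker.record_response("eu-west-1", ["w1", "w2"], failed_ids=["w2"])
print(acked)                         # only w1 reached both regions
print(tracker.pending("eu-west-1"))  # w2 still has to be replayed there
```

&lt;p&gt;In this model, anything left in pending after the in-process retries would be reported as failed so that it is replayed later.&lt;/p&gt;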
  17781.  
  17782. &lt;p&gt;While DynamoDB does not inherently support multiple regions, given a
  17783. stream of writes it’s relatively easy to send those data across the
  17784. world.  Combining batch, asynchronous writes with the speed of DynamoDB
  17785. yields us an average of over ten thousand writes per second on a single
  17786. Storm worker, with plenty of room to handle spikes.  Add to that linear
  17787. scaling of throughput with the number of workers, and we have a system that will
  17788. handle any load we can throw at it.&lt;/p&gt;
  17789.  
  17790. </description>
  17791.    </item>
  17792.    
  17793.    
  17794.    
  17795.    <item>
  17796.      <title>DNS-less setup in AWS</title>
  17797.      <link>https://tech.nextroll.com/blog/ops/2013/09/30/dns-less-setup-in-aws.html</link>
  17798.      <pubDate>Mon, 30 Sep 2013 00:00:00 -0700</pubDate>
  17799.      <author></author>
  17800.      <guid isPermaLink="false">https://tech.nextroll.com/blog/ops/2013/09/30/dns-less-setup-in-aws</guid>
  17801.      <description>&lt;p&gt;Anybody who has worked with &lt;a href=&quot;http://aws.amazon.com/&quot;&gt;AWS&lt;/a&gt; has had to deal
  17802. with the arbitrary hostnames that AWS generates for each instance. These
  17803. names are difficult to remember and not particularly useful for
  17804. understanding which tasks a particular node is assigned to.&lt;/p&gt;
  17805.  
  17806. &lt;p&gt;The classic solution to this problem is to use a DNS server, but
  17807. unfortunately setting up and maintaining a DNS system can be a pain and,
  17808. when you are a quickly moving startup, it’s a lot easier to spend your
  17809. time maintaining only the bare necessities. Of course eventually you’ll
  17810. need a proper setup, but until then, there are plenty of workarounds that
  17811. do their job until you’re big enough to have more time to refine your
  17812. infrastructure.&lt;/p&gt;
  17813.  
  17814. &lt;p&gt;Setting up a Private DNS is &lt;a href=&quot;http://security.stackexchange.com/questions/39504/pros-cons-private-dns-vs-public-dns&quot;&gt;not a simple task&lt;/a&gt;.
  17815. Using Amazon Route53 could be an option, but it &lt;a href=&quot;http://aws.amazon.com/route53/faqs/#Manage_private_IP_address&quot;&gt;doesn’t manage private
  17816. IP addresses&lt;/a&gt; any
  17817. differently so, unless you want to leak your private addresses to the
  17818. outside world, you’ll have to set up your own DNS. In addition to all of
  17819. this, although there are many options for cheap private DNS service,
  17820. nothing is cheaper than free.&lt;/p&gt;
  17821.  
  17822. &lt;p&gt;Usually you live through this by setting up entries in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/etc/hosts&lt;/code&gt;
  17823. file, but a great advantage of the cloud is that instances come up and
  17824. go down and scale up and down. Manually changing all of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/etc/hosts&lt;/code&gt;
  17825. files or pre-configuring them is hard and time-consuming, and worst of all
  17826. they are hard to change in case of failures.&lt;/p&gt;
  17827.  
  17828. &lt;p&gt;We’ve tried to work around the immediate necessity for a DNS server
  17829. while still obtaining all of its benefits by using tags in AWS. The
  17830. system works in two parts: on one side you set tags like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dns: HOSTNAME&lt;/code&gt;
  17831. on every machine that you want to reach with a simple hostname; on the
  17832. other side you need a simple script that can build a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/etc/hosts&lt;/code&gt; file
  17833. by using the information obtained from the EC2 Tags service.&lt;/p&gt;
  17834.  
  17835. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;c1&quot;&gt;#!/usr/bin/env python
  17836. &lt;/span&gt;
  17837. &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;os&lt;/span&gt;
  17838. &lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;optparse&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;OptionParser&lt;/span&gt;
  17839.  
  17840. &lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;boto&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ec2&lt;/span&gt;
  17841. &lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;boto.s3.connection&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;S3Connection&lt;/span&gt;
  17842. &lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;boto.s3.key&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Key&lt;/span&gt;
  17843.  
  17844. &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;put_file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bucket&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;key_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;content&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  17845.    &lt;span class=&quot;n&quot;&gt;k&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bucket&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  17846.    &lt;span class=&quot;n&quot;&gt;k&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;key_name&lt;/span&gt;
  17847.    &lt;span class=&quot;n&quot;&gt;k&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;set_contents_from_string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;content&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  17848.                               &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Content-Type&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;text/plain&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
  17849.                               &lt;span class=&quot;n&quot;&gt;replace&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  17850.  
  17851. &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  17852.    &lt;span class=&quot;n&quot;&gt;aws_api_key&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aws_access_key&lt;/span&gt;
  17853.    &lt;span class=&quot;n&quot;&gt;aws_secret_key&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aws_secret_key&lt;/span&gt;
  17854.    &lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;environ&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;AWS_ACCESS_KEY_ID&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aws_access_key&lt;/span&gt;
  17855.    &lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;environ&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;AWS_SECRET_ACCESS_KEY&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aws_secret_key&lt;/span&gt;
  17856.    &lt;span class=&quot;n&quot;&gt;regions&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;us-west-1&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;us-west-2&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;us-east-1&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;eu-west-1&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;ap-southeast-1&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  17857.  
  17858.    &lt;span class=&quot;n&quot;&gt;s3conn&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;S3Connection&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aws_access_key_id&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aws_api_key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  17859.                        &lt;span class=&quot;n&quot;&gt;aws_secret_access_key&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aws_secret_key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  17860.    &lt;span class=&quot;n&quot;&gt;bucket&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s3conn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_bucket&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;my-configuration-bucket&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  17861.  
  17862.    &lt;span class=&quot;n&quot;&gt;region_instances&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;
  17863.    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;region&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;regions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  17864.        &lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ec2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;connect_to_region&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;region&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  17865.        &lt;span class=&quot;n&quot;&gt;all_reservations&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_all_instances&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;filters&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;tag-key&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;dns&apos;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  17866.  
  17867.        &lt;span class=&quot;n&quot;&gt;instances&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;
  17868.        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;reservation&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;all_reservations&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  17869.            &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;instance&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;reservation&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;instances&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  17870.                &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;instance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;state&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;running&quot;&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;and&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;dns&apos;&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;instance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tags&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  17871.                    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dns_name&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;instance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tags&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;dns&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;strip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;split&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;,&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
  17872.                        &lt;span class=&quot;n&quot;&gt;instances&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dns_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;instance&lt;/span&gt;
  17873.  
  17874.        &lt;span class=&quot;n&quot;&gt;region_instances&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;region&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;instances&lt;/span&gt;
  17875.  
  17876.    &lt;span class=&quot;c1&quot;&gt;#Generate an etc host for each region, local instances using 10. and others using public
  17877. &lt;/span&gt;    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dest_region&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;regions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  17878.        &lt;span class=&quot;n&quot;&gt;hosts_lines&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
  17879.            &lt;span class=&quot;s&quot;&gt;&apos;127.0.0.1 localhost&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  17880.            &lt;span class=&quot;s&quot;&gt;&apos;::1 ip6-localhost ip6-loopback&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  17881.            &lt;span class=&quot;s&quot;&gt;&apos;fe00::0 ip6-localnet&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  17882.            &lt;span class=&quot;s&quot;&gt;&apos;ff00::0 ip6-mcastprefix&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  17883.            &lt;span class=&quot;s&quot;&gt;&apos;ff02::1 ip6-allnodes&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  17884.            &lt;span class=&quot;s&quot;&gt;&apos;ff02::2 ip6-allrouters&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  17885.            &lt;span class=&quot;s&quot;&gt;&apos;ff02::3 ip6-allhosts&apos;&lt;/span&gt;
  17886.        &lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  17887.        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;src_region&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;regions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  17888.            &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dns_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;instance&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;region_instances&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;src_region&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;iteritems&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
  17889.                &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;src_region&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dest_region&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  17890.                    &lt;span class=&quot;n&quot;&gt;hosts_lines&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;%s %s.internal %s&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;instance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;private_ip_address&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dns_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dns_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  17891.                &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  17892.                    &lt;span class=&quot;n&quot;&gt;hosts_lines&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;%s %s&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;instance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ip_address&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dns_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  17893.  
  17894.        &lt;span class=&quot;c1&quot;&gt;# make sure we replace hosts file as atomically as possible.
  17895. &lt;/span&gt;        &lt;span class=&quot;n&quot;&gt;etc_hosts&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;#! /bin/bash&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;echo &quot;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;hosts_lines&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot; | tee /tmp/etc_hosts &amp;amp;&amp;amp; cp /tmp/etc_hosts /etc/hosts&apos;&lt;/span&gt;
  17896.        &lt;span class=&quot;n&quot;&gt;put_file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bucket&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;etc_hosts/etc_hosts.%s.sh&apos;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dest_region&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;content&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;etc_hosts&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  17897.  
  17898.    &lt;span class=&quot;c1&quot;&gt;#Generate an etc hosts with all public ip addresses, for developers
  17899. &lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;hosts_lines&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
  17900.    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;src_region&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;regions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  17901.        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dns_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;instance&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;region_instances&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;src_region&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;iteritems&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
  17902.            &lt;span class=&quot;n&quot;&gt;hosts_lines&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;%s %s %s.internal&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;instance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ip_address&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dns_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dns_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  17903.  
  17904.    &lt;span class=&quot;n&quot;&gt;etc_hosts&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;hosts_lines&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  17905.    &lt;span class=&quot;n&quot;&gt;put_file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bucket&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;etc_hosts/etc_hosts.public&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;content&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;etc_hosts&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  17906.  
  17907. &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;__name__&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;__main__&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  17908.    &lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;
  17909.    e.g, update_hosts.py --secret=&amp;lt;SECRET_KEY&amp;gt; --api=&amp;lt;API_KEY&amp;gt;
  17910.    &quot;&quot;&quot;&lt;/span&gt;
  17911.    &lt;span class=&quot;n&quot;&gt;usage&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;usage: %prog [options]&quot;&lt;/span&gt;
  17912.    &lt;span class=&quot;n&quot;&gt;parser&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;OptionParser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;usage&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;usage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  17913.  
  17914.    &lt;span class=&quot;n&quot;&gt;parser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_option&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;--secret&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;action&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;store&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dest&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;aws_secret_key&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  17915.                        &lt;span class=&quot;n&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;environ&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;AWS_SECRET_ACCESS_KEY&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;AWS Secret Access Key&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  17916.    &lt;span class=&quot;n&quot;&gt;parser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_option&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;--api&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;action&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;store&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dest&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;aws_access_key&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  17917.                        &lt;span class=&quot;n&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;environ&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;AWS_ACCESS_KEY_ID&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;AWS Access Key&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  17918.    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;parser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;parse_args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  17919.  
  17920.    &lt;span class=&quot;n&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  17921.  
  17922. &lt;p&gt;This simple script will create and upload to S3 multiple versions of
  17923. your &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/etc/hosts&lt;/code&gt; file to accommodate all the different regions that
  17924. your infrastructure could be running on. In each regional hosts file
  17925. there will be private addresses for local instances and public addresses
  17926. for remote ones, but in all cases always accessible through the same
  17927. simple name.&lt;/p&gt;
  17928.  
  17929. &lt;p&gt;At this point all of your instances can just download the script from S3
  17930. and execute it to obtain a proper and up-to-date version of the
  17931. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/etc/hosts&lt;/code&gt; file.&lt;/p&gt;
  17932.  
  17933. &lt;p&gt;In addition, to simplify development, there is also a fully public
  17934. version of the hosts file. By using this version of the hosts and
  17935. loading it on your computer you will now be able to access any machine
  17936. of your cluster without needing to look up its AWS hostname.&lt;/p&gt;
  17937.  
  17938. &lt;p&gt;Of course there are a number of alternatives to solve this issue. By
  17939. using this trick at AdRoll, we manage our current installation with
  17940. more than one hundred instances without many headaches.&lt;/p&gt;
  17941.  
  17942. &lt;p&gt;Soon we’ll need to look into a proper DNS server but until then…&lt;/p&gt;
  17943. </description>
  17944.    </item>
  17945.    
  17946.    
  17947.    
  17948.    <item>
  17949.      <title>Unexpected behavior with jQuery</title>
  17950.      <link>https://tech.nextroll.com/blog/web/2013/07/22/playing-with-jquery-append.html</link>
  17951.      <pubDate>Mon, 22 Jul 2013 00:00:00 -0700</pubDate>
  17952.      <author></author>
  17953.      <guid isPermaLink="false">https://tech.nextroll.com/blog/web/2013/07/22/playing-with-jquery-append</guid>
  17954.      <description>&lt;p&gt;As perfect as jQuery may seem, sometimes its default behaviors end up
  17955. being very surprising to the uninitiated.&lt;/p&gt;
  17956.  
  17957. &lt;p&gt;During a routine code review, these few lines were pointed out as being
  17958. weird because, even though there are 2 &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;append()&lt;/code&gt; calls, nothing was
  17959. visibly wrong in the UI.&lt;/p&gt;
  17960.  
  17961. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-javascript&quot; data-lang=&quot;javascript&quot;&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;summaryBar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  17962.    &lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;.slot-one&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;summaryBar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;el&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;campaignGraph&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;el&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  17963. &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  17964.  
  17965. &lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;.slot-one&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;campaignGraph&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;el&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  17966.  
  17967. &lt;p&gt;(NB: we’re using &lt;a href=&quot;http://backbonejs.org/&quot;&gt;Backbone.js&lt;/a&gt; and these are Backbone views, so &lt;a href=&quot;http://backbonejs.org/#View-el&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.el&lt;/code&gt;&lt;/a&gt; represents the DOM element for these views)&lt;/p&gt;
  17968.  
  17969. &lt;p&gt;Dan wrote:&lt;/p&gt;
  17970. &lt;blockquote&gt;
  17971.  &lt;p&gt;will this append the campaignGraph twice in the case that there is a summary bar?&lt;/p&gt;
  17972. &lt;/blockquote&gt;
  17973.  
  17974. &lt;p&gt;Reading the code, it seemed that way, but when we ran it, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;campaignGraph&lt;/code&gt; was added only once.&lt;/p&gt;
  17975.  
  17976. &lt;p&gt;This called for some digging.&lt;/p&gt;
  17977.  
  17978. &lt;hr /&gt;
  17979.  
  17980. &lt;p&gt;The reason is actually &lt;a href=&quot;http://api.jquery.com/append/&quot;&gt;documented&lt;/a&gt;:&lt;/p&gt;
  17981.  
  17982. &lt;blockquote&gt;
  17983.  &lt;p&gt;You can also select an element on the page and insert it into another:&lt;/p&gt;
  17984.  
  17985.  &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;$(&apos;.container&apos;).append($(&apos;h2&apos;));&lt;/code&gt;&lt;/p&gt;
  17986.  
  17987.  &lt;p&gt;If an element selected this way is inserted into a single location elsewhere in the DOM, it will be moved into the target (not cloned).&lt;/p&gt;
  17988.  
  17989.  &lt;p&gt;If there is more than one target element, however, cloned copies of the inserted element will be created for each target after the first.&lt;/p&gt;
  17990. &lt;/blockquote&gt;
  17991.  
  17992. &lt;p&gt;This actually makes some sense: an element is passed, so why not use it. But when called on multiple elements, we need to clone the argument to append it to multiple DOM locations.&lt;/p&gt;
  17993.  
  17994. &lt;p&gt;However, because of that, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;append()&lt;/code&gt; doesn’t behave like you would expect when called multiple times.&lt;/p&gt;
  17995.  
  17996. &lt;p&gt;Take this HTML:&lt;/p&gt;
  17997.  
  17998. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-html&quot; data-lang=&quot;html&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;ul&amp;gt;&amp;lt;/ul&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  17999.  
  18000. &lt;p&gt;What do you expect this JavaScript to do:&lt;/p&gt;
  18001.  
  18002. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-javascript&quot; data-lang=&quot;javascript&quot;&gt;&lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;li&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;lt;li&amp;gt;Hello&amp;lt;/li&amp;gt;&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  18003. &lt;span class=&quot;nx&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;ul&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;li&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  18004. &lt;span class=&quot;nx&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;ul&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;li&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  18005. &lt;span class=&quot;nx&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;ul&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;li&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  18006. &lt;span class=&quot;nx&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;ul&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;li&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  18007.  
  18008. &lt;p&gt;Until this morning, I would have expected to get four &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;lt;li&amp;gt;&lt;/code&gt; elements inside the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;lt;ul&amp;gt;&lt;/code&gt; but it will have only the one I created since it’s never cloned:&lt;/p&gt;
  18009.  
  18010. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-html&quot; data-lang=&quot;html&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;ul&amp;gt;&lt;/span&gt;
  18011.    &lt;span class=&quot;nt&quot;&gt;&amp;lt;li&amp;gt;&lt;/span&gt;Hello&lt;span class=&quot;nt&quot;&gt;&amp;lt;/li&amp;gt;&lt;/span&gt;
  18012. &lt;span class=&quot;nt&quot;&gt;&amp;lt;/ul&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  18013.  
  18014. &lt;p&gt;Even though it follows the documentation (&lt;a href=&quot;#fn1&quot;&gt;mostly&lt;/a&gt;), the behavior is more unexpected when your selector matches multiple elements.&lt;/p&gt;
  18015.  
  18016. &lt;p&gt;Let’s use the following HTML instead:&lt;/p&gt;
  18017.  
  18018. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-html&quot; data-lang=&quot;html&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;ul&amp;gt;&amp;lt;/ul&amp;gt;&lt;/span&gt;
  18019. &lt;span class=&quot;nt&quot;&gt;&amp;lt;ul&amp;gt;&amp;lt;/ul&amp;gt;&lt;/span&gt;
  18020. &lt;span class=&quot;nt&quot;&gt;&amp;lt;ul&amp;gt;&amp;lt;/ul&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  18021.  
  18022. &lt;p&gt;With the same JavaScript, the result is:&lt;/p&gt;
  18023.  
  18024. &lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-html&quot; data-lang=&quot;html&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;ul&amp;gt;&lt;/span&gt;
  18025.    &lt;span class=&quot;nt&quot;&gt;&amp;lt;li&amp;gt;&lt;/span&gt;Hello&lt;span class=&quot;nt&quot;&gt;&amp;lt;/li&amp;gt;&lt;/span&gt;
  18026.    &lt;span class=&quot;nt&quot;&gt;&amp;lt;li&amp;gt;&lt;/span&gt;Hello&lt;span class=&quot;nt&quot;&gt;&amp;lt;/li&amp;gt;&lt;/span&gt;
  18027.    &lt;span class=&quot;nt&quot;&gt;&amp;lt;li&amp;gt;&lt;/span&gt;Hello&lt;span class=&quot;nt&quot;&gt;&amp;lt;/li&amp;gt;&lt;/span&gt;
  18028.    &lt;span class=&quot;nt&quot;&gt;&amp;lt;li&amp;gt;&lt;/span&gt;Hello&lt;span class=&quot;nt&quot;&gt;&amp;lt;/li&amp;gt;&lt;/span&gt;
  18029. &lt;span class=&quot;nt&quot;&gt;&amp;lt;/ul&amp;gt;&lt;/span&gt;
  18030. &lt;span class=&quot;nt&quot;&gt;&amp;lt;ul&amp;gt;&lt;/span&gt;
  18031.    &lt;span class=&quot;nt&quot;&gt;&amp;lt;li&amp;gt;&lt;/span&gt;Hello&lt;span class=&quot;nt&quot;&gt;&amp;lt;/li&amp;gt;&lt;/span&gt;
  18032.    &lt;span class=&quot;nt&quot;&gt;&amp;lt;li&amp;gt;&lt;/span&gt;Hello&lt;span class=&quot;nt&quot;&gt;&amp;lt;/li&amp;gt;&lt;/span&gt;
  18033.    &lt;span class=&quot;nt&quot;&gt;&amp;lt;li&amp;gt;&lt;/span&gt;Hello&lt;span class=&quot;nt&quot;&gt;&amp;lt;/li&amp;gt;&lt;/span&gt;
  18034.    &lt;span class=&quot;nt&quot;&gt;&amp;lt;li&amp;gt;&lt;/span&gt;Hello&lt;span class=&quot;nt&quot;&gt;&amp;lt;/li&amp;gt;&lt;/span&gt;
  18035. &lt;span class=&quot;nt&quot;&gt;&amp;lt;/ul&amp;gt;&lt;/span&gt;
  18036. &lt;span class=&quot;nt&quot;&gt;&amp;lt;ul&amp;gt;&lt;/span&gt;
  18037.    &lt;span class=&quot;nt&quot;&gt;&amp;lt;li&amp;gt;&lt;/span&gt;Hello&lt;span class=&quot;nt&quot;&gt;&amp;lt;/li&amp;gt;&lt;/span&gt;
  18038. &lt;span class=&quot;nt&quot;&gt;&amp;lt;/ul&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
  18039.  
  18040. &lt;p&gt;I followed the tracks of this behavior in &lt;a href=&quot;http://bugs.jquery.com/ticket/8070&quot;&gt;jQuery’s bug tracker&lt;/a&gt; and &lt;a href=&quot;https://github.com/jquery/jquery/commit/0a0cff9d29c4bd559da689d96c532d06c03fce09/#L0R319&quot;&gt;commit log&lt;/a&gt; and it dates from about two years ago. The reason is to avoid memory leaks. Sadly, the current side-effect is unintuitive.&lt;/p&gt;
  18041.  
  18042. &lt;p&gt;Here is a pen you can play with:&lt;/p&gt;
  18043.  
  18044. &lt;blockquote&gt;
  18045.  &lt;p data-height=&quot;402&quot; data-theme-id=&quot;348&quot; data-slug-hash=&quot;zmonw&quot; data-user=&quot;adrolldev&quot; data-default-tab=&quot;result&quot; class=&quot;codepen&quot;&gt;See the Pen &lt;a href=&quot;http://codepen.io/adrolldev/pen/zmonw&quot;&gt;jQuery.append() unexpected behavior&lt;/a&gt; by AdRoll Dev (&lt;a href=&quot;http://codepen.io/adrolldev&quot;&gt;@adrolldev&lt;/a&gt;) on &lt;a href=&quot;http://codepen.io&quot;&gt;CodePen&lt;/a&gt;&lt;/p&gt;
  18046. &lt;/blockquote&gt;
  18047.  
  18048. &lt;p id=&quot;fn1&quot; class=&quot;footnote&quot;&gt;1. The documentation is incorrect, since it&apos;s actually the last element, not the first, that ends up with the original element. But the main idea remains. (As a good open-source citizen, I submitted an edit to the documentation.)&lt;/p&gt;
  18049.  
  18050. &lt;p&gt;Photo credit: &lt;a href=&quot;http://www.flickr.com/photos/eschipul/&quot;&gt;Ed Schipul&lt;/a&gt;&lt;/p&gt;
  18051. </description>
  18052.    </item>
  18053.    
  18054.    
  18055.    
  18056.    <item>
  18057.      <title>HyperLogLog and MinHash</title>
  18058.      <link>https://tech.nextroll.com/blog/data/2013/07/10/hll-minhash.html</link>
  18059.      <pubDate>Wed, 10 Jul 2013 00:00:00 -0700</pubDate>
  18060.      <author></author>
  18061.      <guid isPermaLink="false">https://tech.nextroll.com/blog/data/2013/07/10/hll-minhash</guid>
  18062.      <description>&lt;p&gt;At AdRoll, we have a lot of data to deal with.  As we keep accumulating all of this data, our scaling issues become
  18063. more complicated, and even something as simple as counting becomes a bit of a chore.  After using Bloom filters to
  18064. count uniques, we eventually wanted to find something more space efficient.&lt;/p&gt;
  18065.  
  18066. &lt;p&gt;We started researching, and
  18067. implemented a form of HyperLogLog, which gives us the ability to count uniques with good accuracy, do it in a
  18068. distributed way, and keep our memory and storage requirements down. One of the drawbacks though, was that we
  18069. couldn’t take intersections with these structures, and so we tacked on the additional structure of MinHash.&lt;/p&gt;
  18070.  
  18071. &lt;p&gt;This blog post briefly covers our research.  For the full, gory mathematical details, I also wrote a
  18072. &lt;a href=&quot;/media/hllminhash.pdf&quot;&gt;much more thorough, yet casual, paper&lt;/a&gt;, with way more graphs and analysis.  If that’s what you’re
  18073. looking for, go straight to the paper: there’s nothing in the blog post that isn’t covered in the paper.&lt;/p&gt;
  18074.  
  18075. &lt;h3 id=&quot;hyperloglog&quot;&gt;HyperLogLog++&lt;/h3&gt;
  18076.  
  18077. &lt;p&gt;The first stop we made was the Google paper &lt;a href=&quot;http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/40671.pdf&quot;&gt;“HyperLogLog in Practice: Algorithmic Engineering of a State of The Art
  18078. Cardinality Estimation Algorithm”&lt;/a&gt;
  18079. by Heule, Nunkesser, and Hall.  It describes some improvements to the HyperLogLog algorithm, which the authors
  18080. now call HyperLogLog++.  Double double.&lt;/p&gt;
  18081.  
  18082. &lt;p&gt;Essentially, it works like this: When you go to add an element to the structure, you hash it into 64 bits.  From there,
  18083. you divide up the hash as:&lt;/p&gt;
  18084.  
  18085. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  18086. %&lt;![CDATA[
  18087. \overbrace{101\ldots 011}^{p\ \rm{bits}}\overbrace{001\ldots 110}^{64 - p\ \rm{bits}}
  18088. %]]&gt;
  18089. &lt;/script&gt;
  18090.  
  18091. &lt;p&gt;Where \(p\) is your defined precision.  The first chunk defines the index \(i\) into an array \(M\) which has \(2^p\) bins,
  18092. all initialized to zero.  Now, for the second chunk, count the number of leading zeros plus one (in this case, three),
  18093. which we’ll call \(z\).  Then you set \(M[i] = \max\left\{M[i], z\right\}\).  By the end, you have a bunch of bins with
  18094. a bunch of numbers in them.  Now what?&lt;/p&gt;
  18095.  
  18096. &lt;p&gt;Well, it turns out that you can get an estimate of how many unique elements were added to this structure.  And the
  18097. way you do this is:&lt;/p&gt;
  18098.  
  18099. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  18100. %&lt;![CDATA[
  18101. E = 2^{2p} \left(2^p \int_{0}^{\infty} \left( \log_2 \left( \frac{2 + u}{1 + u} \right) \right)^{2^p}
  18102. {\rm{d}}u\right)^{-1} \left( \sum_{i=0}^{2^p - 1} 2^{-M[i]} \right)^{-1}
  18103. %]]&gt;
  18104. &lt;/script&gt;
  18105.  
  18106. &lt;p&gt;Pretty nasty, but that second term converges.  The Google paper recommends:&lt;/p&gt;
  18107.  
  18108. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  18109. %&lt;![CDATA[
  18110. \begin{matrix}
  18111. p &amp; &amp; \rm{second\ term} \\
  18112. 4 &amp; &amp; 0.673 \\
  18113. 5 &amp; &amp; 0.697 \\
  18114. 6 &amp; &amp; 0.709 \\
  18115. \geq 7 &amp; &amp; 0.7213/(1 + 1.079/2^p)
  18116. \end{matrix}
  18117. %]]&gt;
  18118. &lt;/script&gt;
  18119.  
  18120. &lt;p&gt;And we’re still not done. If there are any zeros left in \(M\), we count them (call it \(V\)) and output
  18121. \(E = 2^p \log\left(\frac{2^p}{V}\right)\).  And if \(E &amp;lt; 5 \cdot 2^p\), then we go through a bias correction step; the
  18122. Google paper has an addendum of empirically derived values for this.  Otherwise, we can just output \(E\).&lt;/p&gt;
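&lt;p&gt;The steps above can be sketched in a few lines of Python. This is only an illustration, not our production code: the names hash64 and HLLSketch are hypothetical, a SHA-1 prefix stands in for whatever 64-bit hash you prefer, and the empirically derived bias correction from the Google paper is left out entirely. Whenever empty bins remain, the sketch simply returns the zero-count estimate described above.&lt;/p&gt;

```python
import hashlib
import math


def hash64(item):
    """Hash an item to a 64-bit integer (SHA-1 prefix, an assumed stand-in hash)."""
    digest = hashlib.sha1(str(item).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HLLSketch:
    """Hypothetical dense HyperLogLog sketch: no sparse mode, no bias correction."""

    def __init__(self, p=14):
        self.p = p               # precision, assumed to be at least 4
        self.m = 2 ** p          # number of bins
        self.M = [0] * self.m    # all bins start at zero

    def add(self, item):
        rest_bits = 64 - self.p
        h = hash64(item)
        i = h // 2 ** rest_bits                  # first p bits: bin index
        rest = h % 2 ** rest_bits                # remaining 64 - p bits
        z = rest_bits - rest.bit_length() + 1    # leading zeros plus one
        self.M[i] = max(self.M[i], z)

    def estimate(self):
        m = self.m
        zeros = self.M.count(0)                  # V: number of empty bins
        if zeros:
            # Zero-count estimate: 2^p * log(2^p / V)
            return m * math.log(m / zeros)
        # Constants for the integral term, as tabulated above
        small = {4: 0.673, 5: 0.697, 6: 0.709}
        alpha = small.get(self.p, 0.7213 / (1 + 1.079 / m))
        return alpha * m * m / sum(2.0 ** -r for r in self.M)
```

&lt;p&gt;Adding the same element twice leaves every bin unchanged, which is why the structure counts uniques rather than events.&lt;/p&gt;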
  18123.  
  18124. &lt;p&gt;The Google paper has some other enhancements as well, such as a sparse representation, but that’s beyond the scope
  18125. of this blog post.  Anyway, this turns out to work pretty well.  Here’s a graph of relative errors up to a
  18126. cardinality of 200 million, run 128 times for each cardinality tested, for \(p = 18\) with \(3\sigma\) ranges overlaid:&lt;/p&gt;
  18127.  
  18128. &lt;p&gt;&lt;img src=&quot;/images/post_images/hll-perf.png&quot; alt=&quot;hll-perf&quot; /&gt;&lt;/p&gt;
  18129.  
  18130. &lt;p&gt;One of the really nice things about these HyperLogLog structures is that you can get unions for free; with two
  18131. such structures, \(M_1\) and \(M_2\), you just need \(M[i] = \max\left\{M_1[i], M_2[i]\right\}\). This means that we can
  18132. tally up uniques in a distributed computing environment.  It’s also possible to reduce the precision of a HyperLogLog
  18133. structure if you need to union two of them that have different array sizes, but you can check out the full paper for
  18134. those details.&lt;/p&gt;
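&lt;p&gt;As a rough illustration of the union rule, assuming each structure is held as a plain Python list of bin values with the same precision (the function name union_bins is hypothetical):&lt;/p&gt;

```python
def union_bins(bins_a, bins_b):
    """Merge two HyperLogLog bin arrays of equal size by taking the per-bin maximum."""
    if len(bins_a) != len(bins_b):
        raise ValueError("both structures must use the same precision")
    return [max(a, b) for a, b in zip(bins_a, bins_b)]
```

&lt;p&gt;Because the per-bin maximum is associative and commutative, partial results from different workers can be merged in any order.&lt;/p&gt;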
  18135.  
  18136. &lt;p&gt;Intersections are a different story.  Some other blogs suggest using the &lt;a href=&quot;http://en.wikipedia.org/wiki/Inclusion-exclusion_principle&quot;&gt;inclusion-exclusion principle&lt;/a&gt;,
  18137. but this can get somewhat unwieldy, and potentially give some &lt;a href=&quot;http://blog.aggregateknowledge.com/2012/12/17/hll-intersections-2/&quot;&gt;wild results&lt;/a&gt;.
  18138. At AdRoll, we decided that we were willing to sacrifice some space efficiency by effectively tacking on a completely
  18139. separate structure.&lt;/p&gt;
  18140.  
  18141. &lt;h3 id=&quot;minhash&quot;&gt;MinHash&lt;/h3&gt;
  18142.  
  18143. &lt;p&gt;MinHash is an algorithm that computes an estimate of something called the Jaccard Index, which is defined as:&lt;/p&gt;
  18144.  
  18145. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  18146. %&lt;![CDATA[
  18147. J(A,B) = \frac{|A \cap B|}{|A \cup B|}
  18148. %]]&gt;
  18149. &lt;/script&gt;
  18150.  
  18151. &lt;p&gt;It functions as a measure of how similar two sets are.  But let’s abandon this notion of similarity, and extend the
  18152. metric, so that we can talk about intersecting lots of sets:&lt;/p&gt;
  18153.  
  18154. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  18155. %&lt;![CDATA[
  18156. J(A_1, A_2,\ldots,A_n) = \frac{\left|\bigcap_{i=1}^{n}A_i\right|}{\left|\bigcup_{i=1}^{n}A_i\right|}
  18157. %]]&gt;
  18158. &lt;/script&gt;
  18159.  
  18160. &lt;p&gt;The variant of MinHash that we’ll be using employs only one hash function, \(h(\cdot)\), and for each input set, we
  18161. keep the \(k\) lowest values after going through this hash function (I’ll be using the notation \(\min_k\) to represent
  18162. this, which overloads its usual mathematical meaning).  If our hash function is uniformly distributed and is not
  18163. prone to collisions, we can say that:&lt;/p&gt;
  18164.  
  18165. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  18166. %&lt;![CDATA[
  18167. \min_k\left\{\bigcup\min_k\left\{h\left(A_i\right)\right\}\right\} = \min_k\left\{h\left(\bigcup A_i\right)\right\}
  18168. %]]&gt;
  18169. &lt;/script&gt;
  18170.  
  18171. &lt;p&gt;is a random sample of our total space.  That means that we can check each element in this sample, and see if it is in
  18172. every \(\min_k\left\{h\left(A_i\right)\right\}\).  By tallying up all the elements that satisfy this criterion
  18173. (call it \(y\)), we can say:&lt;/p&gt;
  18174.  
  18175. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  18176. %&lt;![CDATA[
  18177. \frac{y}{k} \approx J(A_1,\ldots,A_n)
  18178. %]]&gt;
  18179. &lt;/script&gt;
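
&lt;p&gt;The steps above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not our production code: the hash (the first 8 bytes of SHA-1), the helper names, and the example sets are all hypothetical stand-ins.&lt;/p&gt;

```python
import hashlib


def h(item):
    """Hypothetical 64-bit hash: the first 8 bytes of SHA-1 of the item's text form."""
    digest = hashlib.sha1(str(item).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def min_k(values, k):
    """The k lowest distinct values, in ascending order."""
    return sorted(set(values))[:k]


def signature(items, k):
    """What gets stored per input set: the k lowest hash values."""
    return min_k((h(x) for x in items), k)


def estimate_jaccard(signatures, k):
    """Estimate J(A_1, ..., A_n) from the per-set signatures alone."""
    # The k lowest hashes of the union, rebuilt from the stored signatures.
    sample = min_k((v for sig in signatures for v in sig), k)
    if not sample:
        return 0.0
    members = [set(sig) for sig in signatures]
    # y counts the sampled hashes that appear in every signature.
    y = sum(1 for v in sample if all(v in m for m in members))
    return y / len(sample)


# Made-up example: the true Jaccard index is 10000 / 30000.
k = 2048
sig_a = signature(range(0, 20000), k)
sig_b = signature(range(10000, 30000), k)
print(estimate_jaccard([sig_a, sig_b], k))
```

&lt;p&gt;The sketch divides by the sample length rather than by k so that inputs with fewer than k distinct hashes still give a sensible ratio.&lt;/p&gt;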
  18180.  
  18181. &lt;p&gt;Now that we have this estimate, we know that:&lt;/p&gt;
  18182.  
  18183. &lt;script type=&quot;math/tex; mode=display&quot;&gt;
  18184. %&lt;![CDATA[
  18185. \left|\bigcap A_i \right| = J(A_1,\ldots,A_n)\cdot\left|\bigcup A_i\right| \approx
  18186. \rm{MinHash\ Result} \cdot \rm{HLL\ Result}
  18187. %]]&gt;
  18188. &lt;/script&gt;
  18189.  
  18190. &lt;p&gt;Hey, we’re hashing all this stuff with HyperLogLog++ anyway, right?
  18191. We tested this assertion with a HyperLogLog++ structure of \(p = 18\) and a MinHash \(k = 98304\), which in our implementation
  18192. adds up to a 1MB structure.  We took two randomly generated sets, each with 100 million elements, and then intersected them
  18193. for a variety of Jaccard Indices, repeating the process 128 times.  This is the result:&lt;/p&gt;
  18194.  
  18195. &lt;p&gt;&lt;img src=&quot;/images/post_images/mh-perf.png&quot; alt=&quot;mh-perf.png&quot; /&gt;&lt;/p&gt;
  18196.  
  18197. &lt;p&gt;The blue region represents our theoretical 99% confidence bounds (you can read about the theory in the paper, along with
  18198. how to pick a good \(k\) for your own purposes), and we see that our experimental results corroborate our predictions well.
  18199. You can see that the more the sets overlap, the better we are at predicting the size of that overlap.  However, even for
  18200. small intersections, we do pretty well: we can be 99% sure that if the overlap is 0.01%, we’ll report that between 0.00%
  18201. and 0.02%, which is quite reasonable.  A 1% overlap is 99% likely to be reported between 0.9195% and 1.0824%.&lt;/p&gt;
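
&lt;p&gt;Two back-of-the-envelope checks on the figures above, both resting on assumptions that this post does not state. First, the 1MB size is consistent with 8-byte MinHash values and one byte per HyperLogLog++ register, but that layout is a guess. Second, treating \(y\) as a binomial count and using a normal approximation gives intervals near the ones quoted; this is a rough stand-in, not the bound derived in the paper.&lt;/p&gt;

```python
import math

K = 98304  # MinHash k from the experiment
P = 18     # HyperLogLog++ precision from the experiment

# Assumed layout (hypothetical): 8 bytes per stored hash, 1 byte per register.
BYTES_PER_HASH = 8
BYTES_PER_REGISTER = 1

minhash_bytes = K * BYTES_PER_HASH
hll_bytes = (2 ** P) * BYTES_PER_REGISTER  # precision p means 2**p registers
total_bytes = minhash_bytes + hll_bytes
print(minhash_bytes, hll_bytes, total_bytes)  # 786432 262144 1048576


def normal_interval(j, k, z=2.576):
    """Rough two-sided 99% interval for y/k, treating y as Binomial(k, j)."""
    half_width = z * math.sqrt(j * (1.0 - j) / k)
    return max(0.0, j - half_width), min(1.0, j + half_width)


for j in (0.0001, 0.01):
    low, high = normal_interval(j, K)
    print(f"J = {j:.4%}: roughly {low:.4%} to {high:.4%}")
```

&lt;p&gt;The approximation ignores error in the HyperLogLog++ union estimate, so it only speaks to the MinHash half of the product.&lt;/p&gt;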
  18202.  
  18203. &lt;h3 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h3&gt;
  18204.  
  18205. &lt;p&gt;If you’re already using HyperLogLog structures in your work, but have been longing for a
  18206. means of intersecting them, it’s worth extending your code with a MinHash
  18207. implementation.  While it does require extra processing power to deal with collecting all
  18208. the minima, it’s possible to get satisfactory performance out of the structure for a
  18209. relatively low storage or memory footprint.&lt;/p&gt;
  18210.  
  18211. &lt;p&gt;Last, as a final plug, I get to tackle interesting problems like this all the time throughout
  18212. the course of my work.  If this type of stuff fires you up, AdRoll is hiring, and we would love
  18213. to &lt;a href=&quot;https://www.adroll.com/about/careers&quot;&gt;hear from you&lt;/a&gt;.&lt;/p&gt;
  18214. </description>
  18215.    </item>
  18216.    
  18217.    
  18218.    
  18219.    <item>
  18220.      <title>Announcing Our Acquisition of Data Analytics Startup, Bitdeli</title>
  18221.      <link>https://tech.nextroll.com/blog/adroll/2013/06/12/bitdeli-acquisition.html</link>
  18222.      <pubDate>Wed, 12 Jun 2013 00:00:00 -0700</pubDate>
  18223.      <author></author>
  18224.      <guid isPermaLink="false">https://tech.nextroll.com/blog/adroll/2013/06/12/bitdeli-acquisition</guid>
  18225.      <description>&lt;p&gt;Over the last year, AdRoll’s mission of making retargeting easy and effective for everyone has really caught fire. Last month alone, we welcomed over 1,250 new advertisers to our platform. The secret to our success has always been layering the industry’s most intuitive tools over a powerful data backend to help companies collect, analyze and (most importantly) act on their customer data.&lt;/p&gt;
  18226.  
  18227. &lt;p&gt;Today, I’m excited to announce we’ve taken another big step forward in our mission with the acquisition of data analytics platform Bitdeli, and the addition of the company’s founders, the Tuulos brothers, to our engineering team.&lt;/p&gt;
  18228.  
  18229. &lt;blockquote&gt;
  18230.  &lt;p&gt;&lt;img src=&quot;/images/post_images/bitdeli_group_photo.jpg&quot; alt=&quot;BitDeli Group Photo&quot; /&gt;&lt;/p&gt;
  18231.  
  18232.  &lt;p&gt;From left to right: Valentino Volonghi, Aaron Bell, Ville Tuulos, Jyri Tuulos&lt;/p&gt;
  18233. &lt;/blockquote&gt;
  18234.  
  18235. &lt;p&gt;Ville Tuulos (brother #1) first popped onto our radar five years ago when our Chief Architect, Valentino, learned of Disco, the Erlang-based MapReduce framework that Ville created:&lt;/p&gt;
  18236.  
  18237. &lt;blockquote&gt;
  18238.  &lt;p&gt;&lt;img src=&quot;/images/post_images/disco.png&quot; alt=&quot;Disco - an Erlang-based MapReduce framework&quot; /&gt;&lt;/p&gt;
  18239. &lt;/blockquote&gt;
  18240.  
  18241. &lt;p&gt;Clearly, this epic “awesomeness night” set Valentino aflutter. Soon after, he struck up correspondence with Ville and became one of the first contributors to the Disco open source project. Years later, Ville and Jyri (brother #2) took their concept further and launched Bitdeli with a vision to democratize data and make it easy for anyone to build their own custom analytics.&lt;/p&gt;
  18242.  
  18243. &lt;p&gt;When we met Jyri, we quickly realized he was the perfect complement to Ville, and had the unique front-end skills to make complex data mining technology simple through innovative visualizations and interfaces.&lt;/p&gt;
  18244.  
  18245. &lt;p&gt;Once we all began talking, we realized we had matching world views and aspirations. It was a natural decision to join forces.&lt;/p&gt;
  18246.  
  18247. &lt;p&gt;Over the coming months, we will be rolling the Bitdeli technology into the AdRoll product. With this, our users will see:&lt;/p&gt;
  18248.  
  18249. &lt;blockquote&gt;
  18250.  &lt;ul&gt;
  18251.    &lt;li&gt;New, richer ways to slice data for analysis and targeting&lt;/li&gt;
  18252.    &lt;li&gt;New, beautiful visualizations of data&lt;/li&gt;
  18253.    &lt;li&gt;Greater transparency across all sorts of data sets&lt;/li&gt;
  18254.    &lt;li&gt;Proactive insights and suggestions&lt;/li&gt;
  18255.    &lt;li&gt;Smarter customer scoring and optimizations&lt;/li&gt;
  18256.  &lt;/ul&gt;
  18257. &lt;/blockquote&gt;
  18258.  
  18259. &lt;p&gt;As always, we’ll do our best to do the heavy lifting and make it quick-and-easy for you to run successful campaigns. While Bitdeli’s core product will be rolled into AdRoll, Bitdeli’s GitHub Badge, a free product that provides analytics specifically for GitHub users, will remain unchanged and continue to be maintained independently of AdRoll’s platform.&lt;/p&gt;
  18260.  
  18261. &lt;p&gt;Welcome aboard, Ville and Jyri!&lt;/p&gt;
  18262. </description>
  18263.    </item>
  18264.    
  18265.    
  18266.    
  18267.    <item>
  18268.      <title>Hankering for Some Hadoop?</title>
  18269.      <link>https://tech.nextroll.com/blog/events/2013/06/11/june-sf-hadoop-meetup.html</link>
  18270.      <pubDate>Tue, 11 Jun 2013 00:00:00 -0700</pubDate>
  18271.      <author></author>
  18272.      <guid isPermaLink="false">https://tech.nextroll.com/blog/events/2013/06/11/june-sf-hadoop-meetup</guid>
  18273.      <description>&lt;p&gt;Tomorrow (Wednesday, June 12th) from 6:00-8:00pm, we’ll be hosting a group of Hadoop users and developers here at our SOMA headquarters, ShangRolla, for &lt;a href=&quot;http://www.meetup.com/hadoopsf/events/120561212/&quot;&gt;their monthly meetup&lt;/a&gt;. Hadoop is an “open-source software for reliable, scalable, distributed computing” used by top tech companies like Yahoo!, Facebook, Amazon, Apple, Google and Microsoft. According to a recent &lt;a href=&quot;http://tdwi.org/articles/2013/04/23/hadoop-usage-exploding.aspx&quot;&gt;TDWI survey&lt;/a&gt;, three-quarters of respondents have either deployed or expect to deploy HDFS (Hadoop Distributed File System) in production.&lt;/p&gt;
  18274.  
  18275. &lt;p&gt;Our very own Derek Nelson will present on the “Real-World Use Cases of Hadoop/HBase Synergy” and will be followed by Christophe Taton of &lt;a href=&quot;http://www.wibidata.com/&quot;&gt;WibiData&lt;/a&gt;, for his presentation “Under the Hood: How KijiSchema extends HBase.”&lt;/p&gt;
  18276.  
  18277. &lt;p&gt;This event follows a very successful, content-rich &lt;a href=&quot;http://www.meetup.com/hbaseusergroup/events/103587852/&quot;&gt;April meetup&lt;/a&gt; in which Derek Nelson joined &lt;a href=&quot;https://twitter.com/tsunanet&quot;&gt;Benoit Sigoure&lt;/a&gt; of Arista Networks, Francis Liu of Yahoo!, and &lt;a href=&quot;https://twitter.com/sershe84&quot;&gt;Sergey Shelukhin&lt;/a&gt; of HortonWorks, to share the Big Data challenges of the online advertising space and discuss how HBase fits into our engineering toolkit.&lt;/p&gt;
  18278.  
  18279. &lt;p&gt;For Derek, advertising “combines big data, real-time processing, machine learning and money. It helps the internet be free and fluid, and it happens to be one of the most interesting places to sink your teeth into and build cool things.” Here at AdRoll, “we have all of our raw data being processed, aggregated, summarized and sliced-and-diced, in many, many different ways with Hadoop and all of that data is dumped into HBase.”&lt;/p&gt;
  18280.  
  18281. &lt;p&gt;Want more information on April’s meetup? Check out Derek’s slideshow &lt;a href=&quot;https://speakerdeck.com/dereknelson/rolling-with-hbase&quot;&gt;here&lt;/a&gt; or watch the recordings from all four of the April HBase Meetup speakers below:&lt;/p&gt;
  18282.  
  18283. &lt;div class=&quot;youtube-wrapper&quot;&gt;&lt;iframe class=&quot;youtube-embed&quot; width=&quot;574&quot; height=&quot;323&quot; src=&quot;http://www.youtube.com/embed/AYERq4pBRqQ&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;&lt;/div&gt;
  18284. </description>
  18285.    </item>
  18286.    
  18287.    
  18288.    
  18289.    <item>
  18290.      <title>The Secrets to Startup Staffing Success</title>
  18291.      <link>https://tech.nextroll.com/blog/careers/2013/03/11/secrets-to-startup-staffing-success.html</link>
  18292.      <pubDate>Mon, 11 Mar 2013 00:00:00 -0700</pubDate>
  18293.      <author></author>
  18294.      <guid isPermaLink="false">https://tech.nextroll.com/blog/careers/2013/03/11/secrets-to-startup-staffing-success</guid>
  18295.      <description>&lt;p&gt;Throughout my career, I have had the honor to work with innovative, fast-paced and industry-disrupting companies that have attracted some of the best talent around the world. When I was given the opportunity to build AdRoll’s internal recruitment machine, I jumped at this chance last December. &lt;a href=&quot;http://blog.adroll.com/inc-500-fastest-growing-advertising-company&quot;&gt;Named the fastest growing advertising company in 2012 by Inc. Magazine&lt;/a&gt; and already at over 100 employees when I started, we were on the right track to creating a best-in-class staffing initiative.&lt;/p&gt;
  18296.  
  18297. &lt;p&gt;It’s now halfway into the first quarter of 2013 and things are moving even faster. We are expected to double in size and, having just opened a New York office, expand in other locations. Though we’re growing at a breakneck pace, we’ve been uncompromising when it comes to bringing on top talent. Over the past few months, I’ve had many colleagues in the industry ask me: how did AdRoll grow so fast? How are you going to scale this new growth?&lt;/p&gt;
  18298.  
  18299. &lt;p&gt;In my experience, building a strong internal team, company culture and brand, and an employee referral program are key areas that companies have to focus on in order to build an effective staffing initiative. Also, having leaders that are on board with these areas is crucial to success.&lt;/p&gt;
  18300.  
  18301. &lt;blockquote&gt;
  18302.  &lt;ul&gt;
  18303.    &lt;li&gt;
  18304.      &lt;p&gt;&lt;b&gt;Internal Team:&lt;/b&gt; A strong internal team is going to be the pillar of your company’s staffing initiative. They are going to partner with you to drive a variety of initiatives, from developing a great candidate experience to sourcing to events, etc.&lt;/p&gt;
  18305.    &lt;/li&gt;
  18306.    &lt;li&gt;
  18307.      &lt;p&gt;&lt;b&gt;Culture:&lt;/b&gt; Culture fit is a very important piece and can be a challenge in hiring – it’s often the differentiating factor in a candidate’s decision-making process. This also contributes to companies making very costly hiring mistakes. In my experience recruiting in the U.S. and around the world, culture is something that you just can’t compromise. If you have a great culture, the other pieces will most likely fall into place. &lt;/p&gt;
  18308.    &lt;/li&gt;
  18309.    &lt;li&gt;
  18310.      &lt;p&gt;&lt;b&gt;Brand:&lt;/b&gt; This goes hand-in-hand with a great company culture. Your employee brand relates directly to how you develop your internal culture. How you present your company on the web and in person will impact the employment brand image. Word spreads fast and a few negative remarks can really do damage.&lt;/p&gt;
  18311.    &lt;/li&gt;
  18312.    &lt;li&gt;
  18313.      &lt;p&gt;&lt;b&gt;Employee Referrals:&lt;/b&gt; Great people know great people and your company should have an “everyone recruits” culture. Great employee referrals are the backbone and often critical to a company’s overall success.&lt;/p&gt;
  18314.    &lt;/li&gt;
  18315.  &lt;/ul&gt;
  18316. &lt;/blockquote&gt;
  18317.  
  18318. &lt;p&gt;Interested in Rollin’ with us? &lt;a href=&quot;http://www.adroll.com/about/careers&quot;&gt;Check out our careers page today&lt;/a&gt;!&lt;/p&gt;
  18319.  
  18320. &lt;center&gt;
  18321. &lt;img src=&quot;/images/post_images/secrets_to_startup_staffing.jpg&quot; /&gt;
  18322. &lt;/center&gt;
  18323. </description>
  18324.    </item>
  18325.    
  18326.    
  18327.    
  18328.    <item>
  18329.      <title>Out With the Old, In With the New Dashboard</title>
  18330.      <link>https://tech.nextroll.com/blog/dev/2013/01/29/new-dashboard.html</link>
  18331.      <pubDate>Tue, 29 Jan 2013 00:00:00 -0800</pubDate>
  18332.      <author></author>
  18333.      <guid isPermaLink="false">https://tech.nextroll.com/blog/dev/2013/01/29/new-dashboard</guid>
  18334.      <description>&lt;p&gt;We’ve been really busy over here at &lt;a href=&quot;http://blog.adroll.com/adrolls-hq-shangrolla&quot;&gt;ShangRolla&lt;/a&gt;. One of the largest projects that we’ve tackled is a complete redesign of the interface used to set up, measure, and manage your AdRoll campaigns. We’re excited to offer you access, and we hope that you’ll take it for a spin.&lt;/p&gt;
  18335.  
  18336. &lt;center&gt;
  18337. &lt;img alt=&quot;New Dashboard Screenshot&quot; src=&quot;/images/post_images/new_dashboard_1.png&quot; /&gt;
  18338. &lt;/center&gt;
  18339.  
  18340. &lt;p&gt;This version is just the beginning. It’s in beta, so it’s a work in progress, and there could be an occasional bug. But even in its early state, we believe that it will improve your AdRoll experience and make it even easier to run incredibly effective campaigns. Over the next few weeks, we’ll be adding a ton of new features, many of which you’ve requested. We’ve also made it very easy to switch back and forth between the old and new dashboard.&lt;/p&gt;
  18341.  
  18342. &lt;center&gt;
  18343. &lt;img alt=&quot;New Dashboard Screenshot&quot; src=&quot;/images/post_images/new_dashboard_2.png&quot; /&gt;
  18344. &lt;/center&gt;
  18345.  
  18346. &lt;p&gt;At AdRoll, we truly value feedback. We could not have done this without the customers who offered up great ideas and constructive ways to improve our product. Ultimately we want to hear what you think. The more you speak up, the better we can make this. So use the feedback forum in the footer of the new dashboard. Let us know if you find bugs. Make feature requests.  And if you’ve got some great ideas, we’d love to speak to you directly. Just email beta@adroll.com and we’ll schedule a time to chat.&lt;/p&gt;
  18347.  
  18348. &lt;h3 id=&quot;a-few-highlights&quot;&gt;A Few Highlights:&lt;/h3&gt;
  18349.  
  18350. &lt;blockquote&gt;
  18351.  &lt;p&gt;&lt;b&gt;1. Self service Facebook Retargeting:&lt;/b&gt; Easily retarget your customers on the new Facebook Exchange.&lt;/p&gt;
  18352.  &lt;p&gt;&lt;img src=&quot;/images/post_images/self_service_facebook_retargeting.png&quot; alt=&quot;Self Service Facebook Retargeting&quot; /&gt;&lt;/p&gt;
  18353.  
  18354.  &lt;p&gt;&lt;b&gt;2. Multi-ad upload:&lt;/b&gt; drag and drop multiple files or browse to select multiple ads at a time.&lt;/p&gt;
  18355.  &lt;p&gt;&lt;img src=&quot;/images/post_images/multi-ad-upload.png&quot; alt=&quot;Multi-ad Upload&quot; /&gt;&lt;/p&gt;
  18356.  
  18357.  &lt;p&gt;&lt;b&gt;3. Simplified campaign setup and editing&lt;/b&gt;&lt;/p&gt;
  18358.  &lt;p&gt;&lt;img src=&quot;/images/post_images/simplified_campaign_setup_and_editing.png&quot; alt=&quot;Simplified Campaign Setup and Editing&quot; /&gt;&lt;/p&gt;
  18359.  
  18360.  &lt;p&gt;&lt;b&gt;4. Improved creative management tools&lt;/b&gt;&lt;/p&gt;
  18361.  &lt;p&gt;&lt;img src=&quot;/images/post_images/improved_creative_management.png&quot; alt=&quot;Improved Creative Management Tools&quot; /&gt;&lt;/p&gt;
  18362. &lt;/blockquote&gt;
  18363. </description>
  18364.    </item>
  18365.    
  18366.    
  18367.    
  18368.    <item>
  18369.      <title>New Roller in the House</title>
  18370.      <link>https://tech.nextroll.com/blog/careers/2012/06/22/new-roller-in-house.html</link>
  18371.      <pubDate>Fri, 22 Jun 2012 00:00:00 -0700</pubDate>
  18372.      <author></author>
  18373.      <guid isPermaLink="false">https://tech.nextroll.com/blog/careers/2012/06/22/new-roller-in-house</guid>
  18374.      <description>&lt;p&gt;Meet Andrew Pascoe, data scientist!&lt;/p&gt;
  18375.  
  18376. &lt;blockquote&gt;
  18377.  &lt;p&gt;&lt;img src=&quot;/images/post_images/pascoe_joins.jpg&quot; alt=&quot;Andrew Pascoe&quot; /&gt;&lt;/p&gt;
  18378. &lt;/blockquote&gt;
  18379.  
  18380. &lt;p&gt;At AdRoll, product is everything. In order to continue offering our customers the most innovative and effective platform in the retargeting space, we’ve made some &lt;a href=&quot;http://blog.adroll.com/hires-new-branding-more&quot;&gt;rockstar additions&lt;/a&gt; to our engineering team as of late, growing our eng. team 50% this year.&lt;/p&gt;
  18381.  
  18382. &lt;p&gt;Andrew will contribute to the algorithms powering our efficient real-time bidder (RTBoodah) and the intelligent crunching of the vast amounts of data we collect. Before becoming a Roller, Andrew worked as a mathematician for Dean Kamen’s DEKA Research &amp;amp; Development. Unfamiliar with DEKA? It’s the leading pioneer in innovative health and medical devices as well as the creator of the fan favorite Segway. Andrew applied fluid mechanics, Hamiltonian dynamics, and many more highly evolved niche sciences to new inventions.&lt;/p&gt;
  18383.  
  18384. &lt;p&gt;Before his time at DEKA, Andrew was also selected to work at the National Security Agency where he parsed huge data sets and applied statistical techniques and machine learning algorithms. Later he published his research internally and presented his findings to the Director of the NSA himself. Our team is still intently working on prying our nation’s security secrets out of him.&lt;/p&gt;
  18385. </description>
  18386.    </item>
  18387.    
  18388.    
  18389.    
  18390.    <item>
  18391.      <title>BuiltWith Lists AdRoll as Fastest Growing Ad Tech Company</title>
  18392.      <link>https://tech.nextroll.com/blog/adroll/2011/10/13/builtwith-lists-adroll-as-fastest-growing-ad-tech-company.html</link>
  18393.      <pubDate>Thu, 13 Oct 2011 00:00:00 -0700</pubDate>
  18394.      <author></author>
  18395.      <guid isPermaLink="false">https://tech.nextroll.com/blog/adroll/2011/10/13/builtwith-lists-adroll-as-fastest-growing-ad-tech-company</guid>
  18396.      <description>&lt;p&gt;There are many ways you can tell your startup is growing.  Of course, we have our own internal metrics around the hundreds of new retargeting advertisers who start using AdRoll every month.  We’ve also found that two additional growth indicators are an increase in number of ramen packages consumed and an increase in boxes of gear delivered.&lt;/p&gt;
  18397.  
  18398. &lt;p&gt;As much as we love ramen and gear, there’s one thing we love more: neutral, 3rd-party data validation.&lt;/p&gt;
  18399.  
  18400. &lt;p&gt;This week we’re especially excited about topping &lt;a href=&quot;http://trends.builtwith.com/nostats/growth#!oneYear&quot;&gt;BuiltWith.com&lt;/a&gt;’s list of fastest growing online advertising technologies over the past year. BuiltWith is a leading provider of website technology trends analysis and competitor intelligence products. They crawl the web, inspecting each site, and figure out which tech said sites are “built with”, so to speak. It’s a great feeling to be acknowledged by esteemed peers in the tech world. Without further ado…&lt;/p&gt;
  18401.  
  18402. &lt;center&gt;
  18403. &lt;img alt=&quot;Fastest Growing Advertising Technology&quot; src=&quot;/images/post_images/builtwith.jpg&quot; width=&quot;400&quot; /&gt;
  18404. &lt;/center&gt;
  18405. </description>
  18406.    </item>
  18407.    
  18408.    
  18409.  
  18410.  </channel>
  18411. </rss>
  18412.  
Copyright © 2002-9 Sam Ruby, Mark Pilgrim, Joseph Walton, and Phil Ringnalda