[Valid RSS] This is a valid RSS feed.


This feed is valid, but interoperability with the widest range of feed readers could be improved by implementing the following recommendations.


  1. <?xml version="1.0" encoding="UTF-8"?>
  2. <?xml-stylesheet type="text/xsl" href="" media="screen" ?>
  3. <rdf:RDF xmlns:rdf="" xmlns:dc="" xmlns="">
  4. <channel rdf:about="">
  5. <title>Ned Batchelder's blog</title>
  6. <link></link>
  7. <description>Ned Batchelder's personal blog.</description>
  8. <dc:language>en-US</dc:language>
  9. <image rdf:resource=""/>
  10. <items>
  11. <rdf:Seq>
  12. <rdf:li resource=""/><rdf:li resource=""/><rdf:li resource=""/><rdf:li resource=""/><rdf:li resource=""/><rdf:li resource=""/><rdf:li resource=""/><rdf:li resource=""/><rdf:li resource=""/><rdf:li resource=""/>
  13. </rdf:Seq>
  14. </items>
  15. </channel>
  16. <image rdf:about="">
  17. <title>Ned Batchelder's blog</title>
  18. <link></link>
  19. <url></url>
  20. </image>
  21. <item rdf:about="">
  22. <title>Re-using my presentations</title>
  23. <link></link>
  24. <dc:date>2020-02-13T15:51:00-05:00</dc:date>
  25. <dc:creator>Ned Batchelder</dc:creator>
  26. <description><![CDATA[<p>Yesterday I got an email saying that someone in Turkey had stolen one of my
  27. presentations. The email included a YouTube link.  The video showed a meetup.
  28. The presenter (I&#8217;ll call him Samuel) was standing in front of a title slide in
  29. my style that said, &#8220;Big-O: How Code Slows as Data Grows,&#8221; which is the title of
  30. my <a href="" rel="external">PyCon 2018 talk</a>.</p><p>The video was in Turkish, so I couldn&#8217;t tell exactly what Samuel was saying,
  31. but I scrolled through the video, and sure enough, it was my entire talk,
  32. complete with
  33. <a href="" rel="external">illustrations by my
  34. son Ben</a>.</p><p>Looking closer, the title slide had been modified:</p><p class="figure"><img src="//" alt="My title slide, with someone else's name" width="1023" height="780"></p><p>(I&#8217;ve blurred Samuel&#8217;s specifics in this image, and Samuel is not his actual
  35. name.  This post isn&#8217;t about Samuel, and I&#8217;m not interested in directing any
  36. more negative attention to him.)</p><p>Scrolling to the end of the talk, my last slide, which repeated my name and
  37. contact details, was gone.  In its place was a slide promoting other videos
  38. featuring Samuel or his firm.</p><p>I felt like I had been forcibly elbowed off the stage, and Samuel was taking
  39. my place while trying to minimize my contributions.</p><p>In 2018, I did two things for this presentation: I wrote it, and I presented
  40. it at PyCon 2018.  By far the most work was in the writing.  It takes months of
  41. thinking, writing, designing, and honing to make a good presentation.  In fact,
  42. of the two types of work, Samuel valued the writing more, since that is the part
  43. he kept.  The reason this presentation attracted his attention, and why he
  44. wanted to present it himself, was its content.</p><p>&#8220;Originally presented by&#8221; is hardly the way to credit the author of a
  45. presentation, especially in small type while removing his name and leaving only
  46. a GitHub handle.</p><p>So I tweeted,</p><blockquote><div><p>This is my talk from PyCon 2018, in its entirety, with my name nearly
  47. removed. It&#8217;s theft. I was not asked, and did not give permission.</p></div></blockquote><p>Samuel apologized and took down the video. There were other tweets claiming
  48. that this was a pattern of Samuel&#8217;s, and that perhaps the apology would not be
  49. followed by changed behavior.  But again, this post isn&#8217;t about Samuel.</p><p>This whole event got me thinking about people re-using my presentations.</p><p>I enjoy writing presentations.  I like thinking about how to explain things.
  50. People have liked the explanations I&#8217;ve written.  I like that they like them
  51. enough to want to show them to people.</p><p>But I&#8217;ve never thought much about how I would answer if someone asked me if
  52. they could present one of my talks.  If people can use my talks to help
  53. strengthen their local community and up-skill their members, I want them to be
  54. able to. I am not interested in people using my talks to unfairly promote
  55. themselves.</p><p>I&#8217;m not sure re-using someone else&#8217;s presentation is a good idea.  Wouldn&#8217;t
  56. it be better to write your own talk based on what you learned from someone
  57. else&#8217;s?  But if people want to re-use a talk, I&#8217;d like to have an answer.</p><p>So here are my first-cut guidelines for re-using one of my talks:</p><ol>
  59. <li>Ask me if you can use a talk. If I say no, then you can&#8217;t.</li>
  61. <li>Don&#8217;t change the main title slide. I wrote the presentation, so my name should
  62. be on it.  If you were lecturing about a novel, you wouldn&#8217;t hand out copies of
  63. the book with your name in place of the author&#8217;s.</li>
  65. <li>Make clear during the presentation that I was the author and first
  66. presenter.  A way to do that would be to include a slide about that first event,
  67. with links, and maybe even a screenshot of me from the video recording of the
  68. first event.</li>
  70. <li>I include a short-link and my Twitter handle in the footer of my
  71. slides.  Leave these in place.  We live in a social online world. I want to
  72. benefit from the connections that might arise from one of my presentations.</li>
  74. <li>Keep my name and contact details prominent in the end slide.</li>
  76. <li>If your video is posted online, include my name and the point about this
  77. being a re-run in the first paragraph of the description.</li>
  79. </ol><p>It would be great if my talks could get a broader reach than I can make
  80. happen all by myself.  To be honest, I&#8217;m still not sure if it&#8217;s a good idea to
  81. present someone else&#8217;s talk, but it&#8217;s better to do it this way than the way
  82. that just happened.</p>
  83. ]]></description>
  84. </item>
  85. <item rdf:about="">
  86. <title>sys.getsizeof is not what you want</title>
  87. <link></link>
  88. <dc:date>2020-02-09T07:54:00-05:00</dc:date>
  89. <dc:creator>Ned Batchelder</dc:creator>
  90. <description><![CDATA[<p>This week at work, an engineer mentioned that they were looking at the sizes
  91. of data returned by an API, and it was always coming out the same, which seemed
  92. strange.  It turned out the data was a dict, and they were looking at the size
  93. with <a href="" rel="external">sys.getsizeof</a>.</p><p>Sounds great! sys.getsizeof has an appealing name, and the description in the
  94. docs seems really good:</p><blockquote><div><p>sys.<b>getsizeof</b>(<i>object</i>)<br>
  95.    Return the size of an object in bytes. The object can be any type of object.
  96.    All built-in objects will return correct results [...]</p></div></blockquote><p>But the fact is, sys.getsizeof is almost never what you want, for two
  97. reasons: it doesn&#8217;t count all the bytes, and it counts the wrong bytes.</p><p>The docs go on to say:</p><blockquote><div><p>Only the memory consumption directly attributed to the object is
  98.    accounted for, not the memory consumption of objects it refers
  99.    to.</p></div></blockquote><p>This is why it doesn&#8217;t count all the bytes.  In the case of a dictionary,
  100. &#8220;objects it refers to&#8221; includes all of the keys and values.  getsizeof is only
  101. reporting on the memory occupied by the internal table the dict uses to track
  102. all the keys and values, not the size of the keys and values themselves.
  103. In other words, it tells you about the internal bookkeeping, and not any of your
  104. actual data!</p><p>The reason my co-worker&#8217;s API responses were all the same size was that
  105. they were dictionaries with the same number of keys, and getsizeof was ignoring
  106. all the keys and values when reporting the size:</p><blockquote class="code"><code><span class="gp">&gt;&gt;&gt;&#xA0;</span><span class="n">d1</span>&#xA0;<span class="o">=</span>&#xA0;<span class="p">{</span><span class="s2">&quot;a&quot;</span><span class="p">:</span>&#xA0;<span class="s2">&quot;a&quot;</span><span class="p">,</span>&#xA0;<span class="s2">&quot;b&quot;</span><span class="p">:</span>&#xA0;<span class="s2">&quot;b&quot;</span><span class="p">,</span>&#xA0;<span class="s2">&quot;c&quot;</span><span class="p">:</span>&#xA0;<span class="s2">&quot;c&quot;</span><span class="p">}</span>
  107. <br><span class="gp">&gt;&gt;&gt;&#xA0;</span><span class="n">d2</span>&#xA0;<span class="o">=</span>&#xA0;<span class="p">{</span><span class="s2">&quot;a&quot;</span><span class="p">:</span>&#xA0;<span class="s2">&quot;a&quot;</span><span class="o">*</span><span class="mi">100</span><span class="n">_000</span><span class="p">,</span>&#xA0;<span class="s2">&quot;b&quot;</span><span class="p">:</span>&#xA0;<span class="s2">&quot;b&quot;</span><span class="o">*</span><span class="mi">100</span><span class="n">_000</span><span class="p">,</span>&#xA0;<span class="s2">&quot;c&quot;</span><span class="p">:</span>&#xA0;<span class="s2">&quot;c&quot;</span><span class="o">*</span><span class="mi">100</span><span class="n">_000</span><span class="p">}</span>
  108. <br><span class="gp">&gt;&gt;&gt;&#xA0;</span><span class="n">sys</span><span class="o">.</span><span class="n">getsizeof</span><span class="p">(</span><span class="n">d1</span><span class="p">)</span>
  109. <br><span class="go">232</span>
  110. <br><span class="gp">&gt;&gt;&gt;&#xA0;</span><span class="n">sys</span><span class="o">.</span><span class="n">getsizeof</span><span class="p">(</span><span class="n">d2</span><span class="p">)</span>
  111. <br><span class="go">232</span>
  112. <br></code></blockquote><p>If you wanted to know how large all the keys and values were, you could sum
  113. their lengths:</p><blockquote class="code"><code><span class="gp">&gt;&gt;&gt;&#xA0;</span><span class="k">def</span>&#xA0;<span class="nf">key_value_length</span><span class="p">(</span><span class="n">d</span><span class="p">):</span>
  114. <br><span class="gp">...&#xA0;</span>&#xA0;&#xA0;&#xA0;&#xA0;<span class="n">klen</span>&#xA0;<span class="o">=</span>&#xA0;<span class="nb">sum</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">k</span><span class="p">)</span>&#xA0;<span class="k">for</span>&#xA0;<span class="n">k</span>&#xA0;<span class="ow">in</span>&#xA0;<span class="n">d</span><span class="o">.</span><span class="n">keys</span><span class="p">())</span>
  115. <br><span class="gp">...&#xA0;</span>&#xA0;&#xA0;&#xA0;&#xA0;<span class="n">vlen</span>&#xA0;<span class="o">=</span>&#xA0;<span class="nb">sum</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">v</span><span class="p">)</span>&#xA0;<span class="k">for</span>&#xA0;<span class="n">v</span>&#xA0;<span class="ow">in</span>&#xA0;<span class="n">d</span><span class="o">.</span><span class="n">values</span><span class="p">())</span>
  116. <br><span class="gp">...&#xA0;</span>&#xA0;&#xA0;&#xA0;&#xA0;<span class="k">return</span>&#xA0;<span class="n">klen</span>&#xA0;<span class="o">+</span>&#xA0;<span class="n">vlen</span>
  117. <br><span class="gp">...</span>
  118. <br><span class="gp">&gt;&gt;&gt;&#xA0;</span><span class="n">key_value_length</span><span class="p">(</span><span class="n">d1</span><span class="p">)</span>
  119. <br><span class="go">6</span>
  120. <br><span class="gp">&gt;&gt;&gt;&#xA0;</span><span class="n">key_value_length</span><span class="p">(</span><span class="n">d2</span><span class="p">)</span>
  121. <br><span class="go">300003</span>
  122. <br></code></blockquote><p>You might ask, why is getsizeof like this? Wouldn&#8217;t it be more useful if it
  123. gave you the size of the whole dictionary, including its contents? Well, it&#8217;s
  124. not so simple.  Data in memory can be shared:</p><blockquote class="code"><code><span class="gp">&gt;&gt;&gt;&#xA0;</span><span class="n">x100k</span>&#xA0;<span class="o">=</span>&#xA0;<span class="s2">&quot;x&quot;</span>&#xA0;<span class="o">*</span>&#xA0;<span class="mi">100</span><span class="n">_000</span>
  125. <br><span class="gp">&gt;&gt;&gt;&#xA0;</span><span class="n">d3</span>&#xA0;<span class="o">=</span>&#xA0;<span class="p">{</span><span class="s2">&quot;a&quot;</span><span class="p">:</span>&#xA0;<span class="n">x100k</span><span class="p">,</span>&#xA0;<span class="s2">&quot;b&quot;</span><span class="p">:</span>&#xA0;<span class="n">x100k</span><span class="p">,</span>&#xA0;<span class="s2">&quot;c&quot;</span><span class="p">:</span>&#xA0;<span class="n">x100k</span><span class="p">}</span>
  126. <br><span class="gp">&gt;&gt;&gt;&#xA0;</span><span class="n">key_value_length</span><span class="p">(</span><span class="n">d3</span><span class="p">)</span>
  127. <br><span class="go">300003</span>
  128. <br></code></blockquote><p>Here there are three values, each 100k characters, but in fact, they are all
  129. the same value, actually the same object in memory.  That 100k string only
  130. exists once.  Is the &#8220;complete&#8221; size of the dict 300k? Or only 100k?</p><p>It depends on why you are asking about the size.  Our d3 dict is only about
  131. 100k bytes in RAM, but if we try to write it out, it will probably be about 300k
  132. bytes.</p><p>And sys.getsizeof also reports on the wrong bytes:</p><blockquote class="code"><code><span class="gp">&gt;&gt;&gt;&#xA0;</span><span class="n">sys</span><span class="o">.</span><span class="n">getsizeof</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
  133. <br><span class="go">28</span>
  134. <br><span class="gp">&gt;&gt;&gt;&#xA0;</span><span class="n">sys</span><span class="o">.</span><span class="n">getsizeof</span><span class="p">(</span><span class="s2">&quot;a&quot;</span><span class="p">)</span>
  135. <br><span class="go">50</span>
  136. <br></code></blockquote><p>Huh? How can a small integer be 28 bytes? And the one-character string &#8220;a&#8221; is
  137. 50 bytes!? It&#8217;s because Python objects have internal bookkeeping, like links to
  138. their type, and reference counts for managing memory.  That extra bookkeeping
  139. is overhead per-object, and sys.getsizeof includes that overhead.</p><p>Because sys.getsizeof reports on internal details, it can be baffling:</p><blockquote class="code"><code><span class="gp">&gt;&gt;&gt;&#xA0;</span><span class="n">sys</span><span class="o">.</span><span class="n">getsizeof</span><span class="p">(</span><span class="s2">&quot;a&quot;</span><span class="p">)</span>
  140. <br><span class="go">50</span>
  141. <br><span class="gp">&gt;&gt;&gt;&#xA0;</span><span class="n">sys</span><span class="o">.</span><span class="n">getsizeof</span><span class="p">(</span><span class="s2">&quot;ab&quot;</span><span class="p">)</span>
  142. <br><span class="go">51</span>
  143. <br><span class="gp">&gt;&gt;&gt;&#xA0;</span><span class="n">sys</span><span class="o">.</span><span class="n">getsizeof</span><span class="p">(</span><span class="s2">&quot;abc&quot;</span><span class="p">)</span>
  144. <br><span class="go">52</span>
  145. <br><span class="gp">&gt;&gt;&gt;&#xA0;</span><span class="n">sys</span><span class="o">.</span><span class="n">getsizeof</span><span class="p">(</span><span class="s2">&quot;á&quot;</span><span class="p">)</span>
  146. <br><span class="go">74</span>
  147. <br><span class="gp">&gt;&gt;&gt;&#xA0;</span><span class="n">sys</span><span class="o">.</span><span class="n">getsizeof</span><span class="p">(</span><span class="s2">&quot;áb&quot;</span><span class="p">)</span>
  148. <br><span class="go">75</span>
  149. <br><span class="gp">&gt;&gt;&gt;&#xA0;</span><span class="n">sys</span><span class="o">.</span><span class="n">getsizeof</span><span class="p">(</span><span class="s2">&quot;ábc&quot;</span><span class="p">)</span>
  150. <br><span class="go">76</span>
  151. <br><span class="gp">&gt;&gt;&gt;&#xA0;</span><span class="n">face</span>&#xA0;<span class="o">=</span>&#xA0;<span class="s2">&quot;</span><span class="se">\N{GRINNING&#xA0;FACE}</span><span class="s2">&quot;</span>
  152. <br><span class="gp">&gt;&gt;&gt;&#xA0;</span><span class="nb">len</span><span class="p">(</span><span class="n">face</span><span class="p">)</span>
  153. <br><span class="go">1</span>
  154. <br><span class="gp">&gt;&gt;&gt;&#xA0;</span><span class="n">sys</span><span class="o">.</span><span class="n">getsizeof</span><span class="p">(</span><span class="n">face</span><span class="p">)</span>
  155. <br><span class="go">80</span>
  156. <br><span class="gp">&gt;&gt;&gt;&#xA0;</span><span class="n">sys</span><span class="o">.</span><span class="n">getsizeof</span><span class="p">(</span><span class="n">face</span>&#xA0;<span class="o">+</span>&#xA0;<span class="s2">&quot;b&quot;</span><span class="p">)</span>
  157. <br><span class="go">84</span>
  158. <br><span class="gp">&gt;&gt;&gt;&#xA0;</span><span class="n">sys</span><span class="o">.</span><span class="n">getsizeof</span><span class="p">(</span><span class="n">face</span>&#xA0;<span class="o">+</span>&#xA0;<span class="s2">&quot;bc&quot;</span><span class="p">)</span>
  159. <br><span class="go">88</span>
  160. <br></code></blockquote><p>With an ASCII string, we start at 50 bytes, and need one more byte for each
  161. ASCII character.  With an accented character, we start at 74, but still only
  162. need one more byte for each ASCII character.  With an exotic Unicode character
  163. (expressed here with the little-used \N Unicode name escape), we start at 80,
  164. and then need four bytes for each ASCII character we add!  Why?  Because Python
  165. has a complex internal representation for strings. I don&#8217;t know why those
  166. numbers are the way they are.
  167. <a href="" rel="external">PEP 393</a> has the details
  168. if you are curious. The point here is: sys.getsizeof is almost certainly not the
  169. thing you want.</p><p>The &#8220;size&#8221; of a thing depends on how the thing is being represented. The
  170. in-memory Python data structures are one representation.  When the data is
  171. serialized to JSON, that will be another representation, with completely
  172. different reasons for the size it becomes.</p><p>In my co-worker&#8217;s case, the real question was, how many bytes will this be
  173. when written as CSV?  The sum-of-len method would be much closer to the right
  174. answer than sys.getsizeof.  But even sum-of-len might not be good enough,
  175. depending on how accurate the answer has to be.  Quoting rules and punctuation
  176. overhead change the exact length.  It might be that the only way to get an
  177. accurate enough answer is to serialize to CSV and check the actual result.</p><p>So: know what question you are really asking, and choose the right tool for
  178. the job. sys.getsizeof is almost never the right tool.</p>
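<p>As a rough sketch of that last suggestion, here is one way to serialize to CSV in memory and count the actual bytes. The helper name csv_size_bytes, the use of csv.DictWriter, and the UTF-8 encoding are assumptions for illustration, not what my co-worker used; a different dialect or encoding will give different numbers.</p>

```python
import csv
import io


def csv_size_bytes(rows, fieldnames, encoding="utf-8"):
    """Write dict rows as CSV into memory and return the encoded size in bytes.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    buffer = io.StringIO(newline="")  # keep the csv module's own line endings
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return len(buffer.getvalue().encode(encoding))


x100k = "x" * 100_000
d1 = {"a": "a", "b": "b", "c": "c"}
d3 = {"a": x100k, "b": x100k, "c": x100k}  # one shared string, three references

small = csv_size_bytes([d1], fieldnames=["a", "b", "c"])
large = csv_size_bytes([d3], fieldnames=["a", "b", "c"])
# A value containing a comma gets quoted, which adds bytes that sum-of-len misses.
quoted = csv_size_bytes([{"a": "a,a", "b": "b", "c": "c"}], fieldnames=["a", "b", "c"])
print(small, large, quoted)
```

<p>The shared 100k string is written out three times here, so the serialized size tracks the 300k answer rather than the 100k one, and the header row, separators, line endings, and quoting all show up in the count.</p>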
  179. ]]></description>
  180. </item>
  181. <item rdf:about="">
  182. <title>Color palette tools</title>
  183. <link></link>
  184. <dc:date>2020-01-15T07:25:11-05:00</dc:date>
  185. <dc:creator>Ned Batchelder</dc:creator>
  186. <description><![CDATA[<p>Two useful sites for choosing color palettes, both from map-making
  187. backgrounds.  They both consider qualitative, sequential, and diverging palettes
  188. as different needs, which I found insightful.</p><ul>
  190. <li><a href="" rel="external">Paul Tol&#8217;s notes</a>, which gives
  191. special consideration to color-blindness.  He has some visual demonstrations
  192. that picked up my own slight color-blindness.</li>
  194. <li>Cynthia Brewer&#8217;s <a href="" rel="external">ColorBrewer</a>, with
  195. interactive elements so you can create your own palette for your particular
  196. needs.</li>
  198. </ul><p><a href="" rel="external">Color Palette Ideas</a> is different:
  199. its palettes are based on photographs, but it can also be a good source for ideas.</p><p>As an update to my <a href="//">ancient blog post</a>
  200. about this same topic, <a href="" rel="external">Adobe Color</a> and
  201. <a href="" rel="external">paletton</a> both have tools for generating
  202. palettes in lots of over-my-head ways.  And <a href="" rel="external">Color Synth Axis</a>
  203. is still very appealing to the geek in me, though it needs Flash, and so I fear
  204. is not long for this world...</p>
  205. ]]></description>
  206. </item>
  207. <item rdf:about="">
  208. <title>Bug #915: solved!</title>
  209. <link></link>
  210. <dc:date>2020-01-13T06:15:00-05:00</dc:date>
  211. <dc:creator>Ned Batchelder</dc:creator>
  212. <description><![CDATA[<p>Yesterday I pleaded,
  213. <a href="//">Bug #915: please help!</a>
  214. It got posted to
  215. <a href="" rel="external">Hacker News</a>,
  216. where Robert Xiao (nneonneo) did some impressive debugging and
  217. <a href="" rel="external">found the answer</a>.
  218. </p><p>The user&#8217;s code used mocks to simulate an OSError when trying to make
  219. temporary files
  220. (<a href="" rel="external">source</a>):</p><blockquote class="code"><code><span class="k">with</span>&#xA0;<span class="n">patch</span><span class="p">(</span><span class="s1">&#39;tempfile._TemporaryFileWrapper&#39;</span><span class="p">)</span>&#xA0;<span class="k">as</span>&#xA0;<span class="n">mock_ntf</span><span class="p">:</span>
  221. <br>&#xA0;&#xA0;&#xA0;&#xA0;<span class="n">mock_ntf</span><span class="o">.</span><span class="n">side_effect</span>&#xA0;<span class="o">=</span>&#xA0;<span class="ne">OSError</span><span class="p">()</span>
  222. <br></code></blockquote><p>Inside tempfile.NamedTemporaryFile, the error handling misses the possibility
  223. that _TemporaryFileWrapper will fail
  224. (<a href="" rel="external">source</a>):</p><blockquote class="code"><code><span class="p">(</span><span class="n">fd</span><span class="p">,</span>&#xA0;<span class="n">name</span><span class="p">)</span>&#xA0;<span class="o">=</span>&#xA0;<span class="n">_mkstemp_inner</span><span class="p">(</span><span class="nb">dir</span><span class="p">,</span>&#xA0;<span class="n">prefix</span><span class="p">,</span>&#xA0;<span class="n">suffix</span><span class="p">,</span>&#xA0;<span class="n">flags</span><span class="p">,</span>&#xA0;<span class="n">output_type</span><span class="p">)</span>
  225. <br><span class="k">try</span><span class="p">:</span>
  226. <br>&#xA0;&#xA0;&#xA0;&#xA0;<span class="nb">file</span>&#xA0;<span class="o">=</span>&#xA0;<span class="n">_io</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="n">fd</span><span class="p">,</span>&#xA0;<span class="n">mode</span><span class="p">,</span>&#xA0;<span class="n">buffering</span><span class="o">=</span><span class="n">buffering</span><span class="p">,</span>
  227. <br>&#xA0;&#xA0;&#xA0;&#xA0;&#xA0;&#xA0;&#xA0;&#xA0;&#xA0;&#xA0;&#xA0;&#xA0;&#xA0;&#xA0;&#xA0;&#xA0;&#xA0;&#xA0;&#xA0;&#xA0;<span class="n">newline</span><span class="o">=</span><span class="n">newline</span><span class="p">,</span>&#xA0;<span class="n">encoding</span><span class="o">=</span><span class="n">encoding</span><span class="p">,</span>&#xA0;<span class="n">errors</span><span class="o">=</span><span class="n">errors</span><span class="p">)</span>
  228. <br>
  229. <br>&#xA0;&#xA0;&#xA0;&#xA0;<span class="k">return</span>&#xA0;<span class="n">_TemporaryFileWrapper</span><span class="p">(</span><span class="nb">file</span><span class="p">,</span>&#xA0;<span class="n">name</span><span class="p">,</span>&#xA0;<span class="n">delete</span><span class="p">)</span>
  230. <br><span class="k">except</span>&#xA0;<span class="ne">BaseException</span><span class="p">:</span>
  231. <br>&#xA0;&#xA0;&#xA0;&#xA0;<span class="n">_os</span><span class="o">.</span><span class="n">unlink</span><span class="p">(</span><span class="n">name</span><span class="p">)</span>
  232. <br>&#xA0;&#xA0;&#xA0;&#xA0;<span class="n">_os</span><span class="o">.</span><span class="n">close</span><span class="p">(</span><span class="n">fd</span><span class="p">)</span>
  233. <br>&#xA0;&#xA0;&#xA0;&#xA0;<span class="k">raise</span>
  234. <br></code></blockquote><p>If _TemporaryFileWrapper fails, the file descriptor fd is closed, but the
  235. file object referencing it still exists.  Eventually, it will be garbage
  236. collected, and the file descriptor it references will be closed again.</p><p>But file descriptors are just small integers which will be reused.  The
  237. failure in bug 915 is that the file descriptor did get reused, by SQLite.  When
  238. the garbage collector eventually reclaimed the file object leaked by
  239. NamedTemporaryFile, it closed a file descriptor that SQLite was using. Boom.</p><p>There are two improvements to be made here. First, the user code should be
  240. mocking public functions, not internal details of the Python stdlib. In
  241. fact, the variable is already named mock_ntf as if it had been a mock of
  242. NamedTemporaryFile at some point.</p><p>NamedTemporaryFile would be a better mock because that is the function being
  243. used by the user&#8217;s code.  Mocking _TemporaryFileWrapper is relying on an
  244. internal detail of the standard library.</p><p>The other improvement is to close the leak in NamedTemporaryFile.
  245. That request is now <a href="" rel="external">bpo39318</a>.
  246. As it happens, the leak had also been reported as
  247. <a href="" rel="external">bpo21058</a> and
  248. <a href="" rel="external">bpo26385</a>.</p><p>Lessons learned:</p><ul>
  250. <li>Hacker News can be helpful, in spite of the tangents about shell
  251. redirection, authorship attribution, and GitHub monoculture.</li>
  253. <li>There are always people more skilled at debugging. I had no idea you could
  254. <a href="" rel="external">script gdb</a>.</li>
  256. <li>Error handling is hard to get right. Edge cases can be really subtle.
  257. Bugs can linger for years.</li>
  259. </ul><p>I named Robert Xiao at the top, but lots of people chipped in effort to help
  260. get to the bottom of this. ikanobori posted it to Hacker News in the first
  261. place. Chris Caron reported the original #915 and stuck with the process as it
  262. dragged on. Thanks everybody.</p>
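<p>For the first improvement above, mocking the public function instead of the internal wrapper, a minimal sketch might look like this. The save_report function is a hypothetical stand-in for the user&#8217;s code, not their actual implementation; the only real APIs assumed are unittest.mock.patch and tempfile.NamedTemporaryFile.</p>

```python
import tempfile
from unittest.mock import patch


def save_report(text):
    """Hypothetical code under test: write text to a named temporary file.

    Returns the file name, or None if the temporary file can't be created.
    """
    try:
        with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
            f.write(text)
            return f.name
    except OSError:
        return None


# Patch the public function the code actually calls, so no real file
# descriptor is ever opened and nothing can leak.
with patch("tempfile.NamedTemporaryFile", side_effect=OSError("simulated")) as mock_ntf:
    result = save_report("hello")

print(result, mock_ntf.call_count)
```

<p>Because the mock raises before any file is opened, the error path is exercised without touching the stdlib&#8217;s internal cleanup code at all.</p>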
  263. ]]></description>
  264. </item>
  265. <item rdf:about="">
  266. <title>Bug #915: please help!</title>
  267. <link></link>
  268. <dc:date>2020-01-12T10:17:52-05:00</dc:date>
  269. <dc:creator>Ned Batchelder</dc:creator>
  270. <description><![CDATA[<p><b><i>Updated:</i></b> this was solved on Hacker News. Details in
  271. <a href="//">Bug #915: solved!</a>
  272. </p><p>I just released 5.0.3, with two bug fixes.  There was another bug
  273. I really wanted to fix, but it has stumped me.  I&#8217;m hoping someone can figure it
  274. out.</p><p><a href="" rel="external">Bug #915</a>
  275. describes a disk I/O failure.  Thanks to some help from Travis support, Chris
  276. Caron has provided instructions for reproducing it in Docker, and they work: I
  277. can generate disk I/O errors at will.  What I can&#8217;t figure out is what
  278. is doing wrong that causes the errors.</p><p>To reproduce it, start a Travis-based docker image:</p><blockquote class="code"><code><span class="nv">cid</span><span class="o">=</span><span class="k">$(</span>docker&#xA0;run&#xA0;-dti&#xA0;--privileged<span class="o">=</span><span class="nb">true</span>&#xA0;--entrypoint<span class="o">=</span>/sbin/init&#xA0;<span class="se">\</span>
  279. <br>&#xA0;&#xA0;&#xA0;&#xA0;-v&#xA0;/sys/fs/cgroup:/sys/fs/cgroup:ro&#xA0;<span class="se">\</span>
  280. <br>&#xA0;&#xA0;&#xA0;&#xA0;travisci/ci-sardonyx:packer-1542104228-d128723<span class="k">)</span>
  281. <br>docker&#xA0;<span class="nb">exec</span>&#xA0;-it&#xA0;<span class="nv">$cid</span>&#xA0;/bin/bash
  282. <br></code></blockquote><p>Then in the container, run these commands:</p><blockquote class="code"><code>su&#xA0;-&#xA0;travis
  283. <br>git&#xA0;clone&#xA0;--branch<span class="o">=</span>nedbat/debug-915&#xA0;
  284. <br><span class="nb">cd</span>&#xA0;apprise-api
  285. <br><span class="nb">source</span>&#xA0;~/virtualenv/python3.6/bin/activate
  286. <br>pip&#xA0;install&#xA0;tox
  287. <br>tox&#xA0;-e&#xA0;bad,good
  288. <br></code></blockquote><p>This will run two tox environments, called <b>good</b> and <b>bad</b>.  Bad
  289. will fail with a disk I/O error, good will succeed.  The difference is that bad
  290. uses the pytest-cov plugin, good does not.  Two detailed debug logs will be
  291. created: debug-good.txt and debug-bad.txt.  They show what operations were
  292. executed in the SqliteDb class in</p><p>The Big Questions: Why does bad fail? What is it doing at the SQLite level
  293. that causes the failure? And most importantly, what can I change in
  294. to prevent the failure?</p><p>Some observations and questions:</p><ul>
  296. <li>If I change the last line of the steps to &#8220;tox -e good,bad&#8221; (that is, run
  297. the environments in the other order) then the error doesn&#8217;t happen. I don&#8217;t
  298. understand why that would make a difference.</li>
  300. <li>I&#8217;ve tried adding time.sleep calls to slow the pace of database access,
  301. but maybe not in enough places? And if this fixes it, what&#8217;s the right way to
  302. productize that change?</li>
  304. <li>I&#8217;ve tried using the detailed debug log to create a small Python program
  305. that in theory accesses the SQLite database in exactly the same way, but I
  306. haven&#8217;t managed to create the error that way.  What aspect of access am I
  307. overlooking?</li>
  309. </ul><p>If you come up with answers to any of these questions, I will reward you
  310. somehow. I am also eager to chat if that would help you solve the mysteries.
  311. I can be reached on <a href="javascript:nospam(%22ned%22,;">email</a>,
  312. <a href="" rel="external">Twitter</a>,
  313. as <a href="irc://" rel="external">nedbat on IRC</a>,
  314. or in <a href="" rel="external">Slack</a>. Please get in
  315. touch if you have any ideas. Thanks.</p>
  316. ]]></description>
  317. </item>
  318. <item rdf:about="">
  319. <title>Season’s greetings</title>
  320. <link></link>
  321. <dc:date>2019-12-25T06:01:38-05:00</dc:date>
  322. <dc:creator>Ned Batchelder</dc:creator>
  323. <description><![CDATA[<p>Our card this year, drawn by <a href="" rel="external">Ben</a>, of
  324. course.  The five gnomes are Susan, me, Ben, Max, and Nat:</p><p class="figure"><img src="//" alt="Five gnomes around a sweater-wearing tree" width="600" height="900" class="thinline"></p>
  325. ]]></description>
  326. </item>
  327. <item rdf:about="">
  328. <title>Fancy console output in GitHub comments</title>
  329. <link></link>
  330. <dc:date>2019-12-17T07:47:00-05:00</dc:date>
  331. <dc:creator>Ned Batchelder</dc:creator>
  332. <description><![CDATA[<p>Providing detailed command output in GitHub issues is hard: I want to be
  333. complete, but I don&#8217;t want to paste unreadable walls of text.  Some commands
  334. have long output that is usually uninteresting (pip install), but which every
  335. once in a while has a useful clue.  I want to include that output without making
  336. it hard to find the important stuff.</p><p>While working on an issue with <a href="//"> 5.0</a>,
  337. I came up with a way to show commands and their output that I think works
  338. well.</p><p>I used GitHub&#8217;s &lt;details&gt; support to
  339. <a href="" rel="external">show
  340. the commands I ran with their output in collapsible sections</a>.  I like the
  341. way it came out: you can copy all the commands, or open a section to see what
  342. happened for the command you&#8217;re interested in.</p><p>The raw markdown looks like this:</p><blockquote class="code"><code><span class="p">&lt;</span><span class="nt">details</span><span class="p">&gt;</span>
  343. <br><span class="p">&lt;</span><span class="nt">summary</span><span class="p">&gt;</span>cd&#xA0;meltano<span class="p">&lt;/</span><span class="nt">summary</span><span class="p">&gt;</span>
  344. <br><span class="p">&lt;/</span><span class="nt">details</span><span class="p">&gt;</span>
  345. <br>
  346. <br><span class="p">&lt;</span><span class="nt">details</span><span class="p">&gt;</span>
  347. <br><span class="p">&lt;</span><span class="nt">summary</span><span class="p">&gt;</span>pip&#xA0;install&#xA0;&#39;.[dev]&#39;<span class="p">&lt;/</span><span class="nt">summary</span><span class="p">&gt;</span>
  348. <br>
  349. <br>```
  350. <br>Processing&#xA0;/private/tmp/bug881a/meltano
  351. <br>Collecting&#xA0;aenum==2.1.2
  352. <br>&#xA0;&#xA0;Using&#xA0;cached&#xA0;
  353. <br>Collecting&#xA0;idna==2.7
  354. <br>&#xA0;&#xA0;Using&#xA0;cached&#xA0;
  355. <br>Collecting&#xA0;asn1crypto==0.24.0
  356. <br>&#xA0;&#xA0;Using&#xA0;cached&#xA0;
  357. <br>(etc)
  358. <br>```
  359. <br>
  360. <br><span class="p">&lt;/</span><span class="nt">details</span><span class="p">&gt;</span>
  361. <br></code></blockquote><p>(The GitHub renderer was very particular about the blank lines around the
  362. &lt;details&gt; and &lt;summary&gt; tags, so be sure to include them if you try
  363. this.)</p><p>Other people have done this: after I wrote this comment, one of the newer
  364. issues used the same technique, but with &lt;tt&gt; in the summaries to
  365. make them look like commands. Nice! There are a few manual steps to get that
  366. result, but I&#8217;ll be refining how to produce that style more conveniently from a
  367. terminal console.</p>
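<p>A minimal sketch of how those manual steps might be scripted, assuming Python 3.7 or later. The helper names details_block and run_and_wrap are hypothetical, and the &lt;tt&gt; styling in the summary follows the newer issue mentioned above. It keeps the blank lines around the fenced output that the GitHub renderer was particular about:</p>

```python
import html
import subprocess

# Built indirectly so this example can itself sit inside a fenced block.
FENCE = "`" * 3


def details_block(command, output=""):
    """Return one collapsible <details> section for a command and its output."""
    summary = "<summary><tt>" + html.escape(command, quote=False) + "</tt></summary>"
    lines = ["<details>", summary]
    if output.strip():
        # Blank lines before and after the fenced output, as in the raw markdown above.
        lines += ["", FENCE, output.rstrip("\n"), FENCE, ""]
    lines.append("</details>")
    return "\n".join(lines) + "\n"


def run_and_wrap(command):
    """Run a shell command and wrap its combined stdout/stderr in a section."""
    result = subprocess.run(
        command,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return details_block(command, result.stdout)


if __name__ == "__main__":
    # A command with no output gets a summary-only section.
    print(details_block("cd meltano"))
    print(run_and_wrap("echo hello"))
```

<p>Printing one section per command, separated by blank lines, gives text that can be pasted straight into a GitHub comment. Passing shell=True is a convenience for this sketch only and should not be used with untrusted input.</p>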
  368. ]]></description>
  369. </item>
  370. <item rdf:about="">
  371. <title>Pytest trick: subsetting unknown suites</title>
  372. <link></link>
  373. <dc:date>2019-12-17T07:46:00-05:00</dc:date>
  374. <dc:creator>Ned Batchelder</dc:creator>
  375. <description><![CDATA[<p>While trying to reproduce an issue with <a href="//">Coverage 5.0</a>,
  376. I had a test suite that showed the problem, but it was inconvenient to run the
  377. whole suite repeatedly, because it took too long.  I wanted to find just one
  378. test (or small handful of tests) that would demonstrate the problem.</p><p>But I knew nothing about these tests. I didn&#8217;t know what subset might be
  379. useful, or even what subsets there were, so I had to try random subsets and hope
  380. for the best.</p><p>I selected random subsets with a new trick: I used
  381. the -k option (select tests by a substring of their names) using single
  382. consonants.  &#8220;pytest -k b&#8221; will run only the tests with a b in their name, for
  383. example.  Then I tried &#8220;-k c&#8221;, &#8220;-k d&#8221;, &#8220;-k f&#8221;, and so on.  Some ran the
  384. whole test suite (&#8220;-k t&#8221; is useless because t is in every test name), but some
  385. ran usefully small collections.</p><p>This is a mindless way to select tests, but I knew nothing about this test
  386. suite, so it was a quick way to run fewer than all of them.  Running &#8220;-k q&#8221; was
  387. the best (only 16 tests). Then I looked at the test names, and selected yet
  388. smaller subsets with more thought.  In the end, I could reduce it to just one
  389. test that demonstrated the problem.</p>
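<p>The effect of the single-consonant trick can be previewed without running anything. This is a rough sketch only: it approximates -k as a plain, case-insensitive substring match on the test name, whereas pytest also matches against other keywords such as parent names. The function names and the sample test names are hypothetical:</p>

```python
CONSONANTS = "bcdfghjklmnpqrstvwxyz"


def subset_sizes(test_names):
    """Map each consonant to the number of test names that contain it."""
    return {
        letter: sum(letter in name.lower() for name in test_names)
        for letter in CONSONANTS
    }


def useful_letters(test_names, max_size):
    """Consonants selecting a non-empty subset of at most max_size tests,
    smallest subsets first."""
    sizes = subset_sizes(test_names)
    picks = [(count, letter) for letter, count in sizes.items() if 0 < count <= max_size]
    return [letter for count, letter in sorted(picks)]


if __name__ == "__main__":
    # Hypothetical names, standing in for an unfamiliar suite.
    names = ["test_query_basic", "test_totals", "test_parse_items", "test_quit_early"]
    sizes = subset_sizes(names)
    for letter in useful_letters(names, max_size=2):
        print("pytest -k", letter, "would select", sizes[letter], "of", len(names), "tests")
```

<p>As in the post, a letter like t matches every name here and is useless for narrowing, while rarer letters pick out small subsets worth trying first.</p>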
  390. ]]></description>
  391. </item>
  392. <item rdf:about="">
  393. <title>Coverage 5.0, finally</title>
  394. <link></link>
  395. <dc:date>2019-12-17T07:45:00-05:00</dc:date>
  396. <dc:creator>Ned Batchelder</dc:creator>
  397. <description><![CDATA[<p>After a quiet week of <a href="//">beta 2</a>
  398. being available, and not hearing from anyone, I released
  399. <a href="" rel="external">Coverage 5.0</a> on
  400. Saturday.</p><p>I&#8217;ve been through this before, so I knew what would happen: people with
  401. unpinned requirements would invisibly upgrade their coverage version, and stuff
  402. would break. Coverage is used by many projects, so it was inevitable.</p><p>Saturday afternoon was quiet. Sunday I heard from two people.  Then Monday,
  403. people came back to work to find their continuous integration broken, and now
  404. I&#8217;m up to <a href="" rel="external">11 issues</a>
  405. to deal with.</p><p>It remains difficult to get people to provide instructions that are specific
  406. enough and detailed enough for me to see their problem.  A link to your broken
  407. CI build doesn&#8217;t tell me how to do it myself.  A link to your repo is confusing
  408. if you then add a commit that pins the old version of coverage to prevent the
  409. problem, forcing me to dig through your history to try to find the old commit
  410. that was broken.  And so on.</p><p>Of course, this is nothing new, but it drove home again how hard it is to
  411. extract good information from distracted and annoyed users. If anyone has good
  412. examples of issue templates that get people&#8217;s attention and guide them well,
  413. point me to them!</p><p>While dealing with the issues, I came up with two new techniques, interesting
  414. enough to deserve their own blog posts:</p><ul>
  415. <li><a href="//">Fancy console output in GitHub comments</a></li>
  416. <li><a href="//">Pytest trick: subsetting unknown suites</a></li>
  417. </ul><p>Needless to say, fixes are underway for a 5.0.1 to be released soon.</p>
  418. ]]></description>
  419. </item>
  420. <item rdf:about="">
  421. <title>Coverage 5.0 beta 2</title>
  422. <link></link>
  423. <dc:date>2019-12-08T15:10:00-05:00</dc:date>
  424. <dc:creator>Ned Batchelder</dc:creator>
  425. <description><![CDATA[<p>I mean it this time, Coverage 5.0 is nearly ready.  I&#8217;m putting out
  426. <a href="" rel="external">Coverage 5.0 beta 2</a>
  427. for a week before declaring it really done.  Please try it.</p><p>Everything I said in the <a href="//">beta 1 announcement</a>
  428. still holds: please try the new things!</p><p>Thanks.</p>
  429. ]]></description>
  430. </item>
  431. </rdf:RDF>
