<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Martin Kleppmann at Yes/No/Cancel &#187; Uncategorized</title>
	<atom:link href="http://www.yes-no-cancel.co.uk/category/uncategorized/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.yes-no-cancel.co.uk</link>
	<description>Entrepreneurship, web technology and the user experience</description>
	<lastBuildDate>Mon, 30 Aug 2010 23:36:49 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Good things are hard to articulate</title>
		<link>http://www.yes-no-cancel.co.uk/2010/08/31/good-things-are-hard-to-articulate/</link>
		<comments>http://www.yes-no-cancel.co.uk/2010/08/31/good-things-are-hard-to-articulate/#comments</comments>
		<pubDate>Mon, 30 Aug 2010 23:36:49 +0000</pubDate>
		<dc:creator>Martin Kleppmann</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.yes-no-cancel.co.uk/?p=394</guid>
		<description><![CDATA[If you have a stone in your shoe, it’s easy to articulate what is wrong (“something is hurting my foot!”). But if you don’t have a stone in your shoe, you don’t go around rejoicing with every step. If you have just removed a stone, you may be pleased for a few seconds to have [...]]]></description>
			<content:encoded><![CDATA[<p>If you have a stone in your shoe, it’s easy to articulate what is wrong (“something is hurting my foot!”). But if you don’t have a stone in your shoe, you don’t go around rejoicing with every step. If you have just removed a stone, you may be pleased for a few seconds to have removed the irritation, but things very quickly return to normal.</p>
<p>The bad things keep sticking around and irritating us, whereas the good things are quickly taken for granted. The bad things are often easy to describe, whereas the good things are sometimes just the absence of a bad thing — the absence of some irritation.</p>
<p>This is a fundamental asymmetry, and it has lots of implications, big and small. For example, a big implication is the development of human society and technology. Because we are more prone to noticing bad things, and bad things keep bugging us, we try to fix bad things and make them go away. So humans invented tools (to stop hurting their fingers), agriculture (to stop hunger and laborious hunting/gathering), medicine (to stop their family members from dying), writing (to stop forgetfulness) and the internet (to stop the slowness of paper communication). Over history, humans have continuously taken stuff that is bad, and tried to make it better. Now that we have all these good things (tools, agriculture, medicine, writing, internet), we mostly take them for granted and don’t think about them any more.</p>
<p>(Some things that humans invented turned out to have bad side-effects besides their intended good effect. Take weapons as an example: intended to stop the hungry neighbours from stealing our food, but the idea got taken a bit too far. But that’s a different story.)</p>
<p>There are also other, less grandiose consequences of bad things being more noticeable. For example, when I was writing music and song lyrics, I found it comparatively easy to write about themes like conflict, struggle and sadness; writing about good things, however, was much harder. It would often just sound trite, banal and uninteresting.</p>
<p>How do you articulate those things that are good? Part of the point of this very essay is to see if I can write about things being good without just being incredibly boring. And what do I do? Complain about the fact that it’s hard to express things that are good. I complain. I am irritated that bad things are easier to write about than good things. And so I write about that irritation, thereby locking myself in a self-referential loop.</p>
<p>How annoying. Let’s talk about the good things again.</p>
<p>An observation. If bad things are much more noticeable than good things, that means that it’s very easy to lose sight of all the good things. In fact, it probably means that we are surrounded by lots of wonderful things that we’ve simply forgotten about. It might even be that the vast majority of things around us are good, and we are just failing to notice them!</p>
<p>We need to consciously remind ourselves of the good things from time to time, to avoid getting too bogged down in the bad things. That cannot mean getting complacent; it just means enjoying and appreciating life.</p>
<p>For example, I am writing this essay on a mobile phone (good) connected to the internet (good) while sitting in the sunshine (good) in a park (good) in San Francisco (good). I am wearing comfortable clothes (good) and don’t have any stones in my shoes (good). I live in a peaceful age (good) in a peaceful society (good). I am educated (good), healthy (good), don’t need to worry about being able to afford the rent (good) and I have a wonderful family (good). I look forward to sending this essay to Rita (good) to see what she thinks. (Hopefully it’s not too bad.)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.yes-no-cancel.co.uk/2010/08/31/good-things-are-hard-to-articulate/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Our social future</title>
		<link>http://www.yes-no-cancel.co.uk/2010/04/09/our-social-future/</link>
		<comments>http://www.yes-no-cancel.co.uk/2010/04/09/our-social-future/#comments</comments>
		<pubDate>Fri, 09 Apr 2010 01:59:11 +0000</pubDate>
		<dc:creator>Martin Kleppmann</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[rapportive]]></category>

		<guid isPermaLink="false">http://www.yes-no-cancel.co.uk/?p=376</guid>
		<description><![CDATA[You know you&#8217;re doing something good if someone you&#8217;ve never met before spontaneously comes up to you and asks: &#8220;Are you with Rapportive? Just wanted to say that I love it.&#8221; He could recognise me because we had exchanged emails, and he had therefore seen my photo in Rapportive next to my email. This is [...]]]></description>
			<content:encoded><![CDATA[<p>You know you&#8217;re doing something good if someone you&#8217;ve never met before spontaneously comes up to you and asks: &#8220;Are you with Rapportive? Just wanted to say that I love it.&#8221;</p>
<p>He could recognise me because we had exchanged emails, and he had therefore seen my photo in Rapportive next to my email.</p>
<p>This is our social future. We&#8217;re busy creating it.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.yes-no-cancel.co.uk/2010/04/09/our-social-future/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Ending Browser Pain on the Startup Success Podcast</title>
		<link>http://www.yes-no-cancel.co.uk/2009/11/25/ending-browser-pain-on-the-startup-success-podcast/</link>
		<comments>http://www.yes-no-cancel.co.uk/2009/11/25/ending-browser-pain-on-the-startup-success-podcast/#comments</comments>
		<pubDate>Wed, 25 Nov 2009 00:17:09 +0000</pubDate>
		<dc:creator>Martin Kleppmann</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.yes-no-cancel.co.uk/?p=338</guid>
		<description><![CDATA[I was lucky to get a chance to be interviewed by the great Bob Walsh, founder of StartupToDo, and author of the Web Startup Success Guide (review by Joel Spolsky, review by Neil Davidson). The interview is for the Startup Success Podcast, a series of shows providing a wealth of useful information and inspiration for [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://startuppodcast.wordpress.com/2009/11/24/show-46-ending-browser-pain-martin-kleppmann-go-test-it/"><img class="alignleft size-full wp-image-339" title="Startup Success Podcast" src="http://www.yes-no-cancel.co.uk/wp-content/uploads/2009/11/ssplogo3.jpg" alt="Startup Success Podcast" width="261" height="147" /></a>I was lucky to get a chance to be interviewed by the great <a href="http://twitter.com/BobWalsh">Bob Walsh</a>, founder of <a href="http://startuptodo.com">StartupToDo</a>, and author of the <a href="http://www.amazon.com/Startup-Success-Guide-Books-Professionals/dp/1430219858">Web Startup Success Guide</a> (<a href="http://www.47hats.com/2009/07/joel-spolsky-on-the-web-startup-success-guide/">review by Joel Spolsky</a>, <a href="http://blog.businessofsoftware.org/2009/08/the-web-startup-success-guide---a-book-review.html">review by Neil Davidson</a>).</p>
<p>The interview is for the <a href="http://startuppodcast.wordpress.com/">Startup Success Podcast</a>, a series of shows providing a wealth of useful information and inspiration for startups. In this episode, <a href="http://blogs.msdn.com/patrick_foley/">Patrick Foley</a> talks about his visit to the <a href="http://microsoftpdc.com/">Microsoft Professional Developers Conference (PDC)</a>, and I talk about <a href="http://go-test.it/">Go Test It</a> – what it is, how it works, why we built it, where it is going in future. There’s even a special discount in there! :)</p>
<p>Head over now to the Startup Success Podcast and <a href="http://startuppodcast.wordpress.com/2009/11/24/show-46-ending-browser-pain-martin-kleppmann-go-test-it/">listen to the episode</a>! (The interview with me starts at about 15 minutes in.)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.yes-no-cancel.co.uk/2009/11/25/ending-browser-pain-on-the-startup-success-podcast/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Doing a PhD</title>
		<link>http://www.yes-no-cancel.co.uk/2009/03/31/doing-a-phd/</link>
		<comments>http://www.yes-no-cancel.co.uk/2009/03/31/doing-a-phd/#comments</comments>
		<pubDate>Tue, 31 Mar 2009 08:05:43 +0000</pubDate>
		<dc:creator>Martin Kleppmann</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[computational linguistics]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[phd]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[social web]]></category>
		<category><![CDATA[university]]></category>

		<guid isPermaLink="false">http://www.yes-no-cancel.co.uk/?p=251</guid>
		<description><![CDATA[I have decided to apply to do a PhD in Cambridge. This might come as a surprise, so please let me explain. It is something which has been tempting me for a long time. I have always loved working independently and getting deep into a project which I find cool, and a PhD (in computer [...]]]></description>
			<content:encoded><![CDATA[<p>I have decided to apply to do a PhD in Cambridge.</p>
<p>This might come as a surprise, so please let me explain. It is something which has been tempting me for a long time. I have always loved working independently and getting deep into a project which I find cool, and a PhD (in computer science at least) seemed to me the ultimate manifestation of this independence: three years in which you can learn about and figure out an interesting topic, and invent new ways, with hardly any constraints other than that you&#8217;re supposed to write up something vaguely insightful at the end. (I&#8217;m sure this is an overly idealised notion of what a PhD entails, but please bear with my dreamworld for now.)</p>
<p>On the other hand, I have started a company and I&#8217;ve had an incredibly experience-packed two years so far doing that. I would be a completely different person now if I had gone directly into a PhD after graduating. Running a start-up has made me less risk-averse, more dynamic, more outgoing and confident, more pragmatic, more focussed, and has given me a much better understanding of how the world works.</p>
<p>You might think that returning to university is a cop-out, a return from the harsh winds of a start-up into the safe haven of academia. Let me assure you that this is not the case, for two reasons.</p>
<ul>
<li>Firstly, I will keep my company going on the side. Obviously I won&#8217;t be doing it full-time any more, but I am keeping all of my active clients, and I will continue the high level of service they know from me. It won&#8217;t be a return to student lifestyle for me; if anything, my focus will get sharper.</li>
<li>Secondly, the research proposal I have written is not just any proposal. It is aimed squarely at what I (and many others) believe will be the most influential technologies of the next decade or two: technology which deals with the vast amount of data on the web, filtering, processing and mining that information such that it becomes a source of useful insight. <strong>Machine learning</strong> and <strong>computational linguistics</strong>.</li>
</ul>
<p>We are rapidly moving towards a world where everything which can be digitised and put on the web will be. Blogs, social networking sites, Twitter and many other services increasingly become expressions of a person&#8217;s identity. Already I find that if, for example, I receive an email from somebody I don&#8217;t know, often the first thing I do is to look them up on LinkedIn, find their blog or Twitter username, look up their company or affiliation and find out what they do. This allows me to quickly judge the context of their enquiry, gauge the level at which I should reply, or detect whether I need to be cautious for some reason. If it is somebody I have dealt with before, I have a private database of contact history which helps when I don&#8217;t remember details of conversations months or years back. (It&#8217;s nothing particularly secret, it&#8217;s just an extended memory.)</p>
<p>Identity on the web further manifests itself in social interactions with others. This can be a powerful source of insight: for example, if I don&#8217;t know somebody, but I see that they publicly communicate with somebody I already know and trust, I will immediately be more inclined to trust them too. This is not a rigorous decision, but a useful first guess in the absence of other information.</p>
<p>However, gathering the pieces of a person&#8217;s identity from across the web is currently a time-consuming manual process.</p>
<p>My own digital identity, for example, is spread all over the interwebs. It manifests itself in <a href="http://www.yes-no-cancel.co.uk/">my blog</a> (which you are currently reading), <a href="http://www.linkedin.com/in/martinkleppmann">my LinkedIn profile</a>, <a href="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-683.html">my undergraduate dissertation</a>, <a href="http://github.com/ept/">my open source projects</a>, <a href="http://www.eptcomputing.com/">my company</a>, <a href="http://twitter.com/martinkl">my tweets</a>, <a href="http://www.facebook.com/profile.php?id=558703060">my Facebook profile</a>, <a href="http://flickr.com/photos/martinkleppmann/">my photos</a> and <a href="http://www.last.fm/user/mk428">my taste in music</a>, not to mention the many other fragments scattered about other sites, in the form of press articles about me or my projects, my archived emails to mailing lists, my comments on other people&#8217;s blogs, etc. They are all there, and Google has indexed them all (apart from the small number of things behind logins), but at the moment they do not come together to form a coherent whole. They are scraps of data, but without further analysis they don&#8217;t mean much.</p>
<p>In a nutshell, my PhD proposal is to gather that publicly available data together and make it useful. For me, that means to map out the graph of connections between different people, and relationships between people and topics. Who is interested in what, and who discusses what topics with which people?</p>
<p>Consider an example to see why this might be useful: say you are new to a particular field of specialism (whatever it may be), and you attend a conference to find out more about it. The programme of the conference is a long list of sessions with names of speakers and titles of talks. Someone who has been in the field for a while will know many of the speakers&#8217; names and will immediately know which sessions will be worth attending and which people to talk to. But a newcomer will have no idea, and has no way to find out other than by spending years getting to know the community. Why can&#8217;t you just visualise the relationships between the various speakers and topics, so that you can immediately see who the most influential presenters are and whose interests are closest to your own? Or even discover which attendees of the event would be most worth talking to? At the moment we rely on personal referrals, serendipitous meetings and crude markers (like <a href="http://www.firsttuesday.co.uk/">First Tuesday</a>&#8217;s <a href="http://www.independent.co.uk/news/business/news/disciples-stay-faithful-to-dot-coms-713036.html">&#8220;green for start-up, red for investor, yellow for service provider&#8221;</a>); why can&#8217;t we have a more direct way of finding the people we should be talking to?</p>
<p>There are two steps to making this work: firstly identifying which two bits of information on the web belong to the same person (even if they are on different websites, using a variant spelling of the name or pseudonym-like username, and without confusing two people who happen to share the same name), and secondly mapping out the relationships between the people and the topics they talk about.</p>
<p>Google&#8217;s success rests, amongst other things, on the <a href="http://ilpubs.stanford.edu:8090/422/">PageRank algorithm</a> which calculates a &#8216;quality&#8217; rating for each page on the web. Their core innovation was to realise that links between pages, not just the pages&#8217; content, were the measure which determined how useful a search result would be, and implementing PageRank allowed them to achieve much better search results than the other search engines at the time.</p>
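<p>To make the idea of ranking by links concrete, here is a minimal sketch of link-based ranking by power iteration. It is a textbook simplification, not Google&#8217;s production algorithm; the four-page link graph, the function name <code>pagerank</code> and the damping factor of 0.85 are assumptions chosen purely for illustration.</p>

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration ranking over a dict mapping page -> list of linked pages."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on, split evenly over its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its score over all pages.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical four-page web in which most pages link to "hub".
example_links = {"a": ["hub"], "b": ["hub"], "c": ["hub", "a"], "hub": ["a"]}
ranks = pagerank(example_links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

<p>In this toy graph the page that collects the most incoming links ends up with the highest score, regardless of what the page itself says. The same calculation could in principle be run over people and their connections instead of pages and links.</p>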
<p>A lot has been said about the next big thing post-Google. I wouldn&#8217;t want to make predictions, but let&#8217;s put it this way: I would not be surprised if the next core innovation is to realise that individual people, and the connections between people, are even more powerful than pages and links between pages. The marriage of social web and semantic web, to be fully buzzword-compliant.</p>
<p>This is a difficult and multi-faceted problem, which is why I want to take it on within the framework of a PhD rather than try to develop it as a product straight away. There is a lot I need to learn, from the mathematical details of the best machine learning techniques to the linguistic techniques needed to extract structured information from natural language text and small clues on the web. </p>
<p>There is a lot of existing research on which to build. <a href="http://www.cl.cam.ac.uk/users/sc609">Stephen Clark</a>, my proposed PhD supervisor, is one of the authors of the <a href="http://svn.ask.it.usyd.edu.au/trac/candc/wiki">C&#038;C parser</a>, which is maybe the finest statistical natural language parser out there; also in the Computer Lab&#8217;s <a href="http://www.cl.cam.ac.uk/research/nl/">Natural Language and Information Processing Group</a>, <a href="http://www.cl.cam.ac.uk/~sht25/">Simone Teufel</a> and others&#8217; work on <a href="http://www.cl.cam.ac.uk/~sht25/Project_Index/Citraz_Index.html">citation analysis</a> is likely to be relevant. And I hope to collaborate with the lovely people at the Cambridge Engineering department&#8217;s <a href="http://mlg.eng.cam.ac.uk/">Machine Learning group</a>, including <a href="http://learning.eng.cam.ac.uk/zoubin/">Zoubin Ghahramani</a> who is recognised as one of the top researchers worldwide in the machine learning field. Very good reasons to be in Cambridge.</p>
<p>Please note that this is not at all certain yet &#8212; I have applied, but I may not get accepted, I may not get funding, and the Board of Graduate Studies may lose/forget my papers. But all going well, this is the general direction in which I&#8217;d like to head.</p>
<p>On a final note, it will also be interesting to explore the ethical aspects of identity on the web. I believe that both open sharing of personal information and automated mining of that information will increase massively in the coming years, and exploring the ethical and social consequences, as well as protecting the rights of the individual, should be a part of the research in this area.</p>
<p>PS. My favourite techy buzzword so far is <strong>&#8220;maximum entropy supertagger&#8221;</strong> (<a href="http://portal.acm.org/citation.cfm?id=1220396">one of the components</a> of the C&#038;C parser). Just say that out loud. Maximum Entropy Supertagger. Doesn&#8217;t it sound awesome? Before my inner eye, there is a sci-fi film of a group of heroes fighting off an alien invasion. The tentacled beasts from outer space are everywhere, but the good guys are just managing to keep them at bay. But then&#8230; ominous music in the background&#8230; a huge towering construction appears from behind a hill in the distance. Silence falls. Everybody stares at the terrifying thing brought by the aliens. The guy who later will be in charge of single-handedly saving the world turns around, and in a brief close-up shot he says to his colleagues in a perfect Hollywood manner: <em>&#8220;Oh my God. They&#8217;ve got a Maximum Entropy Supertagger.&#8221;</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.yes-no-cancel.co.uk/2009/03/31/doing-a-phd/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>CamCow (Cambridge Coworking) Update</title>
		<link>http://www.yes-no-cancel.co.uk/2009/02/05/camcow-cambridge-coworking-update/</link>
		<comments>http://www.yes-no-cancel.co.uk/2009/02/05/camcow-cambridge-coworking-update/#comments</comments>
		<pubDate>Thu, 05 Feb 2009 09:34:21 +0000</pubDate>
		<dc:creator>Martin Kleppmann</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[camcow]]></category>
		<category><![CDATA[coworking]]></category>
		<category><![CDATA[refreshcamb]]></category>

		<guid isPermaLink="false">http://www.yes-no-cancel.co.uk/?p=227</guid>
		<description><![CDATA[Yesterday evening I went to a Refresh Cambridge event &#8212; there was an interesting talk by Matt Wood from the Sanger Institute on how they use Scrum in a very agile scientific computing environment. Matt&#8217;s slides are online. Then, before we all went off to the pub, I gave a quick update on our progress [...]]]></description>
			<content:encoded><![CDATA[<p>Yesterday evening I went to a <a href="http://www.refreshcambridge.org/">Refresh Cambridge</a> event &#8212; there was an interesting talk by <a href="http://www.greenisgood.co.uk/">Matt Wood</a> from the <a href="http://www.sanger.ac.uk/Users/mw4/">Sanger Institute</a> on <a href="http://greenisgood.co.uk/pages/show/introduction_to_scrum">how they use Scrum</a> in a very agile scientific computing environment. <a href="http://www.slideshare.net/mza/introduction-to-scrum">Matt&#8217;s slides are online</a>.</p>
<p>Then, before we all went off to the pub, I gave a quick update on our progress on setting up a <a href="http://camcow.org">coworking space in Cambridge under the name of &#8220;CamCow&#8221;</a>. Here are my slides (<a href="http://www.slideshare.net/martinkleppmann/camcow-building-a-coworking-space-for-cambridge">on SlideShare</a>):</p>
<div style="width:425px;text-align:left" id="__ss_991691"><object style="margin:0px" width="425" height="355"><param name="movie" value="http://static.slideshare.net/swf/ssplayer2.swf?doc=camcowrefresh20090204-1233824337412768-2&#038;rel=0&#038;stripped_title=camcow-building-a-coworking-space-for-cambridge" /><param name="allowFullScreen" value="true"/><param name="allowScriptAccess" value="always"/><embed src="http://static.slideshare.net/swf/ssplayer2.swf?doc=camcowrefresh20090204-1233824337412768-2&#038;rel=0&#038;stripped_title=camcow-building-a-coworking-space-for-cambridge" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="355"></embed></object></div>
<p>If you&#8217;re interested in looking around the new coworking space or want to find out more, please <a href="/contact/">get in touch</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.yes-no-cancel.co.uk/2009/02/05/camcow-cambridge-coworking-update/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Bupa&#8217;s &#8216;Quitclock&#8217; Facebook app now online</title>
		<link>http://www.yes-no-cancel.co.uk/2009/01/01/bupas-quitclock-facebook-app-now-online/</link>
		<comments>http://www.yes-no-cancel.co.uk/2009/01/01/bupas-quitclock-facebook-app-now-online/#comments</comments>
		<pubDate>Thu, 01 Jan 2009 13:20:24 +0000</pubDate>
		<dc:creator>Martin Kleppmann</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.yes-no-cancel.co.uk/?p=211</guid>
		<description><![CDATA[Just a brief note to say that the Quitclock Facebook application which I developed for the health insurance company Bupa has now been officially launched (see the press release). Purpose of the application is to help people who want to give up smoking. Users can enter the date when they stopped smoking, and get updates [...]]]></description>
			<content:encoded><![CDATA[<p>Just a brief note to say that the <a href="http://www.facebook.com/apps/application.php?id=39245025835">Quitclock Facebook application</a> which I developed for the <a href="http://www.bupa.co.uk/">health insurance company Bupa</a> has now been officially launched (<a href="http://www.bupa.co.uk/health_information/html/health_news/171208_quitclock.html">see the press release</a>).</p>
<p>The purpose of the application is to help people who want to give up smoking. Users can enter the date when they stopped smoking, and get updates on how much money they have saved, display a box on their profile, get support messages etc. The application was launched now because many people try to give up smoking as their new year&#8217;s resolution.</p>
<p>The application is hosted on Google App Engine, marking my first real-life use of GAE. It has been a bit frustrating at times, and I may post some more comments about my App Engine experience at some later stage.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.yes-no-cancel.co.uk/2009/01/01/bupas-quitclock-facebook-app-now-online/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Load/performance testing a Rails application with ApacheBench</title>
		<link>http://www.yes-no-cancel.co.uk/2008/10/27/load-performance-testing-a-rails-application-with-apachebench/</link>
		<comments>http://www.yes-no-cancel.co.uk/2008/10/27/load-performance-testing-a-rails-application-with-apachebench/#comments</comments>
		<pubDate>Mon, 27 Oct 2008 22:16:24 +0000</pubDate>
		<dc:creator>Martin Kleppmann</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.yes-no-cancel.co.uk/?p=166</guid>
		<description><![CDATA[Just over 3 days until Bid for Wine goes online! It&#8217;s great to see this massive project, which I&#8217;ve blogged about before, finally complete. We will launch on Friday 31 October, the first lots for sale are already lined up (including a remarkable bottle &#8211; a unique item from a private bottling of the famous Guigal Family, [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.yes-no-cancel.co.uk/wp-content/uploads/2008/09/bidforwine.png"><img class="size-full wp-image-140 alignright" style="margin: 20px;" title="Bid for Wine logo" src="http://www.yes-no-cancel.co.uk/wp-content/uploads/2008/09/bidforwine.png" alt="Bid for Wine logo" width="375" height="95" align="right" /></a></p>
<p>Just over 3 days until <a href="http://www.bidforwine.co.uk/">Bid for Wine</a> goes online! It&#8217;s great to see this massive project, which <a href="/2008/09/22/bid-for-wine-online-wine-auctions-coming-soon/">I&#8217;ve blogged about before</a>, finally complete. We will launch on Friday 31 October: the first lots for sale are already lined up (including <a href="http://bidforwine.wordpress.com/2008/09/28/a-unique-chance/">a remarkable bottle</a> &#8211; a unique item from a private bottling of the famous <a href="http://www.guigal.com/vignoble.php?langue=en&amp;rub=1&amp;srub=1">Guigal Family</a>, and probably the only bottle of its kind on the open market!), some big wine magazines are going to be reporting, and everybody is getting very excited.</p>
<p>The application is deployed to the servers (we are running on <a href="http://engineyard.com/">Engine Yard</a>), the DNS is updated, the load balancers are configured, the holding page and contact form are already getting a fair bit of traffic, everything seems ready. We just need to flick a switch and users will start hitting the site.</p>
<p>Hold on. How do we know that the site won&#8217;t just immediately collapse under the load of (hopefully many) visitors hitting the site at the same time? The last thing we would want to do is to put them off by going down straight after launch. Some load testing is in order.</p>
<p>That said, we don&#8217;t want to spend much time and money on it either. It doesn&#8217;t need to be an enterprise-grade solution. All I want to do is check that we won&#8217;t fall over and die if we get lots of nice people coming to visit us.</p>
<p>I&#8217;m aware that with load tests, you need to be careful that you end up testing the server; it can easily happen that the bottleneck is somewhere on the client&#8217;s side. So part of this experiment was to test the load testing tool, not just the target application!</p>
<p>The tool I used here is <a href="http://httpd.apache.org/docs/2.2/programs/ab.html">ApacheBench</a>; there are others, like <a href="http://www.acme.com/software/http_load/">http_load</a> and <a href="http://www.joedog.org/JoeDog/Siege">siege</a>, which I may try another time. I booted up an <a href="http://aws.amazon.com/ec2/">EC2</a> instance specifically for running ApacheBench, to provide a clean &#8216;laboratory&#8217; environment without processes contending for CPU and I/O. I ran a variety of tests using different URLs, sending cookies (to simulate a logged-in user), with and without keep-alive etc. These parameters all changed the results a bit, but the general shape was the same, so here&#8217;s some data for one particular page (the auction listing view).</p>
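<p>For reference, here is a rough sketch of how such ApacheBench runs could be parameterised. It only builds the argument list and does not execute anything. The flags <code>-n</code>, <code>-c</code>, <code>-k</code>, <code>-C</code> and <code>-e</code> are standard <code>ab</code> options; the helper name <code>ab_command</code>, the URL, the cookie value and the default numbers are made up for illustration and are not the exact commands I ran.</p>

```python
def ab_command(url, requests=10000, concurrency=4, keep_alive=False,
               cookie=None, csv_path=None):
    """Build an argument list for ApacheBench (ab); nothing is executed here."""
    args = ["ab", "-n", str(requests), "-c", str(concurrency)]
    if keep_alive:
        args.append("-k")          # reuse connections (HTTP keep-alive)
    if cookie:
        args += ["-C", cookie]     # cookie given as "name=value", e.g. a session cookie
    if csv_path:
        args += ["-e", csv_path]   # CSV of time needed to serve each percentage of requests
    args.append(url)               # ab expects a full URL including a path
    return args


# Hypothetical target URL and session cookie, simulating a logged-in user.
command = ab_command(
    "http://staging.example.com/auctions/1",
    keep_alive=True,
    cookie="_session_id=abc123",
)
print(" ".join(command))
```

<p>The resulting list can be handed to <code>subprocess.run(command, check=True)</code> on a machine that has <code>ab</code> installed. Keeping the options in one function makes it easy to repeat the same test with and without cookies or keep-alive.</p>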
<p>ApacheBench lets you set the level of concurrency, i.e. the number of connections it tries to make to the server at the same time. We have two EngineYard production slices (virtual machines), each running three <a href="http://mongrel.rubyforge.org/">Mongrels</a> (the single-threaded server daemons which run the application), so I expected that we should be able to handle six concurrent connections without any queueing of requests.</p>
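<p>The arithmetic behind that expectation is simple enough to write down. This is a deliberately naive sketch: it assumes the load balancer spreads requests perfectly evenly and that each Mongrel handles exactly one request at a time. The function names are my own invention.</p>

```python
def capacity(slices=2, mongrels_per_slice=3):
    """Requests that can be processed at the same time without queueing."""
    return slices * mongrels_per_slice


def queued_requests(concurrency, workers):
    """Requests left waiting when every worker is already busy."""
    return max(0, concurrency - workers)


workers = capacity()
for concurrency in (4, 6, 16):
    waiting = queued_requests(concurrency, workers)
    print(concurrency, "concurrent connections:", waiting, "waiting")
```

<p>Anything up to six concurrent connections leaves nothing waiting under these assumptions; beyond that, every extra connection has to sit in a queue until a Mongrel becomes free.</p>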
<p>The first graph is for a test with four concurrent connections, i.e. the server should be pretty relaxed. It shows which proportion of pages were served in less than a particular time (i.e. the percentiles). The graph is generated from timings of 10,000 page views.</p>
<p style="text-align: center; "><a href="http://www.yes-no-cancel.co.uk/wp-content/uploads/2008/10/test4.png"><img class="size-full wp-image-169 aligncenter" src="http://www.yes-no-cancel.co.uk/wp-content/uploads/2008/10/test4.png" alt="Maximum response time for a given proportion of requests (server not saturated)" width="500" height="358" /></a></p>
<p>How to read this graph: e.g. see that the red line crosses 0.16 seconds at about 80%; that means that the client (ApacheBench) reported that 80% of requests were served in 0.16 seconds <strong>or less</strong>. The highest point at 100% is the longest time which it ever took in the test.</p>
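<p>To make the percentile reading concrete, here is a minimal sketch of how such a figure can be computed from raw timings with standard tools. The ten sample values are made up for illustration, and the nearest-rank rule is just one common way to define a percentile, not necessarily what ApacheBench does internally.</p>

```shell
#!/bin/sh
# Nearest-rank percentile: the smallest sample such that at least P% of all
# samples are at or below it. Reads one number per line on stdin; P is an integer.
percentile() {
    LC_ALL=C sort -n | awk -v p="$1" '
        { v[NR] = $1 }
        END {
            rank = int((p * NR + 99) / 100)   # ceil(p * NR / 100) for integer p
            if (rank == 0) rank = 1
            print v[rank]
        }'
}

# Hypothetical response times in seconds (not data from the test).
samples="0.12 0.09 0.16 0.14 0.31 0.11 0.15 0.13 0.22 0.10"

p80=$(printf '%s\n' $samples | percentile 80)
p100=$(printf '%s\n' $samples | percentile 100)
echo "80% of requests took $p80 s or less; the slowest took $p100 s"
```

<p>With these sample values the 80% mark is 0.16 seconds and the 100% mark is simply the slowest request, 0.31 seconds.</p>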
<p>I measured four quantities: the time per page view reported by ApacheBench (the client), the total time per page view reported by the server, and the values for rendering time and database time reported by the server. (The blue line is the sum of the green and the orange lines.) I would call this pretty well-behaved: 84% of page requests are received by the client within 200 milliseconds, and 95% within 300ms; rendering time is quite variable while database time hardly ever exceeds 100ms. And the client times are just a constant amount above the server times, to account for the fact that the request and response have to go across a network, through a proxy/load balancer etc.</p>
<p>Each of those page views involves about 11 SQL queries and 10 partials being rendered; we can&#8217;t get much below that, since the page content is fairly complex. My guess is that you can&#8217;t do much better than these timings with Ruby on Rails out of the box; at the end of the day, it is a pretty slow platform. (I&#8217;m not saying that other languages/frameworks are any better &#8212; most likely, they are not.) When we find that this site needs to scale further, we will simply have to add more mongrels, database read slaves, and plenty of <a href="http://www.danga.com/memcached/">memcached</a> to the system.</p>
<p>By the way, in this test the 1-minute load average reported by the Linux kernel reached a maximum of 1.6 &#8212; not very much at all.</p>
<p>Now, what happens when we hit the site harder? In the next test I increased the concurrency parameter to 16, more than twice the number of mongrels. Now I would expect all mongrels to be busy all the time (saturated with requests), and response times to go up. And this is the graph we get:</p>
<p style="text-align: center; "><a href="http://www.yes-no-cancel.co.uk/wp-content/uploads/2008/10/test5.png"><img class="size-full wp-image-170 aligncenter" src="http://www.yes-no-cancel.co.uk/wp-content/uploads/2008/10/test5.png" alt="Maximum response time for a given proportion of requests (server saturated)" width="500" height="351" /></a></p>
<p>Ignore the red one for now. The shape of the three server-side curves is still pretty much the same as before, and overall they are taking 50% to 100% longer to respond compared to the test above. That is simply because the CPU is now fully occupied with all three mongrels, whereas previously there might only be one or two mongrels wanting CPU at the same time. When I ran further tests, increasing concurrency of the client to 64 (more than ten times the number of mongrels), these server curves stayed exactly the same, and server-side times didn&#8217;t increase any further. This is good news: although the server is overloaded, this doesn&#8217;t cause any loss of throughput (which might be the case if there were inefficiencies).</p>
<p>For a saturated server, the kernel load average was between about 3.0 and 4.0, no matter how many concurrent clients there were. This makes sense, since the load is determined by the length of the scheduler&#8217;s run queue, and if you only have 3 server processes (mongrels) plus a few background housekeeping processes with small requirements, there&#8217;s no reason why it should go any higher.</p>
<p>The combined throughput of the mongrels in this test, and in those with higher concurrency, was pretty constant at 33 requests per second. Since that is across 2 CPUs, we must be taking 60ms of CPU time per request, or an average of 180ms wall-clock time (1 CPU is shared between 3 mongrels). The average request time (DB + Render) reported by the server is 164ms, so we have only 10% overhead somewhere which is not being accounted for. Nice to see that the numbers add up quite well. :-)</p>
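<p>The arithmetic behind those figures can be checked in a few lines. This is only a sketch using the numbers quoted above (33 requests per second, 2 CPUs, 3 mongrels per CPU, 164ms reported by the server); the variable names are mine.</p>

```shell
#!/bin/sh
# Figures quoted in the text; variable names are illustrative.
result=$(awk 'BEGIN {
    throughput = 33        # requests per second, both slices combined
    cpus = 2               # one CPU per slice
    mongrels_per_cpu = 3
    server_ms = 164        # average DB + render time reported by the server

    cpu_ms = 1000 * cpus / throughput        # CPU time per request
    wall_ms = cpu_ms * mongrels_per_cpu      # wall-clock time when 3 mongrels share a CPU
    overhead = (wall_ms - server_ms) / wall_ms * 100

    printf "%.0f %.0f %.0f", cpu_ms, wall_ms, overhead
}')
echo "cpu_ms wall_ms overhead_percent: $result"
```

<p>This prints 61, 182 and 10, in line with the rounded 60ms, 180ms and 10% quoted above.</p>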
<p>Now turn your attention to the red line, which is now completely different from before. There are still some page views which happen very quickly, but some are taking up to a whole second. What is happening here is that requests are getting queued up &#8212; if they are lucky, the queue is empty and they get served right away, but if it&#8217;s their bad day, they might have to wait in line after several other requests until they finally get to talk to a mongrel. The mongrels work at the same pace, no matter how long the queue is (think post office workers), so obviously waiting times will increase the more people/requests try to get in the door at the same time.</p>
<p>With 16 concurrent clients, the median response time was 0.45 seconds, and 99% of requests were served within 1.3 seconds. However, with 64 concurrent clients, the median was 1.8 seconds and the 99th percentile was a whopping 6.2 seconds. (This is the point where users get rather impatient.) I&#8217;m not sure exactly what the relationship between concurrency and waiting time is; maybe with some more experiments and some theory I can work it out. (I took a <a href="http://www.cl.cam.ac.uk/teaching/2005/CompSysMod/">course on queueing theory</a> at university, which covers exactly this kind of system, but I can&#8217;t remember much of it. If I have time I&#8217;ll dig it out again and see how I might be able to model traffic to a web application with lots of nice maths&#8230;)</p>
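<p>One piece of queueing theory that can be applied without much effort is Little&#8217;s law: in a closed loop where each of N clients sends its next request as soon as the previous one returns, the mean response time is N divided by the throughput. The sketch below assumes that ApacheBench keeps exactly N requests in flight and uses the roughly 33 requests per second measured above. A mean is not the same thing as a median, so treat this as a sanity check rather than a model.</p>

```shell
#!/bin/sh
# Little's law for a closed loop with no think time: mean_response = N / X.
# X is the ~33 requests/second throughput quoted in the text (an input, not computed here).
predict_mean() {
    awk -v n="$1" -v x=33 'BEGIN { printf "%.2f", n / x }'
}

for n in 16 64; do
    echo "concurrency $n: predicted mean response $(predict_mean "$n") s"
done
```

<p>That gives about 0.48 seconds for 16 clients and 1.94 seconds for 64, against observed medians of 0.45 and 1.8 seconds.</p>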
<p>The thing which struck me in this graph is quite how straight that red line is; I had expected the response times to be less widely spread. This prompted me to have a look at the actual distribution/histogram of response times as reported by the client.</p>
<p>Another thing I wanted to check was how good ApacheBench&#8217;s concurrency setting actually was &#8212; how could I trust that it really keeps the server as busy as it claims to? And maybe it makes requests in certain regular patterns which might skew the results. So I ran the following 3 tests in a side-by-side comparison:</p>
<ol>
<li>One ApacheBench process set to make 64 concurrent requests, running on an EC2 instance, no keep-alive, making 10,000 requests.</li>
<li>Four ApacheBench processes, each set to make 16 concurrent requests, running on the same EC2 instance, no keep-alive, each making 2,500 requests.</li>
<li>Four ApacheBench processes, each set to make 16 concurrent requests, each running on its own EC2 instance, no keep-alive, each making 2,500 requests.</li>
</ol>
<p>The server-side statistics for the three tests, including the total throughput, were identical. However, interestingly, there was a noticeable difference between the distribution of response times reported by ApacheBench in the three cases. I have plotted them below:</p>
<p>(This diagram is what you get if you flip the one above by a diagonal axis and then differentiate the function by response time. Note also that the colours now have a different meaning.)</p>
<p style="text-align: center; "><a href="http://www.yes-no-cancel.co.uk/wp-content/uploads/2008/10/distribution.png"><img class="size-full wp-image-168 aligncenter" src="http://www.yes-no-cancel.co.uk/wp-content/uploads/2008/10/distribution.png" alt="Distribution of client-side response times" width="500" height="364" /></a></p>
<p>I don&#8217;t quite know yet what to make of this. Ok, the behaviour doesn&#8217;t differ too drastically, so if you just want a rough idea of the performance of your application, it seems like ApacheBench&#8217;s concurrency option is perfectly fine. I am just a bit intrigued by the differences.</p>
<p>The green line is most like what I expected, because it most resembles an exponential distribution, which is what queueing theory predicts in many cases. Interestingly, this is the setup in which the four ApacheBench processes are most likely to get in each other&#8217;s way, contending for I/O on the EC2 instance &#8212; maybe this causes some jitter, blurring an otherwise regular pattern. In the two cases with one ApacheBench process per machine (blue and yellow lines) the distribution is flatter between about 1 and 3 seconds response time; in particular, the blue and yellow setups are noticeably more likely to see 2-3 second response times than the green setup.</p>
<p>If anybody has ideas on how to interpret this, please let me know. I should probably also repeat those experiments with a larger sample size and work out if the differences are actually statistically significant, but I don&#8217;t have time for that at the moment.</p>
<p><strong>SCRIPTS</strong></p>
<p>In order to produce these statistics, I wrote a few simple shell scripts to gather and process the data. I&#8217;ll put them here in case somebody finds them useful (and so that I can find them again when I need them next time!).</p>
<p>First I have two scripts which run on the servers with the mongrels. In our setup, each virtual machine has its own logfile, so the scripts need to be run on each virtual machine. They select the portion of the logfile which was written during the test, and also log load averages from the kernel. Set an environment variable like <strong>export LOGFILE=/path/to/my/production.log</strong> before running these scripts. The first is called <strong>before-test.sh</strong> and should be run before the test starts:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="bash bash" style="font-family:monospace;"><span style="color: #666666; font-style: italic;">#!/bin/sh</span>
<span style="color: #000000; font-weight: bold;">if</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span> <span style="color: #000000; font-weight: bold;">!</span> <span style="color: #660033;">-f</span> <span style="color: #ff0000;">&quot;$LOGFILE&quot;</span> <span style="color: #7a0874; font-weight: bold;">&#93;</span>; <span style="color: #000000; font-weight: bold;">then</span> <span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #ff0000;">&quot;LOGFILE not found&quot;</span>; <span style="color: #7a0874; font-weight: bold;">exit</span> <span style="color: #000000;">1</span>; <span style="color: #000000; font-weight: bold;">fi</span>
<span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #ff0000;">&quot;time,1min,5min,15min,running,procs,lastproc&quot;</span> <span style="color: #000000; font-weight: bold;">&gt;</span> <span style="color: #000000; font-weight: bold;">/</span>tmp<span style="color: #000000; font-weight: bold;">/</span>loadavg.csv
<span style="color: #7a0874; font-weight: bold;">&#40;</span> <span style="color: #000000; font-weight: bold;">while</span> <span style="color: #c20cb9; font-weight: bold;">true</span>; <span style="color: #000000; font-weight: bold;">do</span>
    <span style="color: #007800;">ts</span>=<span style="color: #000000; font-weight: bold;">`</span><span style="color: #c20cb9; font-weight: bold;">date</span> <span style="color: #ff0000;">'+%Y%m%d%H%M%S'</span><span style="color: #000000; font-weight: bold;">`</span>
    <span style="color: #007800;">load</span>=<span style="color: #ff0000;">&quot;`cat /proc/loadavg | tr ' /' ','`&quot;</span>
    <span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #ff0000;">&quot;$ts,$load&quot;</span> <span style="color: #000000; font-weight: bold;">&gt;&gt;</span> <span style="color: #000000; font-weight: bold;">/</span>tmp<span style="color: #000000; font-weight: bold;">/</span>loadavg.csv
    <span style="color: #c20cb9; font-weight: bold;">sleep</span> <span style="color: #000000;">1</span>
<span style="color: #000000; font-weight: bold;">done</span> <span style="color: #7a0874; font-weight: bold;">&#41;</span> <span style="color: #000000; font-weight: bold;">&amp;</span>
<span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #007800;">$!</span> <span style="color: #000000; font-weight: bold;">&gt;</span> <span style="color: #000000; font-weight: bold;">/</span>tmp<span style="color: #000000; font-weight: bold;">/</span>loadavg.pid
<span style="color: #c20cb9; font-weight: bold;">wc</span> <span style="color: #660033;">-l</span> <span style="color: #ff0000;">&quot;$LOGFILE&quot;</span> | <span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{print $1}'</span> <span style="color: #000000; font-weight: bold;">&gt;</span> <span style="color: #000000; font-weight: bold;">/</span>tmp<span style="color: #000000; font-weight: bold;">/</span>skip_log_lines</pre></td></tr></table></div>

<p>And <strong>after-test.sh</strong> should be run when the test has ended:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="bash bash" style="font-family:monospace;"><span style="color: #666666; font-style: italic;">#!/bin/sh</span>
<span style="color: #c20cb9; font-weight: bold;">kill</span> <span style="color: #000000; font-weight: bold;">`</span><span style="color: #c20cb9; font-weight: bold;">cat</span> <span style="color: #000000; font-weight: bold;">/</span>tmp<span style="color: #000000; font-weight: bold;">/</span>loadavg.pid<span style="color: #000000; font-weight: bold;">`</span>
<span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #ff0000;">&quot;total,render,db,url&quot;</span> <span style="color: #000000; font-weight: bold;">&gt;</span> <span style="color: #000000; font-weight: bold;">/</span>tmp<span style="color: #000000; font-weight: bold;">/</span>requests.csv
<span style="color: #007800;">skip</span>=<span style="color: #000000; font-weight: bold;">`</span><span style="color: #c20cb9; font-weight: bold;">cat</span> <span style="color: #000000; font-weight: bold;">/</span>tmp<span style="color: #000000; font-weight: bold;">/</span>skip_log_lines<span style="color: #000000; font-weight: bold;">`</span>
<span style="color: #c20cb9; font-weight: bold;">tail</span> <span style="color: #660033;">-n</span> +<span style="color: #000000; font-weight: bold;">`</span><span style="color: #c20cb9; font-weight: bold;">expr</span> <span style="color: #007800;">$skip</span> + <span style="color: #000000;">1</span><span style="color: #000000; font-weight: bold;">`</span> <span style="color: #ff0000;">&quot;$LOGFILE&quot;</span> | <span style="color: #c20cb9; font-weight: bold;">grep</span> <span style="color: #ff0000;">'^Completed in'</span> | \
    <span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{print $3 &quot;,&quot; $8 &quot;,&quot; $12 &quot;,&quot; $17}'</span> <span style="color: #000000; font-weight: bold;">&gt;&gt;</span> <span style="color: #000000; font-weight: bold;">/</span>tmp<span style="color: #000000; font-weight: bold;">/</span>requests.csv
<span style="color: #c20cb9; font-weight: bold;">rm</span> <span style="color: #660033;">-f</span> <span style="color: #000000; font-weight: bold;">/</span>tmp<span style="color: #000000; font-weight: bold;">/</span>loadavg.pid <span style="color: #000000; font-weight: bold;">/</span>tmp<span style="color: #000000; font-weight: bold;">/</span>skip_log_lines</pre></td></tr></table></div>

<p>As you can see, it filters the processing times reported by the server out of the logfile and formats them as CSV for post-processing in your favourite spreadsheet application.</p>
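<p>The field numbers in that awk command depend on the exact shape of the &#8216;Completed in&#8217; line. As an illustration, assuming a log line like the made-up one below (its format is inferred from the field numbers, so check it against your own logfile), fields 3, 8, 12 and 17 are the total time, rendering time, database time and URL:</p>

```shell
#!/bin/sh
# A made-up log line in the format that the awk field numbers imply (an assumption).
line='Completed in 0.16400 (6 reqs/sec) | Rendering: 0.10200 (62%) | DB: 0.05100 (31%) | 200 OK [http://example.com/auctions/1]'

# Same extraction as in after-test.sh: total, render, db, url.
csv=$(printf '%s\n' "$line" | grep '^Completed in' | awk '{print $3 "," $8 "," $12 "," $17}')
echo "$csv"
```

<p>If your log format differs (for example, times reported in milliseconds or extra fields before the URL), the field numbers need adjusting accordingly.</p>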
<p>To execute the test (potentially with several processes at the same time), I ran something like the following on an EC2 instance:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="bash bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">apt-get</span> update
<span style="color: #c20cb9; font-weight: bold;">apt-get</span> <span style="color: #660033;">-y</span> dist-upgrade
<span style="color: #c20cb9; font-weight: bold;">apt-get</span> <span style="color: #660033;">-y</span> <span style="color: #c20cb9; font-weight: bold;">install</span> apache2-utils
<span style="color: #c20cb9; font-weight: bold;">mkdir</span> loadtest
<span style="color: #7a0874; font-weight: bold;">cd</span> loadtest
<span style="color: #007800;">TESTS</span>=<span style="color: #ff0000;">&quot;list1 list2 list3 list4&quot;</span>
<span style="color: #007800;">COOKIE</span>=<span style="color: #ff0000;">&quot;-C _session_id=12345678901234567890&quot;</span>
<span style="color: #007800;">HOST</span>=<span style="color: #ff0000;">&quot;http://staging.example.com&quot;</span>
<span style="color: #007800;">AB</span>=<span style="color: #ff0000;">&quot;ab -n 10000 -c 16&quot;</span> <span style="color: #666666; font-style: italic;"># 10,000 requests, concurrency 16 per process</span>
<span style="color: #007800;">$AB</span> <span style="color: #660033;">-g</span> list1.log <span style="color: #007800;">$COOKIE</span> <span style="color: #ff0000;">&quot;${HOST}/path/to/test&quot;</span> <span style="color: #000000; font-weight: bold;">&amp;</span>
<span style="color: #007800;">$AB</span> <span style="color: #660033;">-g</span> list2.log <span style="color: #007800;">$COOKIE</span> <span style="color: #ff0000;">&quot;${HOST}/path/to/test&quot;</span> <span style="color: #000000; font-weight: bold;">&amp;</span>
<span style="color: #007800;">$AB</span> <span style="color: #660033;">-g</span> list3.log <span style="color: #007800;">$COOKIE</span> <span style="color: #ff0000;">&quot;${HOST}/path/to/test&quot;</span> <span style="color: #000000; font-weight: bold;">&amp;</span>
<span style="color: #007800;">$AB</span> <span style="color: #660033;">-g</span> list4.log <span style="color: #007800;">$COOKIE</span> <span style="color: #ff0000;">&quot;${HOST}/path/to/test&quot;</span> <span style="color: #000000; font-weight: bold;">&amp;</span>
<span style="color: #7a0874; font-weight: bold;">wait</span>
<span style="color: #000000; font-weight: bold;">for</span> <span style="color: #7a0874; font-weight: bold;">test</span> <span style="color: #000000; font-weight: bold;">in</span> <span style="color: #007800;">$TESTS</span>; <span style="color: #000000; font-weight: bold;">do</span>
    <span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #660033;">-F</span> <span style="color: #ff0000;">'<span style="color: #000099; font-weight: bold;">\t</span>'</span> <span style="color: #ff0000;">&quot;{print <span style="color: #000099; font-weight: bold;">\$</span>4 <span style="color: #000099; font-weight: bold;">\&quot;</span>,<span style="color: #000099; font-weight: bold;">\&quot;</span> <span style="color: #000099; font-weight: bold;">\$</span>5 <span style="color: #000099; font-weight: bold;">\&quot;</span>,<span style="color: #000099; font-weight: bold;">\&quot;</span> <span style="color: #000099; font-weight: bold;">\$</span>6 <span style="color: #000099; font-weight: bold;">\&quot;</span>,$test<span style="color: #000099; font-weight: bold;">\&quot;</span>}&quot;</span> <span style="color: #000000; font-weight: bold;">&lt;</span> <span style="color: #007800;">$test</span>.log | <span style="color: #c20cb9; font-weight: bold;">tail</span> <span style="color: #660033;">-n</span> +<span style="color: #000000;">2</span> <span style="color: #000000; font-weight: bold;">&gt;</span> <span style="color: #007800;">$test</span>
<span style="color: #000000; font-weight: bold;">done</span>
<span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #ff0000;">&quot;dtime,ttime,wait,test&quot;</span> <span style="color: #000000; font-weight: bold;">&gt;</span> bench.csv
<span style="color: #c20cb9; font-weight: bold;">cat</span> <span style="color: #007800;">$TESTS</span> <span style="color: #000000; font-weight: bold;">&gt;&gt;</span> bench.csv</pre></td></tr></table></div>

<p><span>I then copied all the logs onto my own machine and aggregated them for making pretty graphs:</span></p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="bash bash" style="font-family:monospace;"><span style="color: #007800;">TEST</span>=test7
<span style="color: #007800;">CLIENTS</span>=<span style="color: #ff0000;">&quot;ec2-75-101-204-213&quot;</span>
<span style="color: #007800;">KEYFILE</span>=<span style="color: #ff0000;">&quot;path/to/private/key/file/for/ec2/instance&quot;</span>
<span style="color: #000000; font-weight: bold;">for</span> host <span style="color: #000000; font-weight: bold;">in</span> prod1 prod2; <span style="color: #000000; font-weight: bold;">do</span>
    <span style="color: #000000; font-weight: bold;">for</span> <span style="color: #c20cb9; font-weight: bold;">file</span> <span style="color: #000000; font-weight: bold;">in</span> loadavg requests; <span style="color: #000000; font-weight: bold;">do</span>
        <span style="color: #c20cb9; font-weight: bold;">scp</span> <span style="color: #ff0000;">&quot;ey-$host:/tmp/$file.csv&quot;</span> <span style="color: #ff0000;">&quot;$file-$host.csv&quot;</span>
    <span style="color: #000000; font-weight: bold;">done</span>
<span style="color: #000000; font-weight: bold;">done</span>
<span style="color: #000000; font-weight: bold;">for</span> client <span style="color: #000000; font-weight: bold;">in</span> <span style="color: #007800;">$CLIENTS</span>; <span style="color: #000000; font-weight: bold;">do</span>
    <span style="color: #c20cb9; font-weight: bold;">scp</span> <span style="color: #660033;">-i</span> <span style="color: #007800;">$KEYFILE</span> root<span style="color: #000000; font-weight: bold;">@</span><span style="color: #007800;">$client</span>.compute-1.amazonaws.com:loadtest<span style="color: #000000; font-weight: bold;">/</span>bench.csv client-<span style="color: #007800;">$client</span>.csv
<span style="color: #000000; font-weight: bold;">done</span>
<span style="color: #c20cb9; font-weight: bold;">mkdir</span> <span style="color: #007800;">$TEST</span>
paste <span style="color: #660033;">-d</span> <span style="color: #ff0000;">','</span> loadavg-prod1.csv loadavg-prod2.csv <span style="color: #000000; font-weight: bold;">&gt;</span> <span style="color: #007800;">$TEST</span><span style="color: #000000; font-weight: bold;">/</span>loadavg.csv
<span style="color: #c20cb9; font-weight: bold;">cat</span> requests-prod<span style="color: #7a0874; font-weight: bold;">&#91;</span><span style="color: #000000;">12</span><span style="color: #7a0874; font-weight: bold;">&#93;</span>.csv <span style="color: #000000; font-weight: bold;">&gt;</span> <span style="color: #007800;">$TEST</span><span style="color: #000000; font-weight: bold;">/</span>requests.csv
<span style="color: #c20cb9; font-weight: bold;">cat</span> client<span style="color: #000000; font-weight: bold;">*</span>.csv <span style="color: #000000; font-weight: bold;">&gt;</span> <span style="color: #007800;">$TEST</span><span style="color: #000000; font-weight: bold;">/</span>clients.csv
<span style="color: #c20cb9; font-weight: bold;">mv</span> loadavg-prod<span style="color: #7a0874; font-weight: bold;">&#91;</span><span style="color: #000000;">12</span><span style="color: #7a0874; font-weight: bold;">&#93;</span>.csv requests-prod<span style="color: #7a0874; font-weight: bold;">&#91;</span><span style="color: #000000;">12</span><span style="color: #7a0874; font-weight: bold;">&#93;</span>.csv client<span style="color: #000000; font-weight: bold;">*</span>.csv <span style="color: #007800;">$TEST</span></pre></td></tr></table></div>

<p><span>Obviously these scripts are still pretty rough around the edges, but they did the job of being simple and telling me what I wanted to know.</span></p>
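<p>For completeness, here is a rough sketch of how the client-side distribution can be bucketed straight from the aggregated CSV, without a spreadsheet. It assumes the &#8216;dtime,ttime,wait,test&#8217; layout written by the test script above, with ttime in milliseconds (my understanding of ApacheBench&#8217;s -g output); the bucket width of 500ms and the sample rows are arbitrary choices for illustration.</p>

```shell
#!/bin/sh
# Count requests per bucket of total client-side time (column 2, ttime).
# Reads CSV on stdin and prints "bucket_start_ms,count" sorted by bucket.
histogram() {
    awk -F ',' -v width=500 '
        $2 ~ /^[0-9]+$/ {                  # skips header lines, which repeat after cat
            bucket = int($2 / width) * width
            count[bucket]++
        }
        END { for (b in count) print b "," count[b] }
    ' | LC_ALL=C sort -t ',' -k 1,1n
}

# Hypothetical rows, only to show the output shape.
printf '%s\n' 'dtime,ttime,wait,test' '120,130,110,list1' '480,495,470,list1' \
    '900,910,880,list2' 'dtime,ttime,wait,test' '1400,1450,1390,list3' | histogram
```

<p>For the sample rows this prints three buckets: two requests under 500ms, one between 500ms and 1 second, and one between 1 and 1.5 seconds.</p>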
]]></content:encoded>
			<wfw:commentRss>http://www.yes-no-cancel.co.uk/2008/10/27/load-performance-testing-a-rails-application-with-apachebench/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Decision making for experts</title>
		<link>http://www.yes-no-cancel.co.uk/2008/08/02/decision-making-for-experts/</link>
		<comments>http://www.yes-no-cancel.co.uk/2008/08/02/decision-making-for-experts/#comments</comments>
		<pubDate>Sat, 02 Aug 2008 21:30:38 +0000</pubDate>
		<dc:creator>Johannes Hauser</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.yes-no-cancel.co.uk/?p=98</guid>
		<description><![CDATA[Did you know that there is such a thing as a World Rock Paper Scissors Society, which holds regular tournaments, country championships and even a world championship (where the world champion will be awarded $10,000)? Maybe this way of decision making is strongly underrated.]]></description>
			<content:encoded><![CDATA[<p>Did you know that there is such a thing as a <a href="http://www.worldrps.com">World <em>Rock Paper Scissors</em> Society</a>, which holds regular tournaments, country championships and even a world championship (where the world champion will be awarded $10,000)?</p>
<p>Maybe this way of decision making is strongly underrated.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.yes-no-cancel.co.uk/2008/08/02/decision-making-for-experts/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Hermann Bondi: Arrogance of certainty</title>
		<link>http://www.yes-no-cancel.co.uk/2008/06/10/hermann-bondi-arrogance-of-certainty/</link>
		<comments>http://www.yes-no-cancel.co.uk/2008/06/10/hermann-bondi-arrogance-of-certainty/#comments</comments>
		<pubDate>Tue, 10 Jun 2008 19:45:10 +0000</pubDate>
		<dc:creator>Martin Kleppmann</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.yes-no-cancel.co.uk/?p=91</guid>
		<description><![CDATA[A few years ago, I was discussing the tensions between relativism and religion with a friend. In a vastly simplified nutshell, relativism is an understanding of the world which is founded on the principle that everything we know and perceive is relative to our own person (e.g. we perceive the whole world through our senses, [...]]]></description>
			<content:encoded><![CDATA[<p>A few years ago, I was discussing the tensions between <a href="http://en.wikipedia.org/wiki/Relativism">relativism</a> and religion with a friend. In a vastly simplified nutshell, relativism is an understanding of the world which is founded on the principle that everything we know and perceive is relative to our own person (e.g. we perceive the whole world through our senses, which are known to be fallible) and that there can therefore be no such thing as absolute truth. Think of the artificial reality of <a href="http://www.imdb.com/title/tt0133093/">The Matrix</a> as an extreme example. (The tension between relativism and religion is a topic I continue to be interested in, but it&#8217;s so complex that I&#8217;ve come to think that understanding it is more or less a lifetime project. I&#8217;ve certainly not even scratched the surface, let alone made up my mind.)</p>
<p>In that context I was told that I had to read a certain article by <a href="http://en.wikipedia.org/wiki/Herman_Bondi">Sir Hermann Bondi</a> (Physicist, 1919&#8211;2005) entitled <em>&#8220;Arrogance of certainty&#8221;</em>, because it apparently forms the basis of every discussion of the topic. I was given a copy of the article on paper, and I liked it because of its lucid and clear writing style. I kept it, even though the paper got quite crumpled in my bag at some point. </p>
<p>Recently I was looking for the article again and searched for it on Google. To my great surprise, I couldn&#8217;t find any trace of it, let alone the whole text. Therefore I want to re-publish it here so that others may find it online in future. Unfortunately I have no idea where and when it was originally published &#8212; all I have is a crumpled photocopy of a photocopy of a newspaper cutting. If you have any further information, please let me know.</p>
<p>I neither fully agree nor disagree with this article, but I think it is well worth reading. Without any further comment, here it is.</p>
<blockquote><p>Hermann Bondi</p>
<p><strong>Arrogance of certainty</strong></p>
<p>I am a non-believer in any revealed religion and a scientist. In my acquaintance with scientists I find both belief and non-belief. I know sufficient numbers of scientists of each persuasion to be willing to classify two statements as both being stupid and palpably untrue prejudices, viz, that a person, being a scientist must accordingly be a believer in a revealed religion, or the opposite statement.</p>
<p>Any thinking person must be struck with awe and wonder on contemplating the mystery and complexity of our universe. We scientists have somewhat enlarged our modest island of understanding that is surrounded by a huge ocean of ignorance. Some feel that there must be an intelligence, an architect of all this grandeur, an architect that may be called God, but without ascribing to this unknown entity any interest in our human affairs or in our prayers. (If I rightly understand, this was Einstein&#8217;s view.)</p>
<p>There are also people who believe, as a generalized feeling, that this entity, this God, in some undefined way responds to their trouble and their prayers without claiming that they have any describable or communicable knowledge of this their God. Again, there is revealed religion, the belief that God in some way, different for different religions, revealed himself in some precise communicable manner conveying some absolute truth.</p>
<p>I have no quarrel with the first three stages, but I regard the widespread human tendency to have firm faith in a revealed religion as one of our most negative traits. Indeed, I do not call myself an atheist, but an anti-revelationist. To call myself an atheist would mean denying an entity so differently defined by different people that the denial is meaningless. Some say God is love. I would certainly not wish to deny love. Some say God is nature. I am not so absurd as to deny nature. But the certainty involved in revelation horrifies me, and the historical record of the deeds done in the name of such revelations bears me out.</p>
<p>If one looks at religiosity, the immediate staggering fact is that different people believe firmly and fervently in different and, in many respects, contradictory religions. The very variety of faiths is remarkable, yet each can claim adherents of the highest integrity, sincerity and honesty, utterly convinced of their belief. How anyone can have the arrogance to think that their own belief is right and anybody who thinks differently is wrong passes my comprehension. Surely the overwhelming evidence is that the human mind has the tendency to believe firmly but incorrectly, since at most one of the many competing revealed religions can be right.</p>
<p>Nor am I much impressed by what some regard as threads common to different major religions as regards their theory. What has a non-theistic faith like Buddhism in common with a theistic one like Islam? Why are we to stress likenesses now when people have fought to the death over minute differences in their religions?</p>
<p>There is indeed a common morality among all of us humans, enshrined in the golden rule that one should do to others only as one would have done to oneself. I see this founded in our common humanity, which is why I call myself a Humanist. I see this common morality sometimes supported by religion, sometimes perverted by it. Above all we need to strengthen all that unites us with other humans, where religion so readily divides us. This division by faiths, so often pursued with the utmost cruelty, is what surely we should strive to heal, by relegating religion from the public domain to that of individual belief or non-belief.</p>
<p>What I abhor about revealed religion is its supposed absolute certainty. It is here that I see the real conflict between science and religion. In science we know that our understanding, our theories are only provisional, and liable to be upset by experiment and observation. On this basis, so well described by Karl Popper, science has indeed acquired universality with people of different cultures, ideologies, races etc able to cooperate. Science is so successful in this because it is attuned to the basic human characteristic of fallibility. It is the inhuman certainty a believer feels in revelation that is so obnoxious and harmful and unacceptable as a basis for morality.</p>
<p>Of course we must recognize the great role religion has played in history but need not support it. Being an anti-revelationist is in no way arid. It allows one to enjoy freely all that human genius has produced; it allows one to engage untrammelled in the search that is the real joy of living.</p>
<p><em>Sir Hermann Bondi, FRS, was Master of Churchill College, Cambridge.</em></p>
</blockquote>
]]></content:encoded>
			<wfw:commentRss>http://www.yes-no-cancel.co.uk/2008/06/10/hermann-bondi-arrogance-of-certainty/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Comeback from the stone age</title>
		<link>http://www.yes-no-cancel.co.uk/2008/04/29/comeback-from-the-stone-age/</link>
		<comments>http://www.yes-no-cancel.co.uk/2008/04/29/comeback-from-the-stone-age/#comments</comments>
		<pubDate>Tue, 29 Apr 2008 12:27:52 +0000</pubDate>
		<dc:creator>Johannes Hauser</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.yes-no-cancel.co.uk/?p=88</guid>
		<description><![CDATA[Seen this morning: PENNY, a German supermarket chain, sells typewriters.]]></description>
			<content:encoded><![CDATA[<p>Seen this morning: PENNY, a German supermarket chain, sells <strong>typewriters</strong>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.yes-no-cancel.co.uk/2008/04/29/comeback-from-the-stone-age/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
