Nabla is Not A Blog, Alright? It's just my random online ramblings, and any similarity is obviously some kind of coincidence. Nothing to see here, move along...
Notes on being a software developer 2: Things to learn from books etc.
First, the fundamentals. If you've got the right mindset, and you know what you want the machine to do, you can probably write something to make it do it in some half-hearted way. The trick is to make it do it fast (algorithmic complexity) and well (formalise the problem, consider corner cases, implement to spec). And that is algorithms, the core of the subject.
Most developers will, admittedly, spend most of their time gluing together other people's libraries. But abstractions leak and you'll need to look inside the black box. You may well need to write your own libraries. And remember that nothing a computer does is magic, so you should be able to get up to your elbows in gubbins on any part. So, even if your day to day job isn't coding clever algorithms, it should be your bread and butter. Put another way, do you want to be a mechanic, or an engineer?
So, algorithms, algorithms, algorithms. That includes data structures and complexity, and all that. I give two thumbs up to Cormen et al's Introduction to Algorithms. It covers the basics and does some very interesting advanced topics, and has lots of references to papers for further stuff.
As it provides a rich toolkit of ideas, plus useful vocabulary, add basic computer science theory. Computation theory with regular languages and context-sensitive grammars, lambda calculus, that kind of thing. First order predicate calculus. Probably learn ML, Haskell and/or Scheme.
Eventually, it all hits hardware, so learn computer architecture. Hennessy and Paterson will do the trick. Great book. Readings in Computer Architecture was also pretty awesome. Digital logic and maybe a little VLSI help round it out. Abstractions leak, so it's still useful if you never pick up a soldering iron.
Then there are the other staples of a traditional computer science undergrad degree: Graphics, security and cryptography, networking, compilers, databases, etc. They pretty much all have interesting ideas associated with them. Learn a bit of all of them, and dig deeper on the ones that pick your fancy.
You've probably noticed that I'm being heavy on the theory, and very light on the specific technologies. That's because it really is all about the ideas. The technologies keep changing, and while ideas change too, it's at a much slower rate, and it allows you to see those things that are invariant.
Learning specific language is necessary, of course. They're fundamental tools, and if you can use a language efficiently, things are pretty bad. Moreover, some languages are really effective ways of getting across concepts. On the other hand, some languages are sufficiently baroque that people become experts in the language and its tricks, not programming. *cough* C++ *cough* People can end up thinking they're designing in the abstract, when in fact they're designing in a particular language. Obviously, what you design must be implementable in the language used, and sometimes it makes sense to play to the language's features and idioms, but fundamentally, you should be designing something that architecturally makes sense even to someone who doesn't know the language. Again, as an example, some people have their concept of what object orientation should be broken by C++.
So, learn languages, but I'd recommend wide rather than deep. Both is best.
Then there are specialised algorithmic areas that have some really interesting stuff, but don't show up on the wider radar much. Good psuedorandom number generators and hashes. Compression and error detecting/correcting codes. That kind of stuff. Between my undergrad and now, machine learning has become a real subject with lots of interesting things to learn. Outside but near to computer science, there's plenty of discrete maths. Play with stuff like game theory, perhaps?
A lot of this is about getting the best possible toolkit to think with. So, practice is useful. There are lots of puzzly things that both give you practice, and may throw you new stuff too. Project Euler is fun, as are competitions like Google's Code Jam, and there is an apparently never-ending set of interview puzzles, all of which make good exercises. I'm sure you can find more.
This has been fairly algorithmically-focused. There's a lot of process and design stuff that is incredibly important, but I'm not sure it's something that's easy to pick up from books. There are books like Code Complete which I guess are worth reading, but I think this is particular area where experience seems to do its thing, or at least it's not well-enough served by books.
On the other hand, I'm actually quite positive about books on management and leadership. Too many people treat managing teams or leading projects as this incidental thing they fall into. Whether or not you like it or are interested in it, if you're going to do it, do it right. As managers like to write documents, there are plenty of decent business books in this area, and it's worth reading a few. Even if you're not managing, you can understand what your manager should be doing, how to make their life easier, and use them to make your life easier. Books can be useful because they allow you, even in a possibly dysfunctional management environment, to understand how things can and should work.
I guess the idea here is to be an idea infovore. Try to keep learning new things, but particularly favour those things that give you new ways of looking at the world, new tricks, and ways to get to the nub of it all.
Permalink. Posted 22:12, Sun, 19 May 2013.
Paper round: Java and trading
A number of months ago, I moved to a team which does algorithmic trading in Java. I had little expertise in algorithmic trading or Java, so I did a little bit of reading. Not a huge amount, but just enough to get an idea of the field. I also just read a bunch of papers that seemed fun that I hadn't got around to otherwise.
Making a Faster Cryptoanalytic Time-Memory Trade-Off - Philippe Oechslin This paper describes rainbow tables, which is a space efficient way of storing password reverse look-up tables. I'd seen them about the place, including being referenced by (but not understood by) Cory Doctorow - see the awful Knights of the Rainbow Tables for example. I'd read the Wikipedia explanation, but it never really clicked. So, I read the original paper, and it made sense for a while. It's slightly fallen out of my head now, but it clicked nicely for a while. :)
Implementing lazy functional languages on stock hardware: the Spineless Tagless G-machine - Simon Peyton Jones A while ago, I tried reading SPJ's book on lazy language implementations, which went via spines and all the rest. I move incrementally from finding it heavy going to getting lost. Given that the STG is core to GHC, and Haskell was my day job for so long, I thought I owed it to try to understand the STG. So, I read this extended paper, and it's very straightforward! Without spines and all that, and devoid of the memoisation part, the non-strict evaluation is quite clear, and the virtual machine it describes is not far off a strict machine. The adding of updates for memoising results is clever and subtle, but it written in a way that makes it feel not totally impossible. It's a really well-written exposition that makes the whole thing come together for me.
JSR 133 (Java Memory Model) FAQ - Jeremy Manson and Brian Goetz As the algo trading system I work with is (unsurprisingly) multithreaded, and the Java memory model is new since I last used Java in anger (i.e. around the turn of the millennium), I thought it would be a good idea to get up to speed on what Java guarantees. So, this document provides a nice summary of that.
JSR-133: Java Memory Model and Thread Specification This is the official spec itself. It's got a formal model. However, I found it quite disappointing. The informal rationale didn't really make things clear, and the formal model was simultaneously a bit woolly in places, and actually surprisingly unintuitive. It could just be that I'm a bit dumb. However, I like to pretend that I'm not, and even if I were, a decent write-up would allow me to understand it easilt enough anyway.
The JSR-133 Cookbook for Compiler Wrters - Doug Lea And this one tries to cover everything again from another angle - that of a compiler writer. A slightly interesting alternative, but not really terribly insightful.
Generics in the Java Programming Language - Gilad Bracha The other particularly funky thing that turned up in Java was generics. The basic idea of generics is simple. The full subtle complexity is... subtle and complex. You can do most stuff without worrying about it. I hoped this document might cover the other side. It didn't. I think I might need to go to the spec.
The Reyes Image Rendering Architecture - Cook, Carpenter and Catmull That's Java, so now a random other paper. And it's a bit of a classic. Reyes is the algorithm behind Renderman, Pixar's rendering system. So, it's a kinda lightweight, high-level paper, but it does give you the flavour of it, micro-polygons and all. It's all rather interesting, and well worth reading.
NVIDIA Fermi compute architecture Not actually related to Reyes! So, in my last role, we were looking at GPUs for computation. I realised that I didn't really know what the architecture involved really was. Turns out, it's basically a multiprocessor system where each processor is itself a SIMD processor with multithreaded scheduling (a la Hyperthreading). Not a surprise, I guess, but fun to understand where the state of the art is/was.
Ecology of the Modern Institutional Spot FX: The EBS Market in 2011 - Anatoly Schmidt EBS is one of the online exchanges. Some people access it through GUIs, others through algorithmic trading systems. It's interesting to see how it's used, and who makes money off who, etc. Also, it's interesting to see the size of the market, the overall structure, etc. This paper focusses on liquidity in this market, but does provide some other interesting background.
The Trading Profits of High Frequency Traders - Baron, Brogaad, Kirilenko This paper tries to work out who profits from whom, in terms of traders they categorised as aggressive, medium and passive HTFs, fundamental traders, small traders, non-HFT market makers and opportunistic traders. They did this by analysing data from S&P 500 eMini trading. It does this competently enough, not really revealing anything that will help make profits, but at least making the structure of the market that little bit clearer.
Does Academic Research Destroy Stock Return Predictability? - McLean and Pontiff This paper looks at published studies of stock market anomalies, and investgiates whether the returns decrease once they're published, presumably as they're arbitraged away. It seems they do. Not a real surprise. What's impressive is how non-commercial this kind of research is. Not only does it not actually produce a convenient list of stock market anomalies you might trade off, but it doesn't look at these anomalies in terms of risk-reward tradeoffs. So, it's not clear what the limits of arbitraging away are - has it just hit a level where the risk isn't worth it, or what? Useless social science research. Gah.
Is Everything We Know About Password Stealing Wrong? - Florencio and Herley This rather random paper is a simple one-trick pony. It points out that because banks can mostly undo disputed transactions, online theft mostly consists of stealing money from mules ('work from home, processing transactions!'), and it's waaay harder to find mules than compromised accounts. While it's not deep, it provides a fairly sensible reality check on the economics of password stealing. Fun.
A Parameter-Free Hedging Algorithm - Chaudhuri, Freund and Hsu I saw a presentation where this approach was mentioned for an online-learning decision theoretic system. It made sense at the time. Then I read this paper. It describes the simple algorithm, succinctly. It proves useful properties in an appendix. What it doesn't do is motivate the underlying problem. It explains the advance over the previous unexplained state of the art. As I'm lacking all the background in this area, I'm completely missing the point of what this is about and why I should care!
The Influence of Organizational Structure on Software Quality: An Empirical Case Study - Nagappan, Murphy and Basili Conway's law says organisations produce systems that mirror their organisational/communications structure. This paper extends that idea - that bug defect rates can be predicted from organisational information, when related to source control information - so if code is edited by lots of different people in various unrelated teams, you know you've got problems. The 'empirical case study' in question is Windows Vista. They claim good results, even compared to other metrics that look at the code itself, but it never really convinces me. There's some correlation/causation stuff going on here - I'm not quite sure if the analysis is picking up the right stuff, but it does make a very fun formalised anecdote.
Searching for Build Debt: Experiences Managing Technical Debt at Google - Morgenthaler, Gridnev, Sauciuc and Bhansali 'Technical Debt' is such a great phrase - it's actually pretty good at getting across to management why, just because it works right now, not tidying things up is a bad thing and will bite us in the future. All projects get technical debt. Well, either that or they're not real world projects, and probably won't ever be used. :p So, how to manage technical debt is a bit of an issue, and any ideas how others do it are interesting. Sadly, this paper covers a few very specific cases, and so falls rather short of my hopes.
CP-Miner: Finding Copy-Paste and Related Bugs in Large-Scale Software Code - Li, Lu, Myagmar and Zhou Another perennial issue is code duplication. At least it is where I work - I work with mathematicians and scientists who aren't necessarily highly experienced with good software development practices. And sometimes, they're up against tight deadlines. So, copy-and-paste is a way of life. So, code bloats, bugs are only fixed in one of several locations, etc. There are tools out there to analysis this stuff, so I thought I'd read up on them. This paper starts inauspiciously, but is actually rather good. The algorithm does some neatish stuff, the analysis shows more-or-less what you'd expect (most of the copy-and-paste in the Linux kernel is in drivers), and it does a rather neat thing of spotting copy-and-paste related bugs. The only downside is that the actual CP-Miner software is only available commercially, and the open source equivalents are nothing like as good. Ho hum.
Permalink. Posted 09:48, Sun, 12 May 2013.
Christminster, completed
I have finally completed Gareth Rees's highly recommended
After that, I zipped through the rest. Having spent so long, I was willing to take hints. The telephone puzzle is actually technically broken (grr), and the rusty key was just me forgetting about one aspect of the game. Finally, completion! What next?
Permalink. Posted 22:29, Thu, 09 May 2013.
Notes on being a software developer 1: Learning stuff
I'm writing up my thoughts on, if not how to be a better developer, at least how to be more like a developer who thinks like I do. I'll leave it up to you to decide whether this is a good thing or not.
Rather embarassingly, one of my strongest beliefs is that you can't make a good developer through paint-by-numbers, despite what I'm doing here being to create a big piece of paper with numbered regions. I guess what I'm hoping to produce is a guide for those with the potential, to help them shortcut the process.
Or, alternatively, I can give up the pretence that it's to help some hypothetical junior developer, and say it's a braindump of my worldview, because I'm arrogant. Doesn't really matter, does it?
Anyway, this is quite long, so I'll break it into chunks...
Learning stuff
- Learn from your mistakes Some people have n years of experience. A lot of people seem to have one year's worth of experience, repeated over and over. If you don't learn from your mistakes, it's not going to be much of a career. Learning from successes is actually a little dangerous - there are often multiple ways of doing things, and sticking with one that worked ok in the past may stop you from exploring even better solutions. This is not suggesting change for change's sake, but be wary of ignoring better solutions, having found a "good enough" one. With failures, the cause is often clear.
- Learn from others' mistakes If you only learn from your own experience, you can only gain one lifetime's worth of experience, at the rate of one year per year. If you can properly learn from others, you can learn so much more, and moreover you won't actually have to spend the time doing it wrong. When someone tells you "don't do it that way", it's very possible they're wrong (see other rules), but also that they have some experience or knowledge that you can learn off.
- Learn who to trust Vociferousness, experience and intelligence are all impressively unreliable indicators of whether someone's opinions are trustworthy or useful! Some combination of coherence and logic to their thoughts, effective domain-specific experience, empirical evidence and self-knowledge of the speaker might all be more helpful indicators, but it's difficult to articulate. Which is a shame, as it's really important to choose the right people to listen to (and no, "people I agree with" is not a good criterion unless you want to get stuck in a rut). But you trust me, right?
- Learn outside work This isn't a workaholic "do work at home" thing. This is just having "learn new stuff" as a hobby, something you do all the time. Sometimes it's tangential to work, sometimes it's unrelated, sometimes it's basically work stuff that's interesting and you don't want to leave it in the office (if there's nothing at work like that, get a better job). Always keep learning.
- Learn inside work Often, you'll accidentally learn things as a side effect of doing your job. Learning new stuff should be a big part of your job. It makes you more effective and more valuable, so it's a good use of time. Colleagues will often have practical knowledge that just isn't available in any books. Clearly it's more useful to learn the generally applicable things, rather than workplace-specific minutae, but there you go. Part of this is being very willing to ask questions, even if they sound dumb. Generally, if you phrase it in terms of wanting to clarify things, to make sure your inferred model is actually right, it doesn't sound so dumb. Especially if it reveals that you misunderstood something, or that the person you're speaking to doesn't really understand it either!
- Learn from the good stuff As well as finding clever people you trust to learn from, learn from the writings of the best. Find out what the best books are, and read them. Generally they're written by the real experts in the fields, and they're well-written, clear, comprehensive and just awesome. Why would you pick up some "in 24 hours for dummies" book, when you could get a Cormen et al or Hennessy and Patterson? Once you hit the end of standard textbooks, start reading the best research papers. Open source gives you the opportunity to read and learn from high quality code, and I should do more of that (but choose wisely!). And nowadays, there are online courses from some really decent academics (although the standard mostly seems to stay around first year undergrad), and some really funky research-level blogs. So much to learn.
- Learn hard Read up on how Ben Franklin taught himself stuff. It's astonishing. If you can even vaguely emulate that... wow.
I'm pretty sure all the above must sound really, really obvious, but I thought I'd spell it out because there just seem to be so many people who don't keep learning, and over time just put themselves at a disadvantage. Nutters.
Permalink. Posted 00:15, Sat, 20 Apr 2013.
The flaw in LinkedIn's endorsement system
Steve Jobs said that A players hire A players, B players hire C players. I have a suspicion that something not dissimilar happens with LinkedIn endorsements. Who would you endorse as having skills you feel you are strong at? Personally, I'd only endorse people who are really good at that thing - the endorsements are a reflection on my judgement, after all. So, generally, I end up only endorsing those better at the thing than I am. To endorse those weaker than yourself is either to say that you're incredibly brilliant, or have low standards.
However, such endorsements aren't really helpful, either. If you're looking for a top X, why would you trust the judgement of a relatively weak X in comparison? I guess endorsement allows the creation of self-regarding clusters of Xers who all endorse each other, but I'm not sure that's helpful either.
It's a bit of a difficult system to avoid, too. You see messages like 'Do you want to endorse Simon Peyton Jones's Haskell skills?', 'Do you want to endorse the Pope's Catholicism skills?' and 'Do you want to endorse this bear's arboreal defecation skills?', and... well, it's hard not to (NB: I've only actually seen one of those). So, top people get stupidly obvious recommendations from people that don't really matter.
About the only useful case I can see is something like a 'recommend a plumber' - you see a few recommendations like 'I'm an Xer, I needed a Yer as I know nothing about Y. Fred did all the Ying I needed, and now I'm happy.' Useful for consultants to sell their customer-facing skills, but not much else.
Permalink. Posted 08:28, Thu, 11 Apr 2013.
More ZX Spectrum Cross-stitch!
I couldn't resist doing a bit more ZX Spectrum cross-stitch. These ones are a bit less colourful, but still make me happy with retroness. For the less well-informed, these are a still from the Manic Miner "game over" animation, and the message that ZX Basic gave you upon successful completion of a command.
These bookmarks were late Christmas presents for a couple of friends, so they're finally unembargoed. Hopefully this has got the cross-stitch out of my system for a while!
Permalink. Posted 22:27, Wed, 06 Feb 2013.
I have destroyed a Swiss watch
I've killed a 17 jewel mechanical Swiss watch. All in a good cause, though. I've been intrigued by mechanical watches for some time, so I finally got an online course in how to disassemble and reassemble a watch. It's really the very basic stuff, but you have to start somewhere. I started with a super-cheap clunker off ebay, and I'm glad that I did, since I promptly destroyed it. To my surprise, I didn't total the balance, but instead broke the head off the crown wheel. Yes, I read that it turns the other way, but then I went and got it confused with the ratchet wheel. Oh, and I broke a little bit off part of the keyless works. Ho hum. I have learnt a lot, though.
I've taken a photo of it in its disassembled state, along with a few of the cool new toys^Wtools I've needed to get it into this state. One of the things I've learnt from these tools is how precise movements human beings can make - working manually down to a scale of 0.1mm is quite doable, and it's astonishing what you can see with a loupe and even just a little practice...
The movement in question is an FHF 28, which has the extra fun of an extra layer due to its way of doing centre seconds. I notice in the photo they have that their keyless works broke in the same way as mine. I think it must be a design flaw!
Anyway, at this point, I have disassembled it all into myriad parts, but I've yet to reassemble it. I do hope I haven't forgotten everything...
Permalink. Posted 22:54, Mon, 04 Feb 2013.
The Circle of Fifths part 2 - Scales
When I rambled on about the circle of fifths, one of the things baked into it was the way there are twelve semi-tones in an octave. Various web pages discuss the golden ratio as an explanation for the number of notes, on the grounds that if you include the octave note there are 13 semi-tones, 8 white notes, and then 5 black notes. Hmmm.
So, what's the real thing that controls the number of notes in an octave? It basically comes down to having a note that's very close to a fifth. That is, given even-tempering, so that each semi-tone is the same ratio above the previous note, there is a note very close to 1.5x the base note. As it happens, with a twelve note scale, 2^(7/12) = 1.4983.
Put another way, what we're looking for is good rational approximations to log_2 1.5. And a really useful tool here is ... continued fractions! Continued fractions rock, and I'm rather disappointed they never appeared in the first year of undergrad maths I did. In the end, I learnt about them as a side-effect of learning some number theory for Project Euler.
Continued fractions provide a nice way of created truncated rational approximations to irrational numbers. So, log_2 1.5 is 1/(1 + (1/(1 + 1/(2 + 1/(2 + 1/(3 + ...)))))). According to the Wikipedia page algorithm, the approximations in increasing order of accuracy are 1, 1/2, 2/3, 3/5, 4/7, 7/12, 17/29, 24/41.
So, a 12-note scale is as good as it gets, unless you want a 29-note or 41-note scale. And that sounds excessively complicated.
Where did this Fibonacci/golden ratio business come from? Well, phi = 0.61803..., and log_2 1.5 = 0.58496..., so if you're looking for mystical numbers around 0.6, you might assume the golden ratio's involved. It comes up everywhere else, and 7/12 makes you think of 8/13. A case of mistaken identity!
Permalink. Posted 22:09, Mon, 04 Feb 2013.
Miranda and the Whale
Bit of an odd one here, so bear with me.
Miranda, our just-under-two daughter, is fearless. She clambers all over the furniture, and when she falls off she doesn't cry. She looks very thoughtful, and then tries to do the same thing again, until she can do it without hurting herself. The upshot of this is that she's very good at moving around, far better than David at the same age.
Apparently unrelated is the "London Whale", which lost several billion for the CIO at JP Morgan. They traded illiquid, exotic credit products. By no means the key feature, but one of the things you always see in exotics blow-ups, was the dodgy marks.
To mark a trader's book to market price, you need the prices of various things. If the things are illiquid and trade irregularly, this is more difficult than it sounds. Normally, these illiquid price marks are supplied by the traders, since they're active in the market and know what the current situation is. Some sanity-checking is supplied by risk management groups, but that's about it.
So, unsurprisingly, when things are going badly, there's a tendency of traders to set their marks to avoid showing a loss. This can cause further problems, since the marks they use to price their positions are also used to price new trades - if you've got a lot of X, and are mismarking the price high to make your position look better, you'll be saying 'ooh, everyone else is selling X cheaply' - if you can't keep up the doublthink, you'll just make things worse.
The thin, theoretical, connecting idea is 'take the pain early'. If something's going badly, covering it up or ignoring it will just make things worse. If you take the hit early, you can do it in a more controlled manner, learn from it effectively, avoid having waste time papering over the problem, and get it right next time.
This is, anyway, my view on software development (although I think I've said this several times now, in different forms). Sometimes, due to time constraints or whatever, you can't get it all right from the start. Finding the techniques to make sure that the stuff that needs to get done gets done regardless is an important skill.
On the other hand... pretty much everyone senior in the Eurozone seems to believe that trying to resolve problems is somewhat dangerous, and kicking the can down the road for years on end is the best solution. So, what do I know?
Permalink. Posted 21:00, Mon, 04 Feb 2013.
Analysing the musical 'circle of fifths'
I'll start with a detour, but bear with me.
It's fun to read children's stories to children. David is just starting to read, so he's starting to learn the concept of rhyming. This means I sometimes explain the rhymes in the story. It then comes back to me how much of the enjoyable flow of these stories is simply rhythm and rhyme, which in turn are complex terms for 'has the right number of syllables in each line' and 'ends with the same sound'.
Of course, this is all the same stuff I did at GCSE, but I never really cared for poetry. When it's a fun story for small children, I haven't bothered to look under the hood at how it's all put together. When I do so, it kind of takes the joy out of it!
Similarly, I have the score for Beethoven's 9th kicking around somewhere. It's amazing how something so magical can, on the page, boil down to what is an awful lot of repetetive, boilerplate-like structure. Working within that framework, there's still enough freedom to create pieces that are fantastic or dross, but having the framework revealed is very odd.
I find the structure of classical music particularly odd, since some things are driven by convention, and some by more fundamental concerns, and it's sometimes a bit unclear which is which. For example, particular intervals sound good because they're very close to simple frequency ratios. At the other end of the spectrum, there are standard forms, like the sonata, which are arbitrary, but provide a framework to build in, and listen in. Inbetween, you have things like the number of notes in an octave, which is semi-arbitrary - the precise number of notes is just something we've learnt to get used to, but the choices are constrained by choosing a number of notes so that our scale can include those simple ratio intervals.
So, in classical music (especially harmony) there's quite a lot of things to learn that 'just are', and it's difficult to tell if they sound good because we're used to it, or because they're fundamental. One thing you learn is the 'circle of fifths' - C major has no sharps or flats, and if you repeatedly go up a fifth you add a black note each time, passing through G major, D major, A major etc. When you run out of black sharp notes the other notes get sharpened, until you work through all the scales, and finally your adding of sharps turns out to be the removal of flats, and you end up at C major again.
Is this something that only works for a conventional major scale, or would it work with whatever interval structure we chose?
Let's start with why we cover all the possible starting notes by repeatedly moving up a fifth. There are 12 tones - an octave has eight 'white notes', including the repeated note, so that's seven notes, plus the five 'black notes'. A fifth is actually an increment of seven semi-tones. Seven and twelve are co-prime, so repeatedly adding a fifth (as addition of seven modulo twelve) will cover all the values. So, depending on the interval chosen and the number of notes we fit in our 'octave', this coverage may or may not happen.
Next, why does moving up a fifth add a sharp? Well, the intervals for a major scale are 'T T S T T T S', where 'T' is a tone (two semitones) and 'S' is a semi-tone. If we rotate the sequence along four elements, to start on the fifth, we get 'T T S T T S T'. The change between playing C major notes starting on G and playing G major is in the final two steps - whether we do a tone followed by a semitone, or vice versa, and that's basically whether the last note is sharpened or not.
What other schemes would work like this? Well (assuming even tempering, so that all 'semi-tone' equivalents are the same size), if your 'scale' consists of a set of 'tones' and 'semitones', the requirement is that there is a rotation of the scale that differs from the original scale by swapping around a neighbouring pair of 'S' and 'T'. For a scheme like the major scale, with a scale of 5 tones and 2 semitones, I think the only scheme where this works is a rotation of the major intervals (which is why the normal circle of fifths doesn't work for minor keys). For scales constructed differently, there are other possibilities.
Finally, why, when we start on C major, do we sharpen the notes whose sharps are black notes first? This is simply because our cycle of scales moves up a fifth each time, so that we're doing an 'almost half octave' step each time. The black notes are in two groups (of size two and three), and we alternate sharpening a note from each group sequentially.
Overall, what does this tell us? We could construct a very similar system with, e.g. decades and a 'circle of sixths', with a group of three black notes then four black notes. Alternatively, by tweaking the intervals used, you could get very different systems where, for example, different base notes share exactly the same notes in their scales. This probably makes modulation somewhat interesting.
At the end, I'm once again left wondering what is really fundamental to music, and what's arbitrary (but fascinating) structure.
Permalink. Posted 22:39, Thu, 13 Dec 2012.
A ZX Spectrum tape-loading themed scarf!
Enjoying my recent success in crafty things, I had a go at creating something I'd wanted for quite some time: A ZX Spectrum taped-loading-themed scarf! A very rough introduction is that the tape loading was accompanied by horizontal coloured bars in the border area: it consisted of even red/cyan bars (accompanied by a constant tone), followed by uneven yellow/blue bars, representing the data itself. So, I wanted a scarf with red/cyan stripes at the end, and yellow/blue in the middle.
I'd wanted one for a while, but didn't think exactly anyone would want to make one, certainly not to my odd and exacting standards! After doing the cross-stitch business, I realised I could do it. I'd never knitted before (or since), and I only learnt how to knit - still no idea about purling! However, this was enough to get me going.
So, the gear: I opted for 6mm bamboo knitting needles. Apparently wood is good for beginners, as it's not too slippy, so less chance of it all falling off the needles (a bad thing). 6mm is reasonably chunky, but that goes with chunky wool, if you want to do chunky knitting! Chunky knitting means less knitting to cover a certain area, and given my impatience, that seemed like a good thing. For a scarf, chunky knitting looks pretty ok.
Then, I bought wool. This was a surprise - the colour selection is rubbish. Even going to a big online store, it's pretty limited. I wanted really bold primary colours, and... the selection is poor. I ended up with 50g balls of 'Bergere de France Magic+' in Brique, Estuaire, Seneve and Gynura. This is a 'chunky' wool, and handy interweb scarf calculators told me that 4 balls would be about right for the scarf I was looking at. I should have perhaps got a bit more, but at around a fiver a ball, I thought it'd be overkill.
So, onto the knitting itself. Each stripe was a multiple of two rows (across and back), so that the colour-changing always happened on the same side, simplifying my life. I was quite surprised how easy it was to get a reasonable pace with the knitting - nothing insanely fast, but it doesn't take much practice to get going. Of course, in places I have uneven tension or lost or gained stitches and other weirdness, but reasonable uniformity came pretty quickly.
Within the tolerances of the pattern, I tried to get the relative band widths correct, using the timings from The Spectrum ROM Disassembly. This book also provided me with the details of the encoding and colour pattern. So, I knitted the lead tone, the sync bit, the data block header byte (0xFF, I think), 'SF' in ASCII, and then the parity byte. Unfortunately, at this point I ran out of wool (I had plenty of red/cyan, but not yellow/blue). So, I finished off with another red/cyan lead tone. (One thing I hadn't really realised before is that you can only get one side having the exact pattern you want - the other side switches the lines about a bit - you can see this in the picture with David).
In effect, the scarf contains data as if the end of the data block were overwritten by another lead tone, just like the kind of stupid mistakes I used to make when recording on the Spectrum! If you tried to load this scarf, the data would load successfully, but you'd get an 'R Tape loading error'!
How happy am I with the result? My itch has been scratched. The quality's not perfect but it's good enough, the length is just a little short (but I couldn't face spending 30 pounds just on wool!), yet it covers my neck, it has pedantic attention to detail on the technical side, I learnt to knit, and it amused my work colleagues. Can't ask for more than that, really...
Permalink. Posted 23:16, Sat, 17 Nov 2012.
8-bit cross-stitch: Head Over Heels
Buoyed up by my 'success' with the CPU display, I decided to take on another project I'd been thinking about for some time: 8-bit pixel art cross-stitch. It's always struck me that cross-stitch is basically pixel art, except it was pretty much unaware of it. And there's some rather good, or at least nostalgic pixel art from the '80s and '90s that deserves resurrection. And cross-stitch can't be that hard, can it? I mean, it's all girly and everything.
So, I cross-stitched up the main characters from the ZX Spectrum classic Head Over Heels, and you can see the results thereof. I think it's turned out rather well! On the details side, I used 14 count white Aida material, with size 24 cross stitch needles. I used 3 strands of DMC stranded cotton, colours 310 (black), 996 (for cyan) and 307 (for yellow). The design was taken from a screenshot, with the image fed through www.myphotostitch.com to convert it into cross-stitch instructions.
It turns out that cross-stitch is quite straightforward, but rather slow and tedious. I'm glad the image wasn't too large! I was originally considering making it double-size (4 squares per pixel), but decided not to as it was my first go. I'm very glad I didn't. I still have to mount the result, which will be a whole new pile of hassle, I'm sure.
After doing this work, I discovered spritestitch.com, which does lots of this kind of stuff. I didn't sign up. I mean, they've all got to be weirdos, right? ;)
Permalink. Posted 16:40, Sun, 30 Sep 2012.
Minor successes in geeky craftwork: RISC CPU display
Hennessy and Patterson's Computer Architecture: A Quantitative Approach was the first really good textbook I'd read. It was a proper advanced undergrad textbook. It covered things from first principles, with detailed, highly-readable explanations that went on to advanced topics. The references to the literature both demonstrated how this tome of most of a thousand pages was but an introduction to the field, but also showed how to get to the cutting edge of the field.
And one of the great things about it was its real quantitative approach. Design trade-offs were illustrated through benchmarks, applications of Ahmdahl's Law, etc. The excitement of the RISC revolution was palpable in the text. At the time, if you wanted a high-end workstation, you didn't get an x86 box, you got a computer with a RISC CPU. Five main architectures were used as illustrations throughout the book - the Alpha, PA-RISC, MIPS, Sparc and PowerPC. Now, only Sparc is holding on by its fingernails in the server market. We're in a strange world where x86 is still king.

When decorating my study, I wanted to produced something which harked back to the time when it briefly seemed that logic might beat marketing momentum. A little display of examples of each of the architectures (with chips taken from decently spec'd machines - no getting the MIPS out of a washing machine!). This was a few years ago. A bit of ebaying got me a selection of chips - along with some rather funky heat sinks. However, with children, I never found the time to assemble the display.
Finally, though, I have. Not particularly impressively - it's rather half-done, but as a parent, this is small victory for me! Ideally I'll find time to build it into something more case-like, but I'm surprisingly proud of myself!
Permalink. Posted 23:15, Sun, 26 Aug 2012.
Online courses
I've been taking some online courses through Coursera, a co-operation between Stanford and some other places. I originally signed up for 'Model thinking' (a rag bag of quantitative and semi-quantitative models for social science), 'Game theory' and 'Information theory'. The latter shows one particular problem: The course never happened. The other two merely had delayed starts. The other problem is that the courses tend to be very much introductory-level. They assume nothing, and tend to let mathematical detail slide. Don't pretend for one moment that you're attending Stanford. I guess they're pretty accessible, though.
The content is generally pretty good, there are review quizzes and the like, and one of my favourite features is that you can watch the course video at up to two times normal speed. It turns out I can listen a lot faster than people like to speak, especially if I can pause the video for a good think should something subtle or clever appear.
The 'Model thinking' course was mathematically very simple, but it opened up a new way of thinking to me. I'm used to the physicits 'Put it all together into a unifying framework' approach, with models that can be complex, but are (mostly) testable. Social science lacks a grand unifying theory, but instead you can make do with very simple models to get a qualitative feel for behaviour. When you have a problem, throw as many different models as you can at it, and see how they agree.
'Game theory', as an introductory course, didn't feel like it taught me much I didn't more-or-less know already, given I'd read a not-very-good text on the subject a few years ago. On the other hand, it pressed the knowledge home, through explanation on videa, plus a variety of quizzes. It demonstrates the strengths and weaknesses compared to, say, just buying yourself a big textbook and reading it. The depth won't be as great and/or the pace will be slower, but it seems to me to be a much more effective way of jamming knowledge into your head, even if there isn't personal feedback from a tutor.
I'm now working through 'Machine learning'. My view of AI is... not wholly positive, and I'd kinda got the feeling of this kind of stuff as people farting around and not really getting anywhere. However, in the last decade, it's all started to click. I mean, Google's doing translation using machine learning and statistics, cheapo cameras have face recognition built in, and all this kind of stuff. Huge data sets and huge computing power, plus new research have made all kinds of stuff possible.
'Machine learning' isn't trying to screw around passing the Turing test, either. What I realised when looking at this course is that it's basically fancy forms of regression and classification. Indeed, the course starts off with linear regression! In a sense, it's all super-fancy curve fitting. There are programming exercises, including cool stuff like using neural networks for OCR.
As I said, though, these courses tend to be rather introductory level. The pacing of the course and exercises means that you've got a better chance of absorbing the basics than just picking up a big textbook, but I don't think it really suffices. So, I have bought a big book, namely Bishop's Pattern Recognition and Machine Learning. I'll see how I get on with that.
And there are plenty of courses I'd still like to do. I'm having to ration myself to prevent either me or Caroline from going mad, but it'd be nice to do the machine vision/image processing course...
Permalink. Posted 00:00, Tue, 31 Jul 2012.
Mac-fixing Encore
To follow up on the MBA fix, here's what I then pulled out of the DVD drive of the Mac Mini that sits by our TV. Ah, children.
Permalink. Posted 23:19, Mon, 16 Jul 2012.
Fixing a Macbook Air
Aaages ago, my Macbook Air broke. The hinge snapped, doing nasty things to the display, and the official Mac repair route would have been so pricy that it didn't quite seem worth it. I bought another MBA second-hand, and my original came in handy for spares when the disk in the second MBA broken (the disk in my first MBA was already a replacement. They don't seem terribly reliable to me).
I finally got around to replacing the broken hardware. Getting an Official Mac Guy to fix it would have cost too much (especially given it's an original 2008 MBA, and thus nicely out of date). Using official Apple parts would have been too much. Even dodgy OEM parts where expensive enough.
What actually happened is the metal hinge 'clutch' broke in two. The pieces spronged, and ground against nearby bits of case to break the plastic antenna cover. Also, because the hinges were no longer lining up properly, moving the display caused the hinge assembly to eat into the LVDS cable, until the screen display stopped working.
So, new hinge clutch, antennta cover and LVDS cable. Oh, and a replacement disk, for the one I scavenged earlier. I settled on an SSD this time. Maybe it would be more reliable. I had to fully disassemble the MBA, including ungluing bits of the display assembly. It now works (better than the other one, thanks to the SSD), even if the display bezel looks a bit dodgy.
I feel I should put up a picture of all the broken parts (apart from the disk - that just looks like a working disk), so here you go. You can see the hinge clutch in two parts, the antenna cover with a bit broken off, and the LVDS cable with a bit of a cut in it. All very exciting.
Permalink. Posted 22:17, Sun, 15 Jul 2012.
Paper round: Distributed Systems
I recently realised, to my horror, that I really haven't been keeping up with a fair old amount of research. So, I decided to catch up on 'distributed systems'.
Back when I was an undergrad, we had a course on 'concurrent systems', which dealt partly with locking and race conditions and stuff, and then did a little bit of the distributed end of things. When I finally discovered that expensive 'middleware' was mostly fancy networking wrappers, I was remarkably disappointed and underwhelmed. Making a pile of networked PCs work together sounded like the boring end of research.
Since then, distributed systems have become Key To The Web. Real practical uses, with real practical requirements, instead of half-hearted academic systems. Things have come a long way, and I just haven't kept up. It's rather ironic, really, since my first job out of uni was working on compute farm infrastructure. This managed a queue of short-run compute jobs, and associated data distribution, so it's nothing to write home about algorithm-wise, but it taught me a lot about middleware, failure modes and performance. And then, I moved on.
I didn't track the research, which is a bit embarassing given all these famous papers by Google, etc. I am finally catching up on the area. It's not exactly practical to set up a grid at home, but reading the theory is better than nothing...
Here are my paper mini-reviews. If I were still acting like a proper researcher I would provide a proper bibliography, but these papers are well-known enough that paper titles suffice.
The PageRank Citation Ranking: Bringing Order to the Web Ok, it's not a distributed systems paper, but given it's what effectively kicked off Google, it seemed worth reading. I'd never actually read the details of the PageRank algorithm, but it's both simple and clever. Elegant, in other words. The paper is well-written and clued up, and having read this paper I don't begrudge Brin and Page becoming billionaires! It also reminded me how bad web search had got by that point. The example web search for 'university' had a 'major commercial search engine' producing more-or-less junk, whereas PageRank produced something basically sane.
The Anatomy of a Large-Scale Hypertextual Web Search Engine 'Large-Scale' is severely relative. What was large back then is tiny now. This paper is pretty much a partner to the PageRank paper, focussing on system architecture, both in terms of how the algorithms are glued together, and how they are parallelised. This early-days paper is perhaps not terribly relevant now, but is certainly interesting (if only from a historical perspective :p ).
Web search for a planet: The Google cluster architecture A not terribly informative introduction that tells you that web search is parallelisable, and that best bang-per-buck is achieved with mid-range commodity hardware. Hmmm. Well, maybe it's not that bad, but it's not an earth-shattering paper.
The Google File System A very readable paper, describing Google's big, distributed file system (used to manage hundreds of terabytes across thousands of machines). Demonstrates a nicely simple design - don't bother with full Posix, and everything is entirely user-space (building the server on top of the native file system). Funny consistency model, neat operation log.
The Chubby lock server for loosely-coupled distributed systems After the GFS paper, quite a disappointment. Chubby provides a smaller, simpler file system with locks on the files. It's mostly used to hold configuration for other systems, provide locks for other systems, and provide a mechanism for other distributed systems to elect masters. Bits of it feel like a nameserver with subscription/non-timeout-based caching. This all-in-one behaviour, coupled with a lack of details on common usage patterns made it feel like a bit of a wishy-washy design to me. What problem is it really trying to solve? Yuck, yuck, yuck.
Bigtable: A distibuted storage system for structured data This is really just a key-value-ish database built on top of GFS. It feels like another simple design, done well, explained well. The immutable design is rather neat.
MapReduce: Simplified data processing on large clusters Another popular yet underwhelming paper. The idea is simultaneously neat and straightforward: perform your grid calculations in a stylised map and aggregate format, and use that structure to make the gridded implementation simple, efficient, etc. The 'reduce' part is basically a monoid, and I think that whole area could be cleaned up by talking about it in that way. But maybe I'm just a Haskell nutter.
Dynamo: Amazon's highly available key-value store The paper doesn't start off well. It reads like an advert for just how darned brilliant Amazon are. As you read through the paper, either the advertising falls away, or you become immunised to it. What's left is a rather good paper. Dynamo is highly optimised for uniform good performance, and the paper goes into a lot of detail about the engineering. There's a lot more focus on the algorithms than in the Google papers, and it feels like they've built a complex system on top of pre-existing research, in a way that the Google descriptions don't. It's well-written and introduced me to various algorithms and ideas I hadn't really looked into before.
MAD Skills: New Analysis Practices for Big Data Randomly recommended by Wikipedia as an influence on modern Big Data stuff, this paper tries to sell the idea of agile practices for Big Data, rather than creating a fixed schema data warehouse and making it difficut to pile new data in. The start of the paper felt like it was trying too hard (UrbanDictionary.com reference, anyone?), but it soon settled down, into a spirited promotion of an agile approach (at the expense of a somewhat straw man waterfall approach, it seems), plus an explanation of how to do linear algebra and machine learning in SQL. I'm still somewhat in two minds about SQL for Big Data, but there you go...
Apache Hadoop Goes Realtime at Facebook Another Wikipedia recommendation, and the first one to really dig into using Hadoop. A fairly neat introduction into what it takes to pick up Hadoop and use it for a production-scale and production-reliability real-time system. An interesting combination of showing that such open source software in this area can be used for the big time, and what customisations Facebook needed to make. Technically, compared to most of the above papers, it's not much, but in effect it's showing a flattening in the technology curve - you can now get the kind of software previously only described in research papers off the shelf.
So, I've read enough papers to keep me going for a while. On the other hand, it's shown me how much more stuff I should really read up on. Cool things that I read about that I want to read more about include:
- Paxos and Byzantine fault tolerance, etc.
- Gossip protocols and anti-entropy protocols
- Bentley-McIlroy compression
- Turbo codes, erasure codes
- Consistent hashing
- Merkle trees
- Log-structured merge trees
Plenty to go!
BONUS PAPER: "Ulrich Drepper's How To Write Shared Libraries" One of our work projects is a library that works on both Windows and Linux. Once I'd discovered that Linux shared object files are dynamically linked purely by symbol names, not by library name/symbol name pair, I couldn't believe it. Ok, it gives you LD_PRELOAD, but leads to so many other possible brokennesses. I went and read a paper on it:
This (long) paper goes into excruciating depth about how it all works, and all the pain involved. It's very readable and comprehensive, and tells you all the things not to do, plus various micro-optimisations suitable for making a really efficient libc. It covers API and ABI compatibility, and is generally a good read, even if you can tell from time to time that English isn't his first language.
Permalink. Posted 07:36, Wed, 11 Jul 2012.
Apple's Time Machine
Apple's Time Machine is an interesting piece of technology. The GUI front end is friendly, simplistic, and brittle. I used it to back up my Macbook, but once the disk died, I couldn't use the GUI to access the network back-ups of the lost machine!
I think this is partly as I had to tweak it originally a bit to get it to back-up to something that wasn't a particularly Apple-supported network share, or perhaps it was because the Mac I was trying to access it was used a different version of MacOS.
Anyway, I was now somewhat frustrated - perhaps vexed even, as I couldn't access my back-ups. However, behind the scenes, all it is is a 'sparsebundle' disk image, containing a list of dated directories containing the entire filesystem as of that time. Very straightforward to access once you're in!
It wasn't quite that straightforward in practice - there were random error messages until it got itself into a sensible state, but still... not bad!
The technology that makes this work is rather nice. The collection of directories share as much data as possible with minimal overhead by having hard links on directories! And to decide what needs to be backed up, the filesystem tracks directories modified since the last back-up. The transparency and simplicity of the back-up system, within the constraints of it being efficient, are rather neat. It does feel (to me, at least) that while it pushes the technology forward, it stays true to the Unix approach.
And then they put a glitzy but brittle GUI on top. :)
Permalink. Posted 21:36, Thu, 10 May 2012.
Ignoring non-sensical solutions
Warning: Contains simple puzzle spoilers!
On and off I've been working through Rae's Quantum Mechanics. One of the things that annoys me is the way the solutions to problems are chosen. Take a particle in a potential well: A differential equation is given, solved, physically non-sensical solutions are discarded, and what remains must be the solution.
How on earth can we justify only considering the 'sensible' solution? Can we really claim we have a mathematical model if it involves such judgement calls?
It reminds me of a simple mathematical puzzle. We have a cell, which, every time step, dies with probability 1/3, or otherwise splits into two cells, each of which follow the same rule in future timesteps. The children behave independently. What is the probability it survives forever?
If x is the probability it doesn't survive forever, this is because it either dies in the next step, or both children must die - as they are independent, the probability of this is x^2. So, x = 1/3 + 2/3 x^2. Solving this gives x = 1 or 0.5. Ignoring x = 1, we have the answer x = 0.5. But... why can we ignore the incorrect solution of certain death?
One way to look at it is as a convergence issue. If you plug in numbers around 0.5 into 1/3 + 2/3 x^2, you get a number closer to 0.5. If you plug in a number just under 1, it'll end up further away from 1. This shows that 0.5 is a 'stable' answer, but 1 isn't.
However, this still isn't convincing. What you can do is define x_i as the probability of the colony dying out in i steps, with x_{i+1} = 1/3 + 2/3 x_i^2. We can then say the colony death probability is the limit as i goes to infinity, and show that the limit is 0.5, if x_0 is in the range [0,1), thanks to the above convergence properties, and solving the quadratic is simply a way of finding the converged value for the real solution.
In a similar fashion, I believe the differential equation isn't the real quantum problem. It's an equation that represents possible equilibrium solutions, but misses out how to select the real one from all the possibilities it brings. Somehow, there is a mathematical phrasing for the real problem we're trying to solve, one that eliminates the 'non-sensical' solutions automatically, but I'm not sure what it is.
Permalink. Posted 21:44, Sat, 24 Mar 2012.
On Simplicity
Again and again I see designs that are just... wrong. Given the choice between doing it right and doing it simple, the choice is 'simple'. I'm not the only one to notice this.
Unix is a traditional example. It's as simple as possible. Look at the file API. Let's say that you want to read some data. You can read(2) it. If we're streaming the whole file in, a bit of prefetching would be nice. If we're trying to read regularly-sized records from all over a file, we want different behaviour. The API ignores the notion of record sizes and advance knowledge of access patterns. Any optimisation has to be heuristic.
Perhaps worse is atomic file update. Also known as: Write to new file, hard-link old file to back-up file, move new file over old file. Moreover, there's magic in the semantics of the file system to make this work: the writes to the file should be committed to disk before the move is, so that you don't get a partially-written file in the case of a crash. Ugly hacks to do a conceptually simple operation.
Or look at TCP/IP. The internet's favourite way of distributing video. There's no quality of service built into basic TCP/IP, and a packet-switched network is just the wrong solution for that kind of problem.
Way back, PalmOS was another example (like MacOS before it) - a co-operatively multitasking OS. Simple to design and implement, great for lightweight hardware, able to beat its pre-emptively multitasked competitors, but you know that in the end it'll be running on more powerful hardware doing more complex things, where the original design looks horrendously underpowered.
This is really the key. A complex up-front design will be bad, since you don't understand all the corner cases, people don't want all the infrastructure yet, the current hardware can't really support it, and you'll be trounced in the marketplace by a simple design that works right now.
Real good design is one that can evolve. Unix started off life as an OS for an impressively under-spec'd machine. Amongst other limitations, one thread per process, no real-time guarantees, uniprocessor model, certainly no graphics or networking. Even when Unix had been extended, Linux started as an x86-specific (and uni-processor) monolithic kernel Unix clone. Somehow, the design has extended in a way that has not fundamentally compromised it. Sure, it's warty, but it's somehow scaled. TCP/IP's not dissimilar. Despite its grimness, video over TCP/IP is working out mostly ok.
Sometimes, even ignoring the evolution over time, doing the simple wrong thing is better than the right thing. When I heard about Erlang's error model (basically, 'if something goes wrong, explode, and have a hierarchy of processors that can deal with children exploding rather than attempting to recover), I was horrified. However, it sounds like the right thing, in the end. Error handling is tricky to write, tricky to test, and errors in error handling could make for a nice cascading failure. Playing it dumb may make it more safe and reliable than trying to think hard.
Einstein's maxim to make it as simple as possible, but no simpler, seems to have some relevance. The lesson is, make it simpler than you think 'possible' would be - you might be surprised. On the other hand, make it extensible, so that success and scope-creep doesn't induce long-term failure.
Permalink. Posted 22:19, Mon, 27 Feb 2012.
Differential 1-forms, again
A while ago, I tried to make sense of differential 1-forms by thinking of how they're used in terms of a type system. I got approximately nowhere fast (you can see the post!). I've recently been reading Penrose's Road to Reality which describes them in near-laymans terms, and this seemed a good point to try again...
He defines a vector field as a function which takes another function (from point in the manifold to real value), and then at each point calculates some weighted sum of partial derivatives, effectively taking the derivative in a particular direction. i.e. a vector field E acts on a scalar function P defined over the manifold - E(P). E is of type (Point -> Real) -> (Point -> Real).
He then defines dP = dx dP/dx + dy dP/dy + ... (you'll have to insert your own curly 'd's as appropriate), where the dx and dy and so on are differential 1-forms. Thus, ahem, dP . E = E(P). For some variant on a dot product which is not really explained.
So, one minute E is a function, then it's something you can dot product. This is probably in some vector space of functions, but the whole thing seems pointlessly messy to me.
Instead, I decided to build my own mental model, and assume that whatever the books have is just waffle disguising it.
Taking the derivative of a function is basically identifying the linear transform which it's locally like. In Euclidean space, taking the derivative of an R^n -> R^m gives you the information about a R^n -> R^m linear transform. However, for a manifold, the space you view the derivative in (a vector space) may be rather different from the space you started in (something curvy). This space for derivatives is the tangent space. The tangent space for the reals is the reals, so if you have a manifold M with a tangent space T, taking the derivative goes from M -> R to T -> R.
Thanks to duals, this derivative can also be represented as a member of T. We can then view the dx, dy etc. as basis elements of the tangent space, and dP as an element of the tangent space representing the derivative.
The tangent space allows you to tack directions onto each point. In other words, a vector field can be viewed as supplying an element of T for each point in M. This gives us a way of representing E. Happily enough, dP . E gives us the thing we were after.
Overall, I think I prefer defining dP as a function of type T -> R. That way, the covariant scaling effects are made clearer, and you can probably get away without a dot product (haven't thought hard about that second part). If you change the scaling on your parameterisation of the manifold, when your vector field entries increase in measured length, the derivatives (as a function from the tangent space to reals) also decreases proportionally, so that the final number you get stays the same.
I'm sure I've swept a lot of clever and subtle issues under the carpet here. However, I feel the textbooks' approach does too. Until I can get a handle on an alternative well-typed interpretation of these things, this will do nicely for me.
Update: I underestimated Penrose. In a later chapter he clarifies his handwavey notation. I have been using the dual of the normal convention. The vector field assigns each point an entry in its tangent space. Interpreting this vector space as the space of functions taking derivatives in all directions gives us the 'vector field as differential operator' view of the world. Treating it as a vector space, the dual space of functions from the tangent space to the reals is the differential 1-forms, aka covectors. The 'dot product' we see is really just applying one to the other, and we end up with the contravariance between the two as needed.
Penrose's rather nice geometric intuition of this is that vectors work as you expect - their length is as you understand, but covectors represent density. 'dx' can be thought of as the hyperplane of points perpendicular to the direction x, and represents a density of stuff in that direction. I'm still working through this chapter, but I expect this will fit quite nicely with integration...
Permalink. Posted 21:50, Mon, 13 Feb 2012.
Sony and Abstraction Failure
Being nasty to Sony is a bit like shooting fish in a barrel, but it won't stop me doing it.
I saw a Sony iPod dock in the shops the other day. Gosh, how the mighty have fallen. Sony admitting the existence of another company? I had to check that the dock wasn't only compatible with a special Sony iPod-alike.
The combination of seeing this Sony stuff and having played with my ickle iPhone for a few weeks made me really think about the differences between Sony and Apple. They both produce beautiful looking products. I'm generally a fan of Sony hardware. The firmware is mediocre at best. Their software is awful (Remember when they tried to install rootkits on customers? Pretty par for the course, really). Their networks have been hacked and millions of customers' details lost.
Fundamentally, Sony can't do abstraction - the more high-level something is, the worse it is. The WalkMan worked because it was largely mechanical, and lacked any embedded CPUs. I suggest Sony move into brick-making.
Permalink. Posted 22:43, Thu, 17 Nov 2011.
Good Stuff: Raspberry Pi and the iPhone
Looking at this blog, it seems I mostly complain about stuff. In real life, I'm not that bad (honest), but this is where I end up writing down various frustrations, and it's relatively rare that I actually want to point out good things. I thought I'd make an exception to my normal posting style...
Raspberry Pi is a fantastic project, aimed at getting children back into programming - the '80s BASIC-powered home computer for the new century. While there are a lot of people involved who I really respect and trust (including my PhD supervisor, and an ex-employer), it's really the brainchild of Eben Upton, who is a really great clever guy. I can't really say enough good stuff about it, so I'll give up now, and instead encourage you to go scour the web for more information.
From the very open to the very closed. I have finally bought myself an iPhone 3GS, after years of having the dumbest of phones. In the past, what I wanted was a small phone without all the pointless and expensive extras. Having had years of PDAs, I'd generally found them nice toys, but useless in practice. So, while I could see the use others were getting out of their smartphones, especially in mobile internet, I couldn't be sure it wasn't going to be another pile of wasted electronics.
It's been great. The main thing this gadget has provided, which the others haven't, is mobile internet. It does this well. However, either gadgets have changed, or I have, as I've found the organiser and notepad features very useful! I guess I've got more meetings and to-dos than I ever used to.
Why'd I get the old 3GS? Basically, I enjoy living on the trailing edge of technology. Every upgrade I do has the thrill of discovering the new features, just like everyone else, except the things I upgrade to are bedded down and reliable, and much cheaper than they would have been otherwise!
So, now that I own an iPhone, I can see what Apple are up to. The app store may be horribly closed, but their vision is compelling - they've managed something Microsoft never did, making their customers want a monopoly. Having skipped the painful early stages, I can also see why Apple held off on 'easy' things like multitasking and cut-and-paste for so long. The iPhone was their way of completely redefining the graphical user interface - challenging UI conventions that had basically been standardised on all platforms since Apple introduced them in the mid-'80s.
It's weird to see this 'phone interface', knowing it's also the basis for the iPad and the features are now making their way into OS X in Lion. Rolling out the features incrementally was Apple's way of making the world used to their new plans, and making sure they got it right, without assuming too much of the old ways. The fact that the supplied manual is really one sheet of paper, and the whole design is both so innovative and intuitive is quite amazing.
Permalink. Posted 05:01, Mon, 14 Nov 2011.
Retro-gaming Update: Chimera, AGC and Christminster
A while ago I had some nostalgia for a proper ZX Spectrum isometric game. So, I didn't do anything sensible, like play a classic such as Head Over Heels. No, I got a copy of Chimera, a budget label isometric game I played over 20 years ago. I never got very far with it, but that's probably because I was very young, right? I could beat it now!
As I have limited time nowadays, even with arbitrary save/restore, I felt I didn't have quite the edge to get through it and see how it panned out without a lot of pain, so I found a map and walkthrough. I followed it. It was clear that there would have been a lot of tedious exploration otherwise, creating my own map. It also soon became clear that the game design consisted of traversing a the longest routes the game designer could make, repeatedly. There are no actual moving baddies. Failure at the trial-and-error necessary to work out what does what leads to instant death and the going right back to the start. Fundamentally, the game is a race against the clock, but different events speed up the clock. For example, being in a room with a radiator seems to speed up the water clock something like a hundred-fold, thus totally unbalancing the game. It was so poorly set up that even with the walkthrough and map, and saving and restoring to optimise things, I still somehow managed to end up running out on the timers, and needing to use some pokes to complete the game.
Apparently I hadn't failed to complete the game because I was young. I failed to complete it because it was badly designed and tedious rubbish.
More rewarding have been the text adventures. After playing some commercial titles, I thought I'd try a couple of the more popular amateur efforts. I made my way through the Adventure Consumer's Guide, which was surprisingly fun. You get a sidekick who is vital to many of the problems, but also acts as a subtle hints mechanism. Mostly, however, the hints aren't necessary, since it's a fairly friendly game. As I said, lots of fun, and recommended.
Christminster is a bit of a classic in the 'IF' (interactive fiction) world. I'd tried to get into it a couple of times before, but never really clicked. The start, being a linear and not particularly helpful set of puzzles, had previously blocked me from exploring the game. So, I swallowed my pride and looked at a couple of hints. This allowed me into the main section of the game, and also gave me an insight into the author's mind, useful for solving following puzzles.
While there are plenty of puzzle components, and you're not lead round on rails, there is something of a plot to it, which makes a nice change from wondering around looking for a relatively unmotivated puzzle to solve. I'm now about halfway through (judging by my score), without having used any more hints, although I'm at a bit of an impasse now. Something to plug away at from time to time, I guess.
Permalink. Posted 23:01, Mon, 26 Sep 2011.
Pushchairs and geodesics
Maths abuse alert: All the following is done intuitively, and I've completely failed to formalise it, which is a bit embarassing given I've read a book on differential geometry.
It occurred to me the other day while wheeling the children to Greenwich Park, over the somewhat unever ground, that pushchairs try to follow geodesics. That is, if one of the wheels is taking a longer path, the pushchair will tend to turn away from that side. So, locally, it seeks a shorter path, and in the limit it'll run along geodesics.
Another way of looking it is that the pushchair will run along a path such that each side wheel runs along the same distance. So, along the path the derivative of distance stuff perpendicular to the path is zero, so that said distance stuff is locally maximised or minimised (and I guess we're looking at a minimum).
This in turn helped me understand a Feynmanism. Feynman said that light takes the fastest route between points, but I never really got why this should fit with the other formulations of the behaviour of light. However, I now see this is rather like the case with pushchairs. In refraction, light changes direction as it goes through media with different speeds of light, so that the wavefronts match up across the interfaces (GCSE description of refraction). In other words, it makes the light turn towards the faster medium, and away from the slower medium. And as long as it dos this in the appropriate way (which it does), once again it'll follow a geodesic. Lightbulb appears above my head.
Permalink. Posted 14:04, Sun, 25 Sep 2011.
On the Creation of Sentient Creatures
I watch far too much children's TV nowadays. This is a side effect of having children, I guess. There is a programme called 'Get Squiggling!', feature the character Squiglet, who can draw things and bring them to life, and have adventures with them. This includes drawing people, bringing them to life, and creating new sentient creatures out of nothing with no higher justification required than her own enjoyment. No serious thoughts about the implications, no ethics committee, no nothing.
I mean... I know it's a children's programme, and not meant to be taken deeply, but... surely the philosophical implications are huge?
Could you imagine the real-world consequences of people just being able to create new sentient, conscious beings without any real oversight? What a strange world that'd be...
Then I realised that's precisely the world we live in (and indeed the reason we have children's TV).
Permalink. Posted 13:29, Sun, 11 Sep 2011.
Choosing an Open Source Library
When developing code, you often want to glue in third-party components. If you're writing commercial software, and are pulling in commercial components, it's traditional for the manager deciding to be motivated by Fear or Greed. That is, to either buy from the biggest vendor (No-one Got Fired For Buying IBM), or the one offering the biggest freebies and kickbacks (a nice meal out with the vendors, you say?). Often the two factors point in the same direction, making the choice especially easy. The actual quality of the software can be completely irrelevant - it's basically marketing-driven.
When pulling in open source software, you can be properly objective, right?
So, what do you look at? You download the various alternatives. Read their docs, look at their APIs, build them and prototype with them. Perhaps look at the source to see what the quality's like. If one has a slight technical advantage, you'd probably go with that?
Wrong choice! You actually still want to go for something a little more like the PHB approach. Technical quality and appropriateness is still vital, but so is the community surrounding the software. You could argue that how they deal with bugs and support are parts of the technical quality, but that's missing the point. If you have to choose between two similar packages, don't choose the one with the minor technical edge, choose the one that'll be around in a few years' time!
What determines this? The momentum. Choose the one which is going places, with lots of happy contributing users, ongoing improvements, and fundamentally where the developers understand marketing. You're not using the product because you're a fool who is easily swayed by marketing, but because others are. It's a networking effect, and you want to be using the winning product.
As a clever developer, it's really cool to find an underdog project that's technically better than the obvious choice. It marks you out as part of the cognescenti. This can even be made to work - if it's technically much better, and you can support it yourself if it goes nowhere, or it's an underdog with clear positive momentum, it can be a good choice. Most of the time though, even in open source, the pure technical choice may not be the right one.
Permalink. Posted 12:55, Sun, 11 Sep 2011.
The English Riots: Business as Usual?
I have an amazing ability for waiting until a topic is no longer trendy before actually getting around to writing my thoughts on it. In this case, the riots in the UK. This means that reality turned out to match my thesis, even before I wrote it down; namely, that the riots didn't actually mean anything on a social or political scale. Having started with my conclusions, I shall move back to my reasoning...
I started thinking about how much it really meant when seeing reporting (a lot of it international) about how shocking this rioting was, and how it was revealing something fundamentally rotten in UK society, and should give those running the country pause to think. I even saw some comparisons with events in Egypt etc. (definite high clue quotient there). On the other hand, I'd also seen plenty of people writing it off as mindless thuggery, which is also a bit too convenient and stops any need to try to understand them. What's the actual situation?
First off, I'm ignoring the looting. That's not about the poor or underpriveleged. That's about a group of people where lots of people are clearly breaking the law, getting away with it, and gaining from that. Locally, different social norms start to apply, as following the law puts you at a disadvantage for no 'fair' reason. See, e.g. MPs' (MsP's?) expenses, phone hacking, etc. for other non-disadvantged groups doing similar things.
So, how many people were rioting, ignoring the looting? My opinion would be somewhat more convincing if backed up by real numbers, but it's clearly a tiny fraction of the communities, and looks not entirely dissimilar to the fraction of the population with a propensity to crime and violence anyway. There will also be some people carried along because they're normally only held back by fear of justice, and felt they could get away with it this time.
Put another way, the riots looked to me like people's normal behaviour, only condensed. Indeed, if you look at the amount of crime that was committed, I don't think you need to spread it out over an awfully long period before it blends into the overall crime level. It's certainly lost in the noise on the 30 year scale we seem to have between riots.
In summary, I don't think these riots mean nothing, they just tell us as much about the problems of society as our normal background level of crime and violence. Moreover, it's a real shame to focus on these riots when there have been so many much larger protests motivated by real issues. *sigh*
Permalink. Posted 12:15, Sun, 11 Sep 2011.
Oh, the Horror: Twitter
First, a couple of disclaimers:
- I understand the nature of start-ups. You quickly write rubbish code, because having something running is much more important than anything else. Moreover, the ideas are far more important than the technology details, and if you're successful you'll have plenty of time to rewrite later, right?
- This is Twitter Of Old. I'm pretty certain they've hired some people with actual technological skill and have real scalability now.
So, without further ado, I looked at some old presentations on Twitter's implementation and scalability. MY EYES! THE GOGGLES, THEY DO NOTHING! Let us remember that Twitter is a service so reliable that its most memorable image is the Fail Whale that appears every time it goes down...
So, in a presentation in the late 2000s, post-optimisation, they were handling 600 requests per second on 8 Sun boxes. That's 75 messages per box per second. Er, great. We best not talk about e.g. the performance expected out of IRC servers, a decade earlier.
The real killer for me was this quote:
For us, it's really about scaling horizontally - to that end, Rails and Ruby haven't been stumbling blocks, compared to any other language or framework. The performance boosts associated with language would give us a 10-20% improvement, but thanks to architectural changes that Ruby and Rails happily accommodated, Twitter is 10000% faster than it was in January.
This sounds really cool, until you realise that IN JANUARY, THEY WERE PROCESSING 3 REQUESTS PER SECOND. I've seen shell scripts with better performance, and they're all multi-millionaires now.
Not that I'm bitter. ;)
Permalink. Posted 14:30, Wed, 24 Aug 2011.
Simple Programming Advice: Put Examples in Comments
Just a quick note encapsulating some simple advice I'd only discovered recently: Put examples in your comments. Comments should add something to the code. Explaining what the code should do is good, how is bad, since the code itself should say how.
There's a school of thought that says that comments are dangerous, since they easily get out of sync with the code, becoming dangerous lies. So, don't comment, and make your code readable instead. I certainly have some sympathy with this viewpoint, but I think it's naive. For a sufficiently complicated algorithm, each step may be clear and well-named, but the overall structure and intention may still be mysterious. Very little code is designed to be readable by someone without an awful lot of context, and I think this is a real shame.
So, for comments to add value they should do something other than what the code does. A terse and formal description of what the function should do is well and good, but can sometimes leave the reader confused. By illustrating the same thing in another way, through some examples, you've got a much better chance of having the reader understand what you're up to.
The trendy programmer will point out that you can move the formal description of the function out into a set of pre- and post-conditions, and you can convert your examples into test cases. By doing this, you can ensure that your 'comments' never become out of sync with your code. In theory, yes. In practice, the Test Driven Development book made incredibly heavy weather of developing even the simplest of functionality, and I worry about how it scales up.
One day, maybe. In the meantime, comments explaining 'what' plus examples, please.
Permalink. Posted 12:22, Wed, 24 Aug 2011.
Programming "Experts"
Browsing Amazon's recommendations the other day, I saw "97 Things Every Programmer Should Know: Collective Wisdom from the Experts" by Kevlin Henney. Oooh, I thought, someone's managed to glue together a selection of blog posts (97 tips in one book aren't exactly going to be deep). As Amazon has "Look inside!", I did, and skimmed the table of contents...
I recognised several names. No, it's not a star-studded line-up. They've just contracted at my workplace. This is where the definition of "expert" comes in. They are, indeed, experts at discussing programming. They spend a lot of time at conferences, and networking with each other. They may know some language lawyering tricks. When it came to actual programming (or even design) I saw nothing setting them apart from most other seasoned programmers.
This is unfortunate, as in other domains I've met what I would regard as real experts. People who not only understand a subject backwards, but also have great and incisive ideas. This covers both proper academics and really effective business people. Those guys are a pale shadow of this.
This is not to denigrate the book - I suspect there are plenty of useful, non-controversial pieces of wisdom that deserve to be spread around and some of the contributors may deserve the title "expert". I guess I'm just complaining about my own naivety.
Permalink. Posted 20:01, Sat, 21 May 2011.
On symmetry-breaking, logic, perception and AI
One of the things I was thinking about recently was our perception of symmetries. For example, formal logics tend to have a symmetry between true/false, and/or, etc. Formally true and false aren't particularly distinguished, but they certainly have different meanings to us! This is one of those things that makes me suspicious of using formal logic as an underpinning of AI.
Are there similar isometries for natural languages? Could you switch other important concepts around, and a document remain just as consistent and useful? I don't think there is much of this. Why? Natural language is glued to the real world in a way formal languages aren't. When it comes down to it, language is bolted onto us as animals - there can't be the same symmetries amongst concepts that control life and death. In other words, perhaps a good way of creating an AI is to, say, base it around a program that wants to be happy! :p
Anyway, there are some symmetries in the real world, not just in language, but in our perception of the world itself. For example, I may perceive the colour red to look like what you think of the colour blue, and vice versa. As long as we're used to the colour schemes of the world, and have a consistent naming convention, we'll never know. Moreover, if my left-right perception of the real world were reversed, I'd have no way to tell. This one seems more odd to me, as there is (arguably!) an orientation to the real world, but I can never know if I'm seeing it correctly.
Despite these mappings of the real world to our perceptions being arbitrary, they're so taken for granted that people hardly notice they're there. What is the symmetry-breaking mechanism here? Is it really arbitrary? Are these different mappings truely isomorphic, or do the variations change our perception of the world? I have no idea!
Permalink. Posted 20:52, Tue, 17 May 2011.
Miranda Elizabeth Patricia Frankau
Miranda Elizabeth Patricia Frankau was born on Friday 6 May 2011 at 16:32, weighing 2950g (6 pounds 8 ounces). Mother and daughter are doing well, and the whole family are overjoyed. Yay.
Permalink. Posted 08:48, Tue, 17 May 2011.
Differential geometry, category theory and typing
NB: This entry involves hand-wavey maths-abuse.
I've been reading a book on differential geometry. It's the first maths book I've read (other than an introduction to category theory, which is kinda cheating) that actually has commutivity diagrams. I didn't really see why category theory might have come out of geometry, but it does seem far more understandable, having seen all these maps between different spaces, and mathematical equivalents of 'if you do this in over there, it's just like doing that over here'.
So, geometry's given category theory to computer science. I'd love it if computer science could give really clear types and anonymous functions to geometry. For example, what actually is a differential form? To keep it simple, let's look at a differential 1-form... these are standalone things like 'dx'. A-level maths said that a 'dx' isn't an actual thing, it's just a bit of notational convention in an integral, be careful, etc. At this level, it's made into a thing...
So, what is it? Let's say we have a manifold U. At each point of U we have a tangent space, which is a vector space of all the tangents at a particular point. How do we tie this tangent space to the underlying manifold? We can make it the space of functions which take derivatives in particular directions - i.e. the things normally written 'd/dx', for some direction x. That is, it's a vector space of functions from points to real values to points to real values. In Haskell type terms, this is a space of (Point -> Real) -> (Point -> Real). Given each point in U has its own tangent space, the overall type is Point -> (Point -> Real) -> (Point -> Real). Of course, in the text the parameters and abstractions and all the rest of it are done in implicit mathsy form, which I think makes things a little more obscure for no good reason.
What's a differential 1-form? It's a member of the cotangent spaces - the dual of the tangent spaces. Does this mean it's of type (Point -> (Point -> Real) -> (Point -> Real)) -> Whatever? No, it's actually a function from point to member of the vector space that is the dual of the tangent space at that particular point. In other words, this whole dual business is being done in a lifted world, if you were writing Haskell. The type is Point -> (Point -> (Point -> Real) -> (Point -> Real)) -> Whatever. What is 'Whatever'? Well, it's the scalar type of the vector space we're looking at, so for a fairly dull manifold, it'll be Real: Point -> (Point -> (Point -> Real) -> (Point -> Real)) -> Real. Still pretty impenetrable, but perhaps the start of an explanation.
What is 'dx' supposed to be? It's supposed to be the thing that when given a member of the tangent space projects out the d/dx component of it. How can we do that? Well, I guess we can just make a function which is linearly increasing in the direction of interest, but is otherwise flat, feed it into our derivative-taking function, and then evaluate it at an arbitrary point (the resulting function should be the constant function, for a given tangent space):
diff_1_form point tangent_space =
tangent_space_elt my_fn zero
where
tangent_space_elt = tangent_space point
my_fn = -- Function linearly increasing in our preferred direction
zero = -- Zero element of vector space
Other function could be substituted in, I guess. Is this correct and accurate, or have I missed something? I'm not sure. For all the formal notation, I'm still not sure if I've extracted the true meaning correctly!
Update! More re-reading reveals further details. The smoothly varying tangent space thing is called a vector field, and the point used to select the derivative-taking functor from the tangent space is the same point used to evaluate it. That is, the overall type is, depending on how you look at it, (Point -> Real) -> (Point -> Real), or Point -> (Point -> Real) -> Real - that is, you can look at it as a thing that transforms functions of the form (Point -> Real), or as, for each point in the manifold, a thing that maps functions to real numbers (specifically, the derivative of that function in a particular direction at that point).
Interestingly, the second type means that the tangent space is effectively over certain functions of the type '(Point -> Real) -> Real', which already looks like the dual of some vector space of functions of type 'Point -> Real'. Sticking together the element of the tangent space with the cotangent space is then basically the idea of passing this function to the tangent space functor. Indeed, the definition of the differential operator 'd' is exactly that:
Given a function f : Point -> Real, and a vector field X : (Point -> Real) -> (Point -> Real), d : (Point -> Real) -> ((Point -> Real) -> (Point -> Real)) -> (Point -> Real) is the surprisingly obvious ((d f) . X) x = X(f). (No, I'm still not sure I've got it exactly right!) For things beyond 1-forms it's more complicated.
Permalink. Posted 12:08, Fri, 22 Apr 2011.
The Universe as a Fixed Point
More random silliness. The common model of a the universe at a point of time seems to be based on the prevalent technology. So, in the mechanical past the universe is 'like clockwork'. Nowadays there is a bit more thought of the universe algorithmically - leading to things like Tiplerian nuttiness, The Matrix and Wolfram's New Kind of Science (I think - I've not read that tome).
A more general view is of the universe as the solution of equations. In some ways, this is not entirely dissimilar to the algorithmic view. For example, you can solve a PDE abstractly, or treat it as some kind of grid-based evolution of state. However, the equation solution approach is more powerful.
For example, you can be very silly, and extend the Tipler view. If the universe has infinite computational capability, it can simulate, say, the universe. Moreover, the universe we're in could be the result of some simulation. But we can go further than that, and have the universe simulate itself. That is, have the universe be a simulation running inside itself. This is obviously rubbish, if you treat the universe as a naive algorithm.
Another way of looking at it, though, is as a Y combinator attached to the world. That is, the universe is the fixed point of an equation of 'things that simulate themselves'. It simulates itself into existence. It's turtles all the way down. :p
Permalink. Posted 22:13, Wed, 13 Apr 2011.
Optimal memory reallocation
If anything is to demonstrate why I'm unsuited to blog-writing, it's the fact that I've been planning to write this post for about five years, and haven't done so because it's trivial!
When doing auto-resizing contiguous storage (e.g. C++'s std::vector), the normal growth mechanism is resizing by a multiplicative factor - e.g. doubling the size each time. My boss of 5 years ago pointed out that doubling is probably a bad factor. Imagine you have some kind of zero-overhead allocator, and this is the only allocation going on in your address space. The first allocation takes 1 unit, then 2 units, 4 units, etc. The next allocation takes 2^n units, but the sum of allocations so far take (2^n)-1 units. In other words, the new allocation must be placed in a fresh part of the address space, rather than recycling previously-used-and-now-free space. This can be viewed as rather wasteful (as such previous allocations are unlikely to be returned to the OS).
This got me thinking as to what is the optimal allocation resizing factor, in this rather simplified view of the world? We continue to assume multiplicative scaling. The array first takes 1 unit, the first resizing x units. The next resize can't go into the 1 unit space, so we tack x^2 on the end. However, we may be able to fit the next allocation (x^3) into the freed (1+x). Then we free the x^2, and x^4 goes in its place. x^5 is after that, and again we can hope to put the x^6 into the space previously occupied by (x^3 + x^4).
What x is appropriate? In both cases, the equation boils down to 1 + x = x^3. As I am lazy, I fed this into Wolfram Alpha rather than solve it myself. The solution by radicals looks ugly, so suffice to say the result is around 1.32. This is one of those cases where an approximation to e.g. 4/3 would be pretty pessimal, as the allocation then wouldn't fall into the pattern we're aiming for. Given the overhead of allocation, you're probably better off with a growth factor of 1.3 or maybe even 1.25.
At a scaling factor like that, you're going to be spending a lot of time copying data. The reallocation overhead may be a constant factor compared to getting the array size right in the first place, but that constant will be big.
What about if we're willing to do an extra step before we can start recycling memory? The real root of 1+x+x^2 = x^4 is just over 1.46. That's somewhat better. And if we're willing to do more and more steps, what is the limiting factor if we wish to eventually be able to fit the newly allocated array into some previously allocated memory? Why, the golden ratio, of course! (The limit is not a factor of 2, as it might look naively, since you have to keep the previous allocation until copying is complete)
All of this is somewhat silly and academic - in a real environment a growth factor of 2 is probably pretty reasonable in terms of avoiding unnecessary copying. On the other hand, from this we see that a growth factor of 1.5 might not be an unreasonable trade-off, in terms of reusing existing memory if you've got an appropriate allocator. A fun little toy.
Permalink. Posted 22:28, Tue, 12 Apr 2011.
Toddlers play Text Adventures
One of the odd effects of playing too much adventure games is that you can start to look at the world in odd ways. For example, it now seems to me that toddlers treat the world as a text adventure. David's common approach is to go through a think-do cycle, where the 'do' is generally a simple verb-noun phrase. Moreover, it genuinely does appear to be mostly adventure-game-like: Drop ball. Look. Get book. Examine book. Give book to daddy.
I guess it makes sense. After all, they're wandering around a strange world, where they don't know the rules, trying to make sense of it all. They have a limited interface to the world (a small vocabulary, both literally, and conceptually). As with most text adventures, communications with other characters is difficult.
So, my thesis is that randomly wandering around the house, moving objects about apparently aimlessly and doing wierd things isn't actually babyish at all! If an adult were dumped in a sufficiently alien environment, they could do no better....
Permalink. Posted 23:04, Sat, 12 Feb 2011.
Guild of Thieves
Having played Jinxster, I thought I'd have a go at another Magnetic Scrolls game. I tried to get into The Pawn, their earliest game, but got pretty much nowhere - it's lumpy and uneven. I'll probably complete it with a walkthrough for completeness' sake, but it's amazing how they developed in terms of playability. Guild of Thieves, being between the two in age, has a good pile of very reasonable puzzles, plus some really rough playability edges.
I've played it rather slowly and patchily, getting stuck and unstuck. One ofthe problems is realising when you've reached the endgame section. You have to keep checking if the bank is open now, so I'd been ready to start that section of the game for ages before realising I could! This is not made clear from the scoring system, which is pretty patchy, or the presence of a number of red herrings.
So, I resorted to a walkthrough to see what I'd been missing, only to discover... nothing up to the point I'd reached! Still, I was stuck on the coloured dice problem, and the banker's door, and getting impatient to finish, so I took hints to complete them. Not quite as even-handed as Jinxster, but pretty fair if you overlook a couple of flaws.
What next? A bit of Infocom?
Permalink. Posted 07:54, Sat, 12 Feb 2011.
Debugging is Computer Science
An awful lot of computer science has nothing to do with science - it's rather more like applied discrete maths. Oxford call their undergrad degree 'computation', and I have rather a lot of sympathy with this approach.
There are some parts of the subject which really are more like science, with experiments and stuff. Good systems design, for example, where benchmarks are created and implementations profiled, has a lot in common with setting up a lab experiment. However, this stuff rarely makes it into an undergrad course.
This perhaps explains why so few people seem to be able to debug properly, for debugging is science. Actually, it's worse than that. In science, you build a hypothesis, and then create an experiment that tries to disprove your hypothesis - if you can't, you have more confidence in your hypothesis. In other words, you start off with a consistent view of the world, and try to construct tests to extend your understanding. With debugging, your view of the world ('This should work') doesn't coincide with the reality ('It doesn't'). You have to design experiments to clarify where your view of reality differs from reality.
In short, in science you're testing hypotheses, in debugging you're testing assumptions. Either way, a scientific mindset is vital.
Permalink. Posted 22:08, Fri, 21 Jan 2011.
Happy Birthday David!
I don't tend to a) update this regularly, and b) write too much family-personal stuff. So, I shall belatedly state the obvious: David is now over a year old. Nearer 15 months now, in fact. It's been absolutely fascinating (and rather lovely) to watch him grow and learn. Observing him, it does make me rather suspect that all AI researchers have been barking up the wrong tree, but that's another story....
Permalink. Posted 21:39, Thu, 20 Jan 2011.
Mail me at random.user@arbitrary.name.