Saturday, 31 January 2009

The next steps for Gamblotron

A couple of weeks ago I bemoaned the lack of development of Gamblotron.  Whilst part of the problem is a lack of time, which still continues, there is also the issue of needing to make some big decisions and not wanting to make my mind up.  The first step, therefore, to get past this deadlock is to (finally!) make those decisions.  Decisions are something I have problems with; always have, always will, probably.

The status so far (which hasn't changed if you've read the previous description): the bulk of the core of the application works (monitoring markets, placing bets, collecting statistics, etc.), but the actual profitability is not predictable enough to be relied upon.  So far this is all pretty much as expected: I knew I was building an experimental system, and that's why I made the decisions I did (e.g. interchangeable strategy scripts), so that the strategies could be fine-tuned.  The unexpected problem which has become apparent is the sheer number of variables.  It's practically impossible to hand-tune such things, which is why I began to investigate optimisation algorithms (hill climbing, simulated annealing, etc.).  Gamblotron was never intended for such things, and as such my architectural compromises have caused significant performance issues, and my implementation technologies have been less than helpful at parallelising what I have so far.
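For anyone who hasn't met hill climbing before, here is a minimal sketch of the idea in Python.  It is not Gamblotron's actual code: `toy_profit` is a made-up stand-in for "run the strategy over recorded markets and total the profit", and the step size and iteration count are arbitrary assumptions.

```python
import random


def hill_climb(score, start, step=0.1, iterations=1000, seed=0):
    """Greedy hill climbing: nudge one variable at a time and keep the
    nudge only if the score improves."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    best = list(start)
    best_score = score(best)
    for _ in range(iterations):
        candidate = list(best)
        i = rng.randrange(len(candidate))         # pick one variable to perturb
        candidate[i] += rng.uniform(-step, step)  # small random nudge
        candidate_score = score(candidate)
        if candidate_score > best_score:          # accept improvements only
            best, best_score = candidate, candidate_score
    return best, best_score


def toy_profit(params):
    # Hypothetical objective: a smooth bowl whose peak (score 0) sits at
    # (1.0, -2.0, 0.5).  A real objective would replay a strategy instead.
    target = (1.0, -2.0, 0.5)
    return -sum((p - t) ** 2 for p, t in zip(params, target))


if __name__ == "__main__":
    params, profit = hill_climb(toy_profit, [0.0, 0.0, 0.0])
    print(params, profit)
```

Simulated annealing follows the same loop but occasionally accepts a worse candidate, which helps it escape local peaks.  Either way the cost is dominated by calls to the scoring function, which is exactly where a slow implementation hurts.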

Rewrite/refactor/patch/hack?  These are age-old questions of software development; the received wisdom is often contradictory, ranging from "plan to throw one away, you will anyway" to "never rewrite, it's more work than you think".  Personally, my own experience has led me to one conclusion, which sits almost squarely half-way: continuous improvement.

The idea of continuous improvement is simple: every time you make a change, make sure the code has slightly less crap in it than before.  Code rot occurs anyway: changing technologies/goals/priorities/etc. all inevitably leave loose ends which can suddenly turn into bugs, so you don't need to add any more with random hacking.  What this means in practice is: avoid change for change's sake, but when change comes along don't prat about destabilising your entire code-base trying to avoid the inevitable; take the hit there and then, change what you need to change, and refactor everything else to accept it.  Or at an even higher level: you should always be moving a codebase gradually (i.e. not so fast that everything breaks) to where you want it to be, rather than just sporadically reaching everywhere.

How can I apply that to Gamblotron?  It's still not easy.  Many of the current problems are caused by the implementation language: Python.  It ain't quick, and it don't do threading (not for CPU-bound work, anyway, thanks to the global interpreter lock); it was a bit daft of me trying to write Gamblotron in Python in the first place, really.  I chose it for good reasons, however:

  1. The conciseness of code snippets would allow idea interchanges in public forums (i.e. this blog).
  2. The scripting nature of it helps with my interchangeable strategies ideas.
But, of course, since then reality has intervened.  All this means that if I were starting from scratch now, I wouldn't choose Python; so, according to my law of continuous improvement, I should be porting the code base to a different language.  The question, then, is which?  The Python problems would also affect all other languages that exist at a similar level (i.e. non-JITed "dynamic" languages), so that rules out Ruby, JavaScript, and several hundred others.  The traditionally fast languages of C and C++ would be obvious contenders, but they won't help with the "interchangeable strategy" goals, unless I embed a Python interpreter (which I could do...).  This leaves the one-true-way of the modern high-performance VM-based language (i.e. Java).  It has its knockers, but Java is the only real contender in this sweet-spot of performance vs. ease-of-use.

This still leaves two questions:
  1. I still haven't answered the interchangeable strategy script idea.
  2. Isn't rebuilding the whole damned application going to be incredibly time consuming and/or likely to introduce bugs?
The answer to both questions, perhaps, lies in some of the other JVM languages.  How so?  Well, some of them, like Clojure, easily win on the expressiveness front and, inheriting from the Lisp school of thought, could be used to create a Gamblotron DSL.  Plus something new like that would be a learning experience, which would mitigate (to a certain extent) the rewriting aspect of my plan.  And re-writing would be no bad thing if I also took the time to create some unit tests; the current Gamblotron has a couple of known difficult-to-pin-down quirks, and a re-write plus unit tests could track those down quite easily.

Now it's all falling into place: re-writing in Clojure would be quite easy to break down into one-hour chunks.  Possibly.  Care still needs to be taken to avoid finding a new problem like before; instead of "seeing where it goes", I need to have a few pre-planned milestones and keep them in mind.  Right then, that can be the first batch of thinking.

1 comment:

  1. It might be a bit late for this, but...

    As I see it, you have two major issues: insufficient analysis of the problem and slow Python code.

    Your optimisation troubles result from the first issue and can't be solved by coding: with 11 variables and little data, you have an ill-constrained optimisation problem.

    The second issue doesn't require you to switch to another language. There are ways to speed up Python, e.g. using NumPy or, if that's not enough, Cython.
