Thesis Stories Ep 3: Research is Hard!
Alhamdulilah I completed my thesis about three weeks ago; if you’re interested, you can check out my thesis and presentation. Looking back at the two years I spent at MASDAR, I have a couple of thoughts: Alhamdulilah I learnt a lot, met a couple of wonderful people and matured significantly. There were a couple of not-so-pleasant experiences too but I believe I emerged stronger ultimately.
So I switched to the complexity analysis of road networks after my stack overflow adventure ended unsuccessfully. It was a fresh start but I had no alternative since I wanted to graduate. In the end, I defended all my hard work in about 75 mins – imagine! Nearly 6 months of work translating into just 75 mins!!
Research is difficult! As difficult as any other endeavor; I think most researchers don’t know how their efforts will turn out (as most start-ups do at the beginning too). There is usually some hunch about a model, some experiments and then eventually they have to figure out what the ‘right’ result is. Also, ‘big data’ appears to be fun and cool but it requires unbelievable and prodigious amounts of grunt work.
I built JIZNA, a custom Python framework for complexity analysis. JIZNA can parse openstreetmaps XML dumps of cities (the parser was an open-source utility I found and modified), create dual graphs of these networks, merge discrete roads, exclude outliers and calculate the desired metrics. These metrics were used to predict how difficult it would be to search the city. The JIZNA platform is available here.
The Cool Stuff
I think I wrote much better code: the framework was modular, nicely designed and flexible; I was able to write some really cool algorithms for the complex computations and I learnt how to use Sphinx, the Python documentation tool. Sphinx, in my opinion, is a lovely tool once you grasp its basics.
The Not-so-Cool
I got a couple of interesting results however I think they were not so so spectacular (which is quite disappointing given the amount of time, determination and effort I invested). I guess further work would reveal some new insights.
I had to throw away some of my code (a complete simulation framework had to be discarded when the approach changed) and my writing (again! This is the umpteenth time I’d be chopping off my writing).
So what did I learn? Lots more Python, algorithms, software design, documentation, writing, latex, vim and some maths (mostly matrix algebra). However, more importantly, I came to appreciate the value of grit, determination and perseverance while working towards goals. Don’t ever give up, even if all appears to be lost.
Next plans? I don’t quite know fully yet; one thing for sure: research is hard!
Did you like this post? Check out my other posts on Grad School.
For Devs only
I try to do less to achieve more – it is good; it makes me do my job faster and more easily; you should do so too. Automate, use shortcuts, innovate; well the initial investment might take a lot of time but it’s something you will be glad you did.
You can learn a lot by just exploring the tools you use daily. Just try to find a simpler way: some previously unknown feature in your favourite IDE, a macro to allow you repeat a series of actions or just something that saves you some keystrokes. So, I decided to post a couple of tips; I do hope you find them cool and pick up one or two new ideas.
Warning: Some of these work on linux only.
1. Bash Shortcuts
I stumbled upon these while going across someone’s .bashrc file and it stuck me as awesome; yes it is plain awesome. For example, accessing my /var/www/my-web-project folder (and I do access it often) is a pain; why do I have to type cd /var/www/my-web-project all the time? Solution? Just add the following shortcut to your .bashrc:
alias myproj=’cd /var/www/my-web-project’
and that’s it; I just type myproj in bash and am routed to that location. And yes, it doesn’t matter what directory I am in.
2. Bookmarks
Bookmarks are cool, I think they are absolutely essential. Now, if you don’t know how to use them or don’t use them in your IDE/Editor; stop reading and find how to use this wondrous time saver. I use bookmarks a lot in vim and Netbeans (and all other editors if possible); this allows me to jump around and move around. Yes, find out how to use bookmarks and use them!
Talking about vim; I think it’s really awesome and allows you to effortlessly edit anything that is text: I use it for code, latex, writing anything everything. At times, it might be easier to use a less powerful editor however most times, it does all and really cool too. And if you’re a ‘vimmer’, then try out these plugins: nerdtree, nerdcommenter, syntastic and latexbox (if you write in latex).
3. Debugger Statement (JavaScript)
I picked up this while watching an EmberJS tutorial and it has become a major part of my arsenal. Maybe I overuse it (well, I have to write so much JavaScript nowadays). Just put a debugger statement in any part of your JavaScript code and fire up that page in your browser; whenever execution gets to that point, it’ll automatically stop. I use this virtually everytime nowadays and for JavaScript devs; it’s a must use. Another cool JS tip involves using the console object; I use console.log too often but there are a couple of other cool functions like console.trace(), console.dir() etc.
4. Picking an object with a particular Characteristic
I picked up this trick while working in Python on my thesis: I often needed to pick up objects with certain properties out of long lists of objects. The interpreter is essential and this is what I do:
for obj in objectList:
if obj.ppty == criteria:
break
obj
Obj is now equal to my desired object.
5. Testing for -1 (Nice Discovery from SO)
I picked up this explanation from stack overflow and couldn’t help sharing it. I do hope you wouldn’t write code like this but it is always great to gain some insight into how things work.
Assuming you want to test if something is equal to -1; in 2′s complement systems, the representation of -1 is 11111111 (for whatever number of bits the system supports). Just take the bitwise inverse of that number and all you get is zeros. Now for the cool part, zero is a falsy value which can be used in your tests. Here is the link.
Hope you picked up something or the other from this; if you did; do share it with someone who might pick a point or another from it.
Did you like this post? Try out some of my other posts:
Static and Instance Methods in JavaScript
I thought I quite ‘understood’ inheritance in JavaScript until I got flummoxed while trying to test my understanding. The JS prototypical inheritance model is hugely different from the classical approaches of the languages I started out with; the only way to fix this that I know of is by writing code and after spending hours screaming at my console I finally saw the light Alhamdulilah.
JS is a weakly-typed prototypical language and thus classes aren’t really ‘classes’; instead they are functions which are in turn objects. New objects are created from constructor functions by using the new keyword and this allows you to kind of ‘simulate’ OOP. But mind you; its inheritance model is still different.
Some sample code that shows this difference between static and instance properties. All object properties are public although can easily make them private by declaring them with var (I added an example); for these private properties you’ll have to add accessors and setters; read this for an explanation of closures and this pattern.
It’s interesting to see that static fields cannot be accessed from child context in JavaScript (which makes sense); Java, however, allows static fields to be accessed from object context; this is kinda weird as the objects don’t really ‘have’ that variable.
Here’s a quote from the Java Tutorials website:
Note: You can also refer to static fields with an object reference like
myBike.numberOfBicyclesbut this is discouraged because it does not make it clear that they are class variables.
One tip: don’t just read code and assume you understand it. Sure, you can always convince yourself that you ‘know’ it. However, if you really want to KNOW it then read it, write it and then try extend it (if you have the time and enthusiasm).
So, what other differences do you know between classical and prototypical inheritance?
Related articles
- The language series: JavaScript (abdulapopoola.wordpress.com)
- JavaScript, functional programming aspects
Design Patterns: PubSub Explained
I actually wanted to write about PubSub alone: it’s a fascinating design pattern to me however, the thought occurred to me, why not write a design patterns’ series? It’ll be good knowledge for me and give good information. So here goes the first: PubSub.
Introduction
The pattern works using a middleman; an agent that bridges publishers and subscribers. Publishers are the objects that fire events when they are done doing some processing and subscribers are objects who wish to be notified when the publisher is done – i.e. they are interested in the work of publishers.
A good example is that of a radio station where people tune in to their favourite programs. The publisher has no knowledge of the subscribers or what programs they are listening to; he only needs to publish his program. Subscribers too have no way of knowing what goes on during program production; when their favourite program is running, they can respond by tuning in or informing a friend.
PubSub achieves very loose coupling: instead of looking for ways to link up two discrete systems; you can have one hand off messages and have the second part consume these messages.
Advantages
- Loose coupling
Publishers do not need to know about the number of subscribers, what topics a subscriber is listening to or how subscribers work; they can work independently and this allows you to develop both separately without worrying about ripple effects, state or implementation.
- Scalability
PubSub allows systems to scale however it buckles under load.
- Cleaner Design
To make the best use of PubSub, you have to think deeply about how the various components will interact and this usually leads to a clean design because of the emphasis on decoupling and looseness.
- Flexibility
You don’t need to worry about how the various parts will fit; just make sure they agree to the one contract or the other i.e. publisher or subscriber.
- Easy Testing
You can easily figure out if a publisher or subscriber is getting the wrong messages.
Disadvantages
PubSub’s greatest strength – decoupling – is also its biggest disadvantage.
- The middleman might not notify the system of message delivery status; so there is no way to know of failed or successful deliveries. Tighter coupling is needed to guarantee this.
- Publishers have no knowledge of the status of the subscriber and vice versa. How can you be sure everything is alright on the other end? You never can say…
- As the number of subscribers and publishers increase, the increasing number of messages being exchanged leads to instabilities in this architecture; it buckles under load.
- Intruders (malicious publishers) can invade the system and breach it; this can lead to bad messages being published and subscribers having access to messages that they shouldn’t normally receive.
- Update relationships between subscribers and publishers can be a thorny issue – afterall they don’t know about each other.
- The need for a middleman/broker, message specification and participant rules adds some more complexity to the system.
Conclusion
There are no silver bullets but this is one excellent way of designing loosely coupled systems. The same concepts drive RSS, Atom and PubSubHubbub.
PubSub Example (JavaScript)
Taking the PAIN out of coding
Over the years, I have learnt some tricks and picked up some lessons while writing code. Most were learnt the hard way, so I decided to share a couple of tips on how to avoid development pitfalls.
Meticulous Planning and Design
One of the most difficult lessons I learnt in software development was not to rush into code; I used to jump impulsively into software projects and start hacking away at code without planning fully. And as you can bet, the thrill of coding soon evaporated when I got bogged down by messy code. Sadly, many projects of mine met their nemesis this way.
Now, I am just too lazy or maybe too battle-scared to do that; I mostly write out a high level system design document (usually a single page or two) describing all the major features. Next, I run through everything to know that the various components and interfaces are logically valid and try the edge cases. Only when I am satisfied with this do I start writing code.
Alhamdulilah, I think I write cleaner, more modular and better designed code this way. For example, I recently had to extend an experimental framework I wrote a couple of months back; surprisingly I was able to make all major changes in less than two hours. Better still, nothing broke when I ran the framework again!
A dev lead once told me coding is the easiest part of software development… I think I agree with him…
Do it fast and dirty, then clean up
I started using EmberJS last year for a web project. EmberJS is a really cool framework and reduces the amount of boilerplate code you have to write: it’s so powerful that some things seem magical. However, EmberJS has a really steep learning curve.
As usual, I was trying to write perfect code at my first attempt, did I succeed? Your guess is as good as mine. I got so frustrated that I started hating EmberJS, the project and everything remotely related to the project.
Before giving up, I decided to have one more go at it; my new approach involved ignoring all standards and good practices until I got something to work. And that was it, I soon got ‘something’ that looked like a working web application running. One day, while working on the ‘bad’ code, I had an epiphany. In a flash, I suddenly knew what I was doing wrong. Following standards and practices was relatively easy afterwards.
Looking back, I realize that if I was bent on doing it perfectly at the first go I most probably wouldn’t have gotten to this point. Oh by the way, EmberJS got a new release so my code is obsolete again.
Clean up the code from step 2 above X more times
This is a part of development I don’t think I really like but it is essential for maintenance. You have to go back through the code (yes, your code; you ain’t gonna make life miserable for the developer inheriting your codebase). Refactor all duplicated, extraneous and obscure pieces of code ruthlessly. Most importantly, improve the readability of the code (yes, readability is REALLY important – make it read like a good novel if possible à la Shakespeare or Dickens).
I also keep a running list of all the hacks I make as I go about writing code in step 2; this list comes in handy at this stage and enables me to go straight to the substandard code pieces and fix them up.
Use a consistent coding style
I recently noticed that my coding style was inconsistent across my projects: variables names were either under_score or camelCase while method declarations used brace-on-new-line and brace-on-same-line approaches.
The problem with this approach is that it breaks up my flow of thought and makes speed-reading code difficult. Now, I choose a single style and stick to it throughout a project – any style is fine provided I use it consistently.
Scientific Debugging
I came across the term ‘scientific debugging’ while blog-hopping and it has stuck in my subconsciousness ever since. Identifying bugs can be a challenge: for small projects, I just try to figure out where the bug might be and then check for this. However, this approach does not scale, I wouldn’t randomly guess on a 5000 line project.
Scientific debugging is a systematic process: you make hypotheses about the likely causes of the bug, make a list of places to check and then go through this systematically while eliminating entries. You’ll most probably find the bug with less effort and without running through the entire list.
Project Management
I rarely used to track how much time and effort I put into my projects; I would just code and code and code. Now I know better, I estimate how many hours I can put in before, during and after each project. I try to use Agile (although, I use a simple list + pomodoro too) for project planning, task management and effort estimation. It is now trivial looking up my project status: implemented features, issues list and proposed features.
Testing
I tried my hands at TDD last year and I felt it was just a way of doing too much work for coding. While I might be wrong about TDD, I think it’s essential to have a solid testing process in whatever project you’re doing.
Test, test and test: run the gamut (if possible): unit, integration, functional, stress, regression etc.
Enough said… I have dirty code to polish. If you did find some of the points useful, please share your thoughts and ideas.
Related Posts

