The Singleton Pattern
I stumbled upon the singleton pattern while reading a PHP book and fell in love with its simplicity. I rushed to use it at every single opportunity I got (when you have a hammer, everything looks like a nail, right?) until I got tired of it; now I can’t really remember when I used it last. Why? Well, there has been no reason to. Singletons have been the topic of heated debates over time, some people even consider them to be an anti-pattern! But first, what is this pattern all about?
The Singleton pattern is a design pattern that ensures that only one object of a class exists. Singletons are best used when you need to have a single object control simultaneous access to shared resources by other objects. If this conditions are not met, you should consider another design.
So how do I create a Singleton?
Blosch in Effective Java listed these two ways: exposing a member variable and using a static factory.
Using a public member variable
public class Singleton {
public static final Singleton instance = new Singleton()
private Singleton() {…}
//other code
}
Using a static Factory
public class Singleton{
private static final Singleton instance = new Singleton()
private Singleton() {…}
public static Singleton getInstance(){
if (instance == null)
instance = new Singleton()
return instance
}
//other code
}
Why are Singletons used?
A good application of Singletons is in logging; you’ll probably want to log all events from the various classes in your project to a single location. So it is best to use a singleton to write, synchronize writes and close the file. They can also be used to read static configuration files at startup and provide all other classes with initialization details.
Great! So why the fuss about Singletons? Well, Singletons (like all other misuses occurring in software engineering) are not the problem; people are. When used by people without a strong grasp of OO concepts, they can turn into humongous monstrosities that cause unnecessary restrictions and bring in global state. Worse still, they’ll be used when they is absolutely no need for them or what they were not designed to do.
Enough talk about ugly code… Here are some of the issues:
- Memory Management: How does garbage collection occur? Well no one knows that the single instance has not been used for a long loooong loooooooooooong time. And how do you know if some remote class has a reference to the single instance?
- Singletons are hard to subclasss; it is pretty difficult to override the default initialization sequence.
- Worse still, try handling multithreaded singletons.
- It is difficult to test Singletons and they can cause code to become tightly coupled.
So you might thinking Singletons are evil now; no not quite. Singletons pros include:
- Singletons can implement interfaces and inherit from other classes.
- They can be lazy-loaded (most times they are); this is great for expensive resources.
- They can be easily converted into factories.
Should I use a Singleton?
I have only one thing to say; Learn about it well enough else you’ll be using a hammer to drive in a pin!
Data Structures for Programmers
A good understanding of data structures separates the master programmer from the noob. Some make it possible to get faster solutions to problems; and who doesn’t want faster code?
1. Heaps
In heaps, child nodes are always larger than their parents so the smallest element is at the root of the tree. (Max-heaps are the converse, child nodes are smaller than their parents and the largest element is at the root of the tree).
Heaps are good for operations that involve repeated removal/retrieval of a maximum/minimum value from a collection of elements. If you encounter these scenarios; you’ll do better to use a heap which gives you the desired max/min element (using max-heap/min-heap respectively) in O(logn) time. This is definitely better than the O(n) time you’ll get with other data structures like lists, sets and hash tables that require you to search through all elements.
Heaps are majorly used in graph algorithms (a famous example is Dijkstra’s), heapsort and priority queues. Want to implement your own heap? You can use arrays but watch out for issues when you want to dynamically increase the size of the heap.
2. Stacks and Queues
These are fundamental data structures and have been around for long. The stack’s distinctive characteristic is that the last element put into the stack is also the first to be popped off the stack. A queue is a queue; visualize a queue of people; yes that is it! Data is appended at the end and popped off the front.
The stack is referred to as Last In, First Out (LIFO) while the Queue is First In, First Out (FIFO) [Sounds similar to GIGO right?]. Applications include reversing collections, parsing, evaluation of prefix and postfix expressions, compilers and program counters, depth first search, breadth first search, etc.
Got the coding skills? Implement a stack using a queue or vice versa!
3. Hash Tables
Hash tables map keys to values and are excellent for retrieving values in constant time O(1); absolutely better than the O(n) for arrays/lists. The main advantage of hash tables is speed and they can be used to significantly improve the performance of many programs.
Well, hash tables are no silver bullets and have the downside too (increased computation, storage and handling of collisions in keys). To use them efficiently, you have to know how many items you need to store, how often you’ll need to update and search for items and the distribution of keys.
When you need to make 300 thousand lookups in a 100-item dataset every second on a 5MB processor; please use a hash table; don’t use an array.
4. Graphs and Trees
I used to think graphs were 2-D/3-D plots showing the relationships between some variables…
Graphs/Trees are sets of objects with relationships between objects represented as links; these objects are called nodes/vertices and the edges are called links. Trees differ from graphs in two ways: nodes in trees have at most one parent and trees do not contain cycles. As such, every tree can be said to be a graph although the converse does not hold.
There are lots of things that can be modeled as graphs in the real world. Social networks (I think Facebook is an undirected graph while Twitter is directed), road networks, Chess solvers, Sudoku solvers, finding the shortest paths between two points/cities, solving mazes, representing files and directories structure (trees) and lots of others.
I don’t like graphs because they tend to be tricky; I have to do all sorts of mental acrobatics trying to imagine all sorts of weird scenarios and expansions of the graph. Imagine trying to find cycles of a certain length in a graph, “mind bending it is…” (” Yoda’s voice”)
Did I miss anything? Arrays?! Well, you weren’t expecting me to write about that, were you?
Well, arrays have constant-time access, are space-efficient and occur contiguous blocks of memory. On the flip side, they do not grow dynamically. Other interesting structures are binary trees, binary search trees, red-black trees, b-trees, treaps, tries etc
Lastly, please don’t use an array in situation where a hash table, graph or heap would be better; it doesn’t just fit.
Still interested? Search for the following: string data structures, set data structures and geometric data structures.
Got some comments? Share below..
Beautiful Code 1 : 5 Symptoms of Software Rot
I used to wonder why people would refer to software development as an art; to me there was absolutely no correlation between programming and art. However, after hacking at software for years and writing all sorts of software: crappy ( I bet I’ll would hide my face in shame if I see some of my old code), good and ugly, I believe it is an art.
Well-written software is actually beautiful and infact some complex interwoven software systems are classics and should be up there with monuments like the tower of Pisa. The beautiful code series will be mostly focused on software architecture, design practices and common pitfalls; I hope it helps to reduce some of the hassles of development as well as improving the quality of produced code.
Can software rot? Well, not physically but its quality can degrade over time so severely that a complete redesign is the only way out.
1. Rigidity
Rigid software is difficult to change. If your codebase is so rigidly coupled that changing a single class requires you to make changes to other classes in your software; then something is WRONG and this is a classic ingredient of software rot.
Redesign your architecture; separate components cleanly and think deeply about your design before you jump into code. Use loosely-coupled modules that can be modified without breaking some parts in the system. This will make development easier for you and make your code easily extensible.
For example, you can move configuration options out of code and into a plain text, XML or JSON file.
2. Monolithic Multi-Function Classes
A car class does not have to drive; wash, repair, refuel and get its tyres all by itself.
Please stop writing humongous monolithic classes that have 170 variables, 400 constants, 978 methods and do 7000 things at once! (Well, this is an exaggeration; but I hope you get what I mean
).
An essential part of good design is breaking down software into discrete components that can be integrated to form a working system. You decide how best to create your classes; there are already enough wars in the software architecture over whether a class should have only one responsibility or not and I am not ready to start another here.
I have to repeat this; a humongous mess is just not worth it! Refactor your code, pull it apart into separate classes, redesign, do anything but please don’t bundle it into one single bunch.
3. Needless Complexity
Don’t write software that does more than it is expected to do and obviously not less too. Keep it simple; there is no need to make software more complex than it should be; for example, there is no need to use a relational database system if simple plain text files will do.
4. Repeated Code
Copy pasting code is a TOTAL no no! It is too easy to copy working code into another function; Never do it! It might be a great idea for text processing but doesn’t work it in software development. It just leads to hidden bugs, spaghetti code and unmodifiable code.
Imagine you copy-paste some code block five times and now need to modify that code; you need to make the same modification five times. While this might be easy during production, it is a source of pain during maintenance or extension. A new developer might not know about the four other copies and will wonder why the software breaks after changing one. You bet they won’t have kind thoughts about you.
Simple solution: whenever you feel like copy-pasting; improve your design and refactor your code; somewhere there lies some abstraction you are missing out.
5. Difficult to Understand
Recently while working on an open-source application I had to extend; I came across some terribly-written code. It had all sorts of cryptic single-letter variables and calculations and I could not figure out what the class was supposed to do or how it worked. #painfulCode
It takes extra effort and time to use meaningful names but it is worth the time and effort and you’ll be glad you did it. Else, I bet you’ll swear you never wrote your code when you come back to it in after few days.
; and if you can’t figure out your own code; then who will?
Use meaningful variable names, method names and forget about all those hacks that enable you to do a gazillion things in two lines; no one would appreciate it in production code (yes, they are great for bragging; please use them in YOUR personal projects ONLY) and it leads to software rot.
Well that is it; I hope to continue the Beautiful code series with more examples; while this is more talk and less code; future editions should insha Allah contain more code and show you how some of these things work.
If you disagree with me; please let me know your views in the comments.
The Java Virtual Machine
The Java vision was to empower developers to “write once, run everywhere”. One way of achieving platform independence is to use middleware to mask differences. Enter the JVM; a stack-based virtual machine that uses 32-bit words, performs arithmetic using 2-complement and can execute compiled Java bytecode, typically .class or .jar files.
The beauty of middleware ensures that just about any bytecode can run on the platform; it doesn’t necessarily has to be only Java bytecode. Some popular languages that run on the JVM include Scala, Groovy, Clojure, PHP, Python, JavaScript, Erlang, Ruby and lots more. Security is facilitated by using a sandbox model which specifies what actions running code can execute on a machine. The components of the JVM include the stack, registers and method area.
Registers
Registers are simulated by the java program; some of them are:
Program Counter: Points to the location of the next instruction to be executed.
Optop: Points to the top of the operand stack; identifying the currently executing method.
Frame: Points to the stack frame of the currently executing method.
Vars: Points to the local variables of the currently executing method.
The Stack
This part is responsible for storing method arguments as well as local variables in any method. New method calls result in new stack frames being pushed on this stack. Once a method returns, its stack frames are popped off the stack. Stack frames have three sections: local variables, operand stack and frame data.
The Method Area
This is where the classes used by the currently executing method are stored. It tracks the current instruction and the program counter is set to the next instruction here too. Method area is also shared by all the threads of a process; for multi-threaded processes, monitors are used to ensure synchronization.
The Heap
This is where objects are stored; (if you have ever had to increase heap size for programs with large memory requirements; then this is the place). Whenever you want to create a new object; memory is allocated from this storage.
Given that the JVM only translates bytecode into actions and operating system calls; this is insufficient to run java programs. Typically, the JVM is distributed with standard class libraries that allow you to make calls to the underlying system, do I/O and related stuff. The Java Runtime Environment (JRE) consists of the JVM and Java API classes; the Java API classes are needed to make system calls and also provide the necessary API interfaces to the JVM.
SoYouKnow: The original Sun JVM was written in C++
; moreover you can write your own JVM provided you follow the standards specified by Oracle.
So thats it! Hope you liked it; if you do please share with your friends.
How Skype works
A lot of us use Skype daily but have no idea about how it works. Here is a brief description of the Skype framework.
Skype employs a partially decentralized architecture – a mix of the peer-to-peer and client-server architectures. The client-server system is used for authentication while the peer-to-peer system is used for IP telephony, relaying, indexing peers and file transfers. On top of this is the Skype overlay network which users interact with directly, an overlay network is a virtual network that is formed above and independent of the underlying Internet protocol network.
Skype uses overlay networks to achieve the following:
1. It enables them to design and use their own proprietary protocols and application over the Internet. This enables Skype to encrypt all packet transmissions.
2. Redundant links between Skype peers guarantees robustness. This assures of end-to-end connection even when network resources are severely limited.
3. The peer-to-peer overlay model provides an elegant solution to the scalability challenge, given the huge number of interconnected nodes.
4. Skype uses its overlay network to circumvent problems arising from Network Address Translation traversal and bypass firewalls.
5. Routing and relaying are very flexible; protocols allow peers to autonomously set up links with the best nodes available.
6. The overlay allows Skype to store a lot of offline data in its peers leading to reduced storage infrastructure expenses.
7. Employing peer relays improves the quality of VoIP calls.
However the use of overlay networks can lead to serious security and privacy issues as users can use Skype to breach policy. Some possible reasons why organizations block Skype include the following:
1. There is no way to check user activity on the Skype; thus confidential information can be transferred through Skype. User communication is also encrypted, making it impossible to detect what information exchanges are going on.
2. Also, there is some speculation that Skype PCs can be used to launch denial-of-service attacks because viruses, spyware and malicious code can be transferred across its network. Skype can also become resource-intensive as any Skype user can unknowingly become a super-peer if he has access to enough computing resources.
Hopefully, you know more now about Skype.
Ten Useful Terminal Commands for Developers
The terminal is a very powerful tool and once you grasp its basics; you’ll love it and use it in ways never imagined by the original developers. Here are some really useful commands.
1. Ls
I use this tool everyday and it is a workhorse. It is great for listing directory contents and checking file ownership and permissions. I normally use something like ls -alth, this means:
a – show all file contents (including hidden files)
l – display a long list
h – Human-readable format (for file sizes)
t – Sort the output by the most recently updated file.
Check the man files on ls for more information.
2. Grep
Grep is absolutely one of the most useful commands I have found; it find lines that match a pattern in some input.
For example, if I want to search for the file “username.txt” in a directory containing over 100 files; I don’t have to check each file; ls -a | grep “username.txt” is all I need!
To check if some file has some text:
grep text filename
Grep can also interpret regular expressions. For more information on grep; type info grep.
3. Less
“Less is more”; actually less is similar to the more command – both are text viewers; however less can do ‘more’ than more
. Its improvements including allowing bidirectional movement and faster load times.
I find the key bindings similar to vim’s; you can move through the lines using k & j and search for patterns by typing /pattern. To quit, press q.
4. Cat
Do you want to dump the contents of some file? Just cat filename; cat prints out file contents to standard output and can work on multiple input files too. Some examples:
Concatenate filename1 and filename2, write output into newFile
cat filename1 filename2 > newFile
There is also a tac command too, it reads files in reverse order. If you cat an object file and your terminal starts showing gibberish; type reset.
5. Man
man gives you access to the reference manuals. Use it to look up the syntax and options of command. It is great for finding out new combinations of tricks to use in the terminal too.
6. Curl
Use curl to download files, execute POST requests, upload files to a server and view HTML pages; in addition it can also do proxying and user authentication. Curl supports a lot of protocols (including but not limited to FTP, SCP and telnet). Curl can be very useful when working with Couchdb which actually listens to HTTP methods.
Better still, curl bindings are available for a lot of programming languages such as PHP, Java, Python etc…
Simple GET request to the localhost (if you have a web server running);
curl localhost
7. Tar
Most Linux downloads come in the form .tar.gz; which are zipped archives. To extract, you have to unzip the file to get the .gz off and then untar the tar archive to get the package contents. Tar does this for you; I normally use this command: tar -xzvf package.tar.gz.
x – Indicates extraction
v – Verbose mode
z – Notifies tar that the archive is zipped so it can unzip it
f – Informs it that the filename follows.
To create a tar archive, use the following command tar -czvf java_archive.tar.gz *.java
c – indicates create an archive.
8. Chmod
Chmod is used to change the read, write and/or execute permissions on files. Files are associated with their owners (u), other users in the file’s group (g) and other users not in the file group (o); the (a) tag refers to all users. For example, to make a file executable to all users; I do chmod a+x bash_script.sh
a indicates all users
x indicates executable (others include r and w)
+ indicates permissions are to be added (the – sign signifies the opposite)
9. Sudo
Whenever you get the “Permission denied” message, just prefix that command with sudo.
Sudo allows a user to execute a command as root and requires the user to provide a password. On standalone boxes; it will work just fine; however, it will not work on remote hosts if the user is not listed in the sudoers file.
10. History, ! and !!
The history command prints out a list of commands that have been used; this makes it easier to look for some command that you executed in the past but don’t know how to write anymore. I usually combine it with grep. For example, history | grep ssh.
To repeat one of the old commands, I take the number before that line; lets assume it is 892 and execute !892 and viola! Done!!
!! works in a similar way; it executes the last command. Whenever I get the “Permission denied” message, I usually just sudo !!. To access the latest argument, use !$.
There are more advanced commands like sed and awk for text processing, find and locate for finding files but I don’t use these commands often. I hope you liked the post and found it interesting and useful.
Kindly tell me about your favourite commands.
Related articles
- Introduction To Linux Commands (coding.smashingmagazine.com)
Sorting Algorithms
Sorting involves ordering elements of a collection. For example, dictionaries are sorted alphabetically, numbers lists could be in increasing or decreasing order. Sorting is important and can be applied in various context, especially those that have to do with massive data.
The efficiency of a sorting algorithm is related to the number of elements it works on. A complex sorting algorithm may not be worth the trouble when applied to small collections; however its performance on huge datasets might exceed that of simpler algorithms.
Here is a brief explanation of the bubble, selection, insertion and merge sort algorithms.
- Bubble Sort
This algorithm makes multiple passes through a list, comparing adjacent items and swapping those which are out of order. The very first pass through the list places the largest element in its right place and subsequent passes put the next largest values in their places too. That is, elements ‘bubble’ up to their appropriate locations.
This is a very inefficient algorithm due to the large number of exchanges and comparisons it carries out. Consider sorting a list of n elements; on the very first pass, bubble sort make n-1 comparisons because there are n-1 possible pairs of elements; there are n-2 comparisons in the second and a complete sort requires (n-1) + (n-2) + … + (1) comparisons; the sum of this series is (n*(n+1)/2). This makes the bubble sort of order O(n-squared) .
- Selection Sort
The selection sort is also O(n-squared) but it performs better than the Bubble sort. On every pass, it selects the smallest (could be largest too) element in the list and then swaps it with the element at the smallest (or greatest) index in the list. Next it finds the next smallest number and swaps it with the element at the next smallest index. It continues this process until it walks through the entire list.
A flaw of selection sort is that if can not detect already sorted collections.
- Insertion Sort
The insertion sort works in different way and outperforms the selection sort. In each pass; it maintains a sorted sub-list in the lower positions of the list. Consequently, new items are inserted into their proper place in the sorted sub-list.
n-1 passes through the whole list are needed to sort the entire list; on each pass, a linear search is made to decide where to insert the new element. This leads to the O(n-squared) running time.
Interestingly, the insertion sort can be made to run in O(nlgn) time. Since the sublist in lower positions is always sorted; if a binary search is used to find the insertion point in this sub-list instead of linear search., the number of comparisons reduces to lg n.
- Merge Sort
This uses a divide-and-conquer strategy to improve performance. The merge sort is a recursive algorithm that continuously splits a list of items in two. The base case is reached when the list has only one item – a list with one item is sorted anyway – and once this happens, it recursively merges the splits using the merge operation.
The merge sort runs in O(nlgn) time and outperforms the insertion sort when n is sufficiently large, when n > 40.
Can you suggest other sorting algorithms?
And you thought you knew about Open Source
Popular opinion about open-source software – having access to the source code – does not adequately express the meaning of free software; it is even weaker than official definition of open-source as it includes lots of propriety and/or commercial programs.
Free software does not imply that software is available at zero cost; it is possible to pay money to buy open-source software; however you are FREE to modify and change it and even possibly sell copies! A free program must be available for commercial use, development and distribution. Strange? Think Red Hat Linux, Red Hat Linux is open-source software but not freeware.
Freeware expresses the concept of not charging for use; this is the meaning of “free as in free beer”; freeware however does not guarantee access to the source code
. By not using open standards, it is easy to lock in users (who might be attracted by the zero cost) and prevent them from moving on to other software later.
On the other hand, “free as in free speech” means you get access to the source code and can change it as you will. According to Richard Stallman of FSF and GNU fame, free software can be used, studied, distributed, changed, copied and improved by users. Fortunately, most open-source software are freeware. Users of open-source software don’t have to worry about what happens if their software gets bought out; it also assures of some form of updates as anyone can improve the software.
Here are some of the most widely-used licenses
- General Public License (GPL) – Anyone who obtains software licensed with the GPL has the right to get the source code along with the software, create anything they like and redistribute it under the same GPL.
- Mozilla Public License – This gives users the right to modify your software. However ,they have to release all your files with their software but do not have to release the files they created from the scratch.
- Lesser General Public License (LGPL) – Similar to the GPL, but gives people the flexibility to use open-source software in their own projects without releasing all the source code to the world.
- Apache 2.0 License – Users can use the source code as they will provided they include the copy of the license in their distributions and use the proper attributions.
- BSD License – A very open license, allowing users to do practically anything with the software. All you have to do is include the copyright, conditions and disclaimer; also you can’t use the name of the originating organization to promote your edit without written consent.
- MIT License – This is similar to the BSD License, but even more permissive.
- Public Domain – This is completely permissive; anyone can do anything they like with these software as there is no copyright. SQLite is a popular example of public domain software.
Next time you get free software, find out what kind of ‘free’ it is!
Related articles
- Dear Open Source Free Loaders (jeffhoogland.blogspot.com)
- Principle of copyleft (janaimke11.wordpress.com)