Julia is a relatively new programming language with the declared goal to become the leading language for scientific computing.
I have probably annoyed half of my colleagues by raving about how great the language is and what it is good at. Before we get to this, and in my defense, let me provide some context. I have been developing using C and C++ for 20 years now, and have been using Matlab and Python for over ten years now. These are great languages and I can be productive using each, infact I continue to use them regularly.
Also, I tend to be quite conservative in terms of adopting new languages or development tools: while learning a new language and environment is fun it also takes a lot of effort and most languages/tools/libraries tend to come and go rather quickly and every developer carries with him a graveyard of tools and languages long gone.
Because of this short-lived nature of software, when someone approaches me with a new language or tool I am skeptical by default, and my litmus test question is usually how confident they are that this tool will still be around in five years time. This is of course unfair, but I prefer to invest my time in learning things that have long term value. Which brings me to the point that I firmly believe Julia is here to stay and in fact may even become a popular language in scientific computing.
Enough rambling, let’s get to the good parts.
I have been using Julia for the last 18 month now, both for work and pleasure. Counting all code I wrote at work (just counting .jl files, no notebooks) I see that I wrote more than 15k lines of Julia code in that time, including several larger projects, ports of existing Matlab and C++ code, and interfaces to C libraries. Given my experience Julia is ready for production in internal projects (as opposed to shipping executable code to a customer) and in particular is very well suited to research-type projects.
Developing code for research projects is in many ways similar to developing other software, but the key difference for me is that I need a quick turnaround time from idea to result not just once but in multiple iterations, sometimes changing the idea and implementation drastically.
In a very real sense most research projects should fail to achieve their original goals; almost by definition research is beyond what is known to work. If you only attempt known-to-work ideas it is not research. If your project fails it is important to learn as much as possible from the failure, that is, increasing the understanding of the problem and finding suitable new research ideas, and quick iterations make this process fun. The new ideas are often variants of earlier ideas and thus can reuse code. If this code happens to be compact and flexible this translates directly into productivity.
Matlab, R, and Python achieve this tight cycle of iterations quite successfully, but in all three languages there is a price towards the later iterations in that for achieving a high performance implementation significant parts of the code needs to be rewritten in a more basic language such as C++, which then needs to be interfaced to the other code through some interface specification. For big high-value projects in industry with dedicated engineering support the additional effort required is typically not a problem, but for individual researchers it means hours and days spend writing additional code without adding functionality.
This process is cumbersome, errorprone, and creates a strong coupling, making further iterations of changing ideas and implementations slower. (As an example, in my grante library I prototyped many algorithms in Matlab, then programmed them in C++, then wrote a Matlab interface which by itself is almost 2,000 lines of C++ code.)
Julia also achieves this tight cycle, but does not require you to resort to compiled statically-typed languages such as C++ in order to achieve high performance. Using a single language maintains productivity both at the very beginning (prototyping) and towards the later iterations (productization).
Productivity in Julia (roughly “scientific results per wallclock developer time”) is achieved through a number of features:
compact syntax, for example I can declare a function using f(x) = 2x+5
.
As mentioned above, I see the advantage of a compact syntax not in the
keystrokes saved initially, but in lowering the barrier to future
understanding and modification as the code evolves.
optional type annotation, the above function will work for x
being an
integer, or a float, or anything that has a multiplication and addition with
integer arguments defined; in fact, I could write f(x::Float64) = 2x+5
to
require that x
is a float, but performance-wise they both yield the same
code. This means that I can be strict about types when I need to be, but have
the feel of a dynamic programming
language.
Jupyter notebook interface for quick
think-implement-results cycles.
excellent default choices of numerical libraries, dense linear algebra,
sparse linear algebra,
numerical optimization libraries,
arbitrary precision computation,
special functions,
FFT, etcetera, most of what you can wish for
in a technical computing environment is already there by default or in the
many numerical packages available. In terms of numerical optimization codes
Julia is probably one of the best environments
available. All these libraries are carefully chosen to be the best-in-class
for the functions that they implement.
foreign function interfaces
to a number of languages:
C and Fortran,
C++ (unfortunately planned only for Julia 0.5),
Python,
R,
Matlab. This makes it relatively
easy to use code in any of these languages and I have used several Python
libraries without issues.
high performance, I regularly find my
first-attempt Julia code for a problem to be an order of magnitude faster
than the equivalent Matlab code. Infact, I unlearned a number of bad Matlab
programming patterns such as using bsxfun
and vectorizing all code. Last
year I wrote Julia code for a R-tree data
structure to maintain a dynamic spatial
index. Doing this in Matlab/R/Python in a reasonably performant way would be
unthinkable! Instead you have to resort to wrapping native
libraries.
In Julia it was fun to write and it is fast, and I could add the required
methods I needed for my application easily, including fancy filtering
iterators.
no separation between user and developer, almost all of the base library
is implemented in Julia itself, and it is easy to find where things are. For
example, if you want to find out how two complex numbers are multiplied in
Julia’s base library? Enter methods(*)
and have a
look!
This transparency makes it easy to learn good Julian style and extends further
to how code is run:
Want to see what machine code is executed when you call the sqrt
function on
a single precision float?
Enter code_native(sqrt, (Float32,))
and see
Almost nothing is hidden from the eyes of the user and this makes it easy and fun to look into the implementation.
Julia, while ready for serious use, is not yet at version 1.0 and lacks several important features. In my work, I found the following pieces missing (as of version 0.4).
Simple single machine parallelism. In C/C++/Fortran this would be
OpenMP and in Matlab it is parfor
.
While Julia does have good support for distributed parallel
computing,
it currently does not have simple single-machine parallelism.
In my experience using the distributed computing abstractions for single
machine parallelism has severe performance overheads because all data is
serialized and remote method invocations are used to execute code.
(Also, I found the use of @everywhere
macros cumbersome.)
Apparently a simpler single machine parallelism is difficult to implement but
in the works, as shown in this recent work by
Intel presented at JuliaCon
2015.
Debugger. Quite simply, a debugger is essential for larger projects where
errors can arise that are difficult to understand and debug without being able
to interactively inspect the context in which the error appeared.
Currently Julia has Debug.jl which
provides debugging at gdb level in terms
of functionality.
But Julia lacks an interactive debugging capability on par with what is
available in Matlab or most C/C++ environments (actually, I am not sure about
Python debuggers here, is
there a single popular tool?).
As far as I understand, this is planned for the 0.5 version of Julia.
Shipping/productization/static-compilation. With this I mean the ability
to select the distribution mechanism for the software, in particular to select
whether all dependencies are included so that the software “will just run” on
the target system, and whether binaries or source code is delivered.
For most researchers and open-source programmers this is not an issue and the
Julia package system caters for all their needs, but I found it relevant in a
company environment because explaining to someone how they install Julia and a
piece of code takes a while, whereas for C++ I can typically easily send an
executable file and some library dependencies.
As far as I understand, static compilation is planned for a future version of
Julia.
If you want to give Julia a spin, here are a few links:
Packages which I use frequently and can recommend:
The Julia package ecosystem has a lot more packages, so if you are looking for a particular thing, have a look there.