Thursday, March 17, 2016

aojls sucks

No circus today.

A few days ago I came across this project: aojls.
The name is catchy: "all other json libraries suck".


I also happen to maintain a JSON library: yacjp.
The name is maybe more dumb: "Yet Another C Json Library".

No need to say, aojls throws quite a challenge. I had to see if it actually "sucked" less than other libraries, and in particular, than mine.
So I performed a code review, simply perusing the code on GitHub. No, I did not clone it, so maybe I missed some negative points.
As you will see, my answer is quite definite. That... library... sucks at least as much, or more, than alternatives.

Let's start with the few positive points:

  • Separate types for each kind of JSON elements (objects, arrays, strings, numbers, booleans, and nulls). It is also a strong point of yacjp.
  • Hem... No, that's all in fact. The more I look at the code, the less good points I find.

Now, the negative points:
  • Feature creep.
    A lot of functions are defined twice, a normal and a "default" variants. Moreover, all the function names are defined in the global C namespace.
    Yacjp uses Object-Oriented modularity.
  • Memory consumption.
    Ever tried reading a huge JSON object? Aojls, as all other JSON libraries I know of, require the whole data to be read into a char* before trying to parse it; hence, the program potentially need more than twice the memory than the final parsed JSON syntax tree. Oh yes, there is a de/serialize API, but who knows how it works? The API needs the string anyway.
    Yacjp uses a "stream" notion. Data can be read from a string, a file, a file descriptor (i.e. may come directly from a socket or a pipe), etc.
  • O(n).
    Yes, key access to JSON "objects" (i.e. associative arrays) are O(n).
    Yacjp uses hash tables, access to the keys are O(1).
  • Number precision.
    Numbers are represented as double items. That's—in my opinion—the worst idea. Lost precision.
    Yacjp keeps integral values. The numbers are converted to integer or floating-point values only when the actual numeric value is asked for; and the user chooses the type he or she wants.
  • No UTF-8.
    Not standard.
    Yacjp implements a UTF-8 parser.
  • Tokenizer.
    Using a "tokenizer" for such a simple grammar is overkill. Lots of memory—again. And it is implemented as a mess of flags. Gimme a break; a take compilation courses!
    Yacjp implements a simple descending parser, with no separate tokenization stage. The grammar drives the tokenizer at leisure.
  • Memory handling (I).
    One more memory-oriented gripe; malloc() and free() are not customizable.
    Yacjp allows to customize those functions. I use that technique quite liberally in circus; this allows to reuse my components with specialized memory handlers (that lock memory for security).
  • Memory handling (II).
    There is a "context" object that tracks all the JSON data. Freeing that object frees everything.
    Yacjp provides means to cleanup and free data in good order, using iterators. The user chooses what he or she wants to free. A default kill function is provided to delete a complete tree.
  • Error handling.
    An "error happened" function that you may call after parsing to know if an error occurred. If you don't call that function, you don't know that there was an error and the program blissfully continues with invalid data (at best).
    Yacjp uses another approach: error callback; you must provide it, so that you must know that an error happened. Not optional.

On the other hand, yacjp provides solutions to help the user dive into a JSON syntax tree and find the information they want.
The basic structure is the "visitor" (yes, the OO design pattern); above it are provided tools that help lookup data; and tools to free the tree without resorting to that ugly context trick.
And the user can hack more "visitors" if needed, to do whatever they want.

That is user-friendly: make usual operations simple, make non-usual operations possible.

Of course all that is only my opinion; but I guess the lesson is: if you make bold statements, try to live up to them.