Thursday, September 8, 2016

More polish…

As the Circus is getting ready to open its gates, the clowns are still polishing their shoes.



The past few days brought some improvements:
  • Clearer vocabulary
  • Improved Debian packaging
  • Bug fixes, mainly memory corruption

I found an important bug: nginx + fcgiwrap + armhf: that mixture provides the Circus CGI client a stdout that is not pollable. The way Circus uses libuv needs to be fixed.

It is interesting to note that my tests on amd64 did not raise that concern… Not sure why though. Any explanation is welcome…

I am currently fixing that. Stay tuned!

Thursday, August 18, 2016

Hoist the flag!

Circus is getting ready to ship.

Although I was on holiday and busy with plenty of real-life work, I found a bit of time for Circus:
  • Continuous integration on Travis CI
  • Debian packaging
  • A logo! (I designed it, please be indulgent…)
Upcoming work:
  • Try Circus for real
  • Package the web fonts instead of relying on Google's font site
  • Sweep the ring, polish the ringmaster's shoes, and drive bellowing through town: "Tonight! The GREAT circus!! Places for everyone!!"

Sunday, July 24, 2016

A brand new marquee

This week-end was a coding marathon; it was quite worth it.

Circus changed a lot these last few days. Rather than the rough draft of a few days ago, it is now an actual web application. Needless to say, I feel proud.

The latest big changes include:

  • The Post-Redirect-Get pattern is now implemented. This pattern is the current standard web pattern used when posting new data: use a POST to send the data, then redirect to another page fetched by the browser using GET.
    This single change was quite intrusive: it involved deep changes in the handling of HTTP requests, as well as some re-architecting of the page navigation.
  • The anti-CSRF synchronizer token is now checked against a history of tokens. By default, the 5 latest tokens are kept. This helps with navigation: the browser "back" button is not the greatest enemy anymore. One can also reload a page without being kicked out. Even double-clicks are no big deal anymore.
    The idea comes from Tomcat's CsrfPreventionFilter.
  • The pages underwent a big lifting. They are now prettier. More red noses all around!
  • The database layer is no longer mixed up with the vault layer. This should help with testing; it also helps coding by circumscribing the use of the database API and hiding the small inconsistencies of the sqlite API (viz. the fact that data binding is 1-based while data fetching is 0-based).
  • A major security issue was fixed: now the encryption key is really secure. It cannot be decrypted without the user providing their password.
    Previously the data in the database was enough to decrypt the user passwords, defeating Circus' goal.
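The token history described above can be pictured as a small ring buffer. What follows is only a minimal sketch under stated assumptions, not the actual Circus code: the names token_history_t, token_history_push and token_history_accepts, and the 64-byte token size, are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOKEN_HISTORY 5   /* the default number of tokens kept */
#define TOKEN_MAX     64  /* hypothetical maximum token size, including the NUL */

/* Ring buffer of the most recently issued tokens (hypothetical layout). */
typedef struct {
    char   tokens[TOKEN_HISTORY][TOKEN_MAX];
    size_t next; /* slot that the next issued token overwrites */
} token_history_t;

/* Start with an empty history; empty slots never match a request. */
void token_history_init(token_history_t *h) {
    memset(h, 0, sizeof *h);
}

/* Remember a freshly issued token, evicting the oldest one. */
void token_history_push(token_history_t *h, const char *token) {
    assert(strlen(token) < TOKEN_MAX);
    strcpy(h->tokens[h->next], token);
    h->next = (h->next + 1) % TOKEN_HISTORY;
}

/* Return 1 if the submitted token is one of the remembered ones, else 0. */
int token_history_accepts(const token_history_t *h, const char *token) {
    if (token == NULL || token[0] == '\0') {
        return 0;
    }
    for (size_t i = 0; i < TOKEN_HISTORY; i++) {
        if (strcmp(h->tokens[i], token) == 0) {
            return 1;
        }
    }
    return 0;
}
```

A request carrying any of the five latest tokens is accepted, which is why "back", reload and double-clicks keep working. Production code would probably compare tokens in constant time rather than with strcmp.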
Note: I may not be able to hack a lot in the upcoming weeks. But I shall be back.

Stay tuned and… Merry hacking!

Thursday, July 14, 2016

Cleaning the cages

The last few weeks were centered on cleaning up the memory leaks.

Valgrind is a great tool; unfortunately it does not work with libgcrypt. Once the libgcrypt code was mocked out, though, I was able to find and fix a lot of memory leaks during tests.

On the features front, circus gained a single new feature: the recipes now support size ranges (instead of just fixed sizes). When a range is defined in a recipe, the generated password size will be randomly chosen within that range.

Now that the cages are clean, we can go on adding new features…

Thursday, June 23, 2016

Let the lions out.

Finally! Circus has its core feature implemented: password generation.

The password generator is almost identical to pwd's: a number of characters, and four classes of characters: letters, digits, symbols, and free-form (pwd only had the first three). The free-form class lets you specify the exact set of characters to choose from.
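To make the idea of character classes concrete, here is a minimal sketch. It is not the Circus generator: the class contents, generate_password, random_fn and demo_random are all assumptions made for the example, and demo_random is a toy generator that must be replaced by a cryptographic source in real use. A free-form class is simply any other string passed as the charset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative character classes; the sets Circus really uses may differ. */
#define CLASS_LETTERS "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
#define CLASS_DIGITS  "0123456789"
#define CLASS_SYMBOLS "!#$%&*+-/=?@_"

/* Hypothetical random source: returns 32 uniformly distributed bits. */
typedef uint32_t (*random_fn)(void);

/* Demo-only generator (xorshift32): deterministic and NOT secure. */
uint32_t demo_random(void) {
    static uint32_t x = 2463534242u;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return x;
}

/* Fill out[0..len-1] with characters drawn uniformly from charset and
 * NUL-terminate the result. out must have room for len + 1 bytes. */
void generate_password(char *out, size_t len, const char *charset, random_fn rnd) {
    size_t n = strlen(charset);
    assert(n > 0 && n <= 255);
    /* Reject draws that would make some characters slightly more likely. */
    uint32_t limit = UINT32_MAX - (UINT32_MAX % (uint32_t)n);
    for (size_t i = 0; i < len; i++) {
        uint32_t r;
        do {
            r = rnd();
        } while (r >= limit);
        out[i] = charset[r % n];
    }
    out[len] = '\0';
}
```

Calling generate_password(buffer, 16, CLASS_LETTERS CLASS_DIGITS, demo_random) yields 16 characters taken from letters and digits, since adjacent string literals concatenate in C.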

Also in the news:

  • Templates now have "operators". Only one operator is implemented for the moment: count, which gives either the length of a string or the number of elements in an array.
  • An important update in libcad's packaging fixes a long-standing error: the -dbg package did not correctly provide the library's symbols. This fix will make it possible to accurately track down the memory leaks in Circus, and to fix them.
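The count operator mentioned above can be illustrated with a tagged union. This is a hypothetical sketch, not cad_stache's actual data model: value, value_kind and op_count are invented names, and the string length is counted in bytes here, which may not be what the real operator does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical template value: either a string or an array of values. */
typedef enum { VALUE_STRING, VALUE_ARRAY } value_kind;

typedef struct value {
    value_kind kind;
    const char *string;        /* used when kind == VALUE_STRING */
    const struct value *items; /* used when kind == VALUE_ARRAY */
    size_t n_items;            /* number of elements in items */
} value;

/* "count": length of a string (in bytes) or number of elements of an array. */
size_t op_count(const value *v) {
    assert(v != NULL);
    switch (v->kind) {
    case VALUE_STRING:
        return strlen(v->string);
    case VALUE_ARRAY:
        return v->n_items;
    }
    return 0; /* unreachable for valid kinds */
}
```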

Hold your red nose, and make the whip crack: the lions roar!

Thursday, June 16, 2016

The circus is back to town.

That's right: the last circus article was posted almost four months ago. Real life took its toll.

So what happened in the scant time I could spare?

The CGI client is now awake and kicking.
It is especially security-conscious: secure cookies, nonce tokens, cache control… The only missing technology is Post/Redirect/Get; I keep that for later as it needs important changes (esp. more roundtrips with the server).
The CGI client uses a code generator that translates a JSON file to web actions. It greatly simplifies adding new pages!

A new manual test script was added. It starts the server and a local web server (lighttpd).

The administration pages are complete (or complete enough for an alpha release).

The user pages are work in progress. They are the core of the system, and I want that part done right! Especially the password generator. It will use a similar algorithm to pwd's, but maybe mix ideas from e.g. pwgen.

There is still a lot of work to do; the clowns still need to apply makeup instead of fooling around. But hearts stay light under the marquee!

Thursday, March 17, 2016

aojls sucks

No circus today.

A few days ago I came across this project: aojls.
The name is catchy: "all other json libraries suck".

Indeed?!

I also happen to maintain a JSON library: yacjp.
The name is maybe dumber: "Yet Another C Json Library".

Needless to say, aojls throws down quite a challenge. I had to see if it actually "sucked" less than other libraries, and in particular, less than mine.
So I performed a code review, simply perusing the code on GitHub. No, I did not clone it, so maybe I missed some negative points.
As you will see, my answer is quite definite. That... library... sucks at least as much as the alternatives, if not more.


Let's start with the few positive points:

  • Separate types for each kind of JSON elements (objects, arrays, strings, numbers, booleans, and nulls). It is also a strong point of yacjp.
  • Hem... No, that's all in fact. The more I look at the code, the fewer good points I find.

Now, the negative points:
  • Feature creep.
    A lot of functions are defined twice, in a normal and a "default" variant. Moreover, all the function names are defined in the global C namespace.
    Yacjp uses Object-Oriented modularity.
  • Memory consumption.
    Ever tried reading a huge JSON object? Aojls, like every other JSON library I know of, requires the whole data to be read into a char* before trying to parse it; hence, the program potentially needs more than twice the memory of the final parsed JSON syntax tree. Oh yes, there is a de/serialize API, but who knows how it works? The API needs the string anyway.
    Yacjp uses a "stream" notion. Data can be read from a string, a file, a file descriptor (i.e. may come directly from a socket or a pipe), etc.
  • O(n).
    Yes, key access to JSON "objects" (i.e. associative arrays) is O(n).
    Yacjp uses hash tables; key access is O(1) on average.
  • Number precision.
    Numbers are represented as double items. That's—in my opinion—the worst idea. Lost precision.
    Yacjp keeps integral values. The numbers are converted to integer or floating-point values only when the actual numeric value is asked for; and the user chooses the type he or she wants.
  • No UTF-8.
    Not standard.
    Yacjp implements a UTF-8 parser.
  • Tokenizer.
    Using a "tokenizer" for such a simple grammar is overkill. Lots of memory, again. And it is implemented as a mess of flags. Gimme a break, and take a compilation course!
    Yacjp implements a simple top-down parser, with no separate tokenization stage. The grammar drives the tokenizing on demand.
  • Memory handling (I).
    One more memory-oriented gripe; malloc() and free() are not customizable.
    Yacjp allows customizing those functions. I use that technique quite liberally in circus; it lets me reuse my components with specialized memory handlers (that lock memory for security).
  • Memory handling (II).
    There is a "context" object that tracks all the JSON data. Freeing that object frees everything.
    Yacjp provides means to clean up and free data in good order, using iterators. The user chooses what he or she wants to free. A default kill function is provided to delete a complete tree.
  • Error handling.
    An "error happened" function that you may call after parsing to know if an error occurred. If you don't call that function, you don't know that there was an error and the program blissfully continues with invalid data (at best).
    Yacjp uses another approach: an error callback. You must provide it, so you cannot miss that an error happened. Not optional.
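The number precision gripe in the list above is easy to reproduce. The sketch below is not yacjp's API: json_number and its two accessors are hypothetical names that only illustrate the idea of keeping the literal and converting when the caller asks for a given type.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical number node: keeps the literal exactly as written in the JSON text. */
typedef struct {
    const char *literal;
} json_number;

/* Convert on demand to an integer: exact as long as the value fits. */
long long json_number_as_integer(const json_number *n) {
    assert(n != NULL && n->literal != NULL);
    return strtoll(n->literal, NULL, 10);
}

/* Convert on demand to a double: what a double-only library always does. */
double json_number_as_double(const json_number *n) {
    assert(n != NULL && n->literal != NULL);
    return strtod(n->literal, NULL);
}
```

With IEEE 754 doubles, the literal 9007199254740993 (2^53 + 1) reads back as 9007199254740992 through the double accessor, while the integer accessor returns it unchanged. A library that stores every number as a double has already lost that last digit by the time the caller sees the value.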

On the other hand, yacjp provides solutions to help the user dive into a JSON syntax tree and find the information they want.
The basic structure is the "visitor" (yes, the OO design pattern); above it are provided tools that help look up data; and tools to free the tree without resorting to that ugly context trick.
And the user can hack more "visitors" if needed, to do whatever they want.
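For readers unfamiliar with the visitor pattern in C, here is a minimal sketch of the idea. It does not reproduce yacjp's real types or function names: node, visitor, node_accept and the string-counting visitor are all hypothetical, and only three kinds of JSON values are modelled to keep it short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified JSON tree. */
typedef enum { NODE_STRING, NODE_NUMBER, NODE_ARRAY } node_kind;

typedef struct node {
    node_kind kind;
    const char *text;            /* string value or number literal */
    const struct node *children; /* array elements, if any */
    size_t n_children;
} node;

/* A visitor is a set of callbacks, one per kind of node. */
typedef struct visitor {
    void (*visit_string)(struct visitor *self, const node *n);
    void (*visit_number)(struct visitor *self, const node *n);
    void (*visit_array)(struct visitor *self, const node *n);
} visitor;

/* Dispatch: call the callback that matches the node's kind. */
void node_accept(const node *n, visitor *v) {
    assert(n != NULL && v != NULL);
    switch (n->kind) {
    case NODE_STRING: v->visit_string(v, n); break;
    case NODE_NUMBER: v->visit_number(v, n); break;
    case NODE_ARRAY:  v->visit_array(v, n);  break;
    }
}

/* A user-written visitor: counts the strings found anywhere in the tree. */
typedef struct {
    visitor base; /* must stay first so that the casts below are valid */
    size_t strings;
} string_counter;

void count_string(visitor *self, const node *n) {
    (void)n;
    ((string_counter *)self)->strings++;
}

void skip_number(visitor *self, const node *n) {
    (void)self;
    (void)n;
}

void descend_array(visitor *self, const node *n) {
    for (size_t i = 0; i < n->n_children; i++) {
        node_accept(&n->children[i], self);
    }
}

size_t count_strings(const node *root) {
    string_counter c = { { count_string, skip_number, descend_array }, 0 };
    node_accept(root, &c.base);
    return c.strings;
}
```

Lookup helpers and tree-freeing helpers can be built the same way: each is just another set of callbacks handed to node_accept.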

That is user-friendly: make usual operations simple, make non-usual operations possible.

Of course all that is only my opinion; but I guess the lesson is: if you make bold statements, try to live up to them.

Friday, February 26, 2016

makeup and decorations

This week was focused on basic work on libcad, the foundations library I use for circus and other projects.

My objective, for the upcoming weeks, is to code the framework and the first message of circus's client. For that, I need two components. Since those are likely re-usable, I coded those two components into libcad.

Templating engine

Yes, there are already templating engines. But almost none target pure C, and those that do have a few shortcomings, such as the inability to define a specific memory handler. I need that for security.
So I started with a well-known spec: Mustache, and wrote cad_stache, the templating engine for Mustache in C.
Soon the ringmaster will proudly show off his mustachio.

CGI framework

In the same way, I found no convincing C library for CGI handling. Most target C++, and none allow for the customization of the memory handling.
So I started from my own CGI implementation in Eiffel, and translated it into libcad.
I will also implement fastCGI, later, just to be complete. The library they provide is just awful: macro-ridden, with even stdin/stdout replaced by macros, meaning that all the components must be aware of (i.e. depend on!!) the fastcgi thingy.
Circus will not use fastCGI, because the client is actually meant to die. It's a feature, not a bug.

To be continued…

Sunday, February 21, 2016

And now come the clowns.

After some time spent in real life, I could at last implement the next admin feature for circus: creating a new user.

When a user is created, he/she is allocated a temporary password (by default valid 15 minutes). The aim is to send an email with that temporary password, that the user must change as soon as possible. (The mail sending is not coded yet.)

This simple spec brought quite a few changes:

  • The users now have an associated email
  • The password validity can now be limited
  • For tests, the "current time" is mocked to make the tests reproducible
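The limited validity and the mocked "current time" from the list above fit together naturally. The following is a minimal sketch and not the Circus code: clock_fn, password_stamp, password_is_valid and fake_clock are hypothetical names, and only the 15-minute default comes from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* The code never calls time() directly; it goes through this hook instead. */
typedef time_t (*clock_fn)(void);

/* Production clock. */
time_t system_clock(void) {
    return time(NULL);
}

/* Test clock: returns whatever the test put in fake_now. */
time_t fake_now = 0;

time_t fake_clock(void) {
    return fake_now;
}

#define DEFAULT_VALIDITY (15 * 60) /* 15 minutes, in seconds */

/* Hypothetical bookkeeping attached to a stored password. */
typedef struct {
    time_t created;  /* when the password was set */
    time_t validity; /* lifetime in seconds; 0 means "no limit" */
} password_stamp;

/* Return 1 while the password may still be used, 0 once it has expired. */
int password_is_valid(const password_stamp *p, clock_fn now) {
    assert(p != NULL && now != NULL);
    if (p->validity == 0) {
        return 1;
    }
    return now() < p->created + p->validity;
}
```

A test sets fake_now to any instant it likes and passes fake_clock, so an expiry 15 minutes away can be checked without waiting; the server would pass system_clock instead.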

Now that the very basic infrastructure is in place server-side, I need to start implementing the client, so that the next features can be implemented "vertically", i.e. with actual visibility.

Happy coding!

Thursday, February 4, 2016

Jumping through hoops

The Circus server is now alive. The best proof is that it can be stopped! Ain't that magic?

So what happened since the last post?

The server learned to stop. This very first message is important: it means that the server actually listens and is able to understand queries.

Stopping the server is not a light operation, most of all because the server is meant to be controlled by the OS (SysV, whatever). Eventually that operation will become privileged. But for now it fills its role.

The server also learned to answer. The second message, ping, sends a phrase that must be sent back.

While implementing that message, a lot of bugs were shaken out; now the ZMQ layer (channel) looks quite solid and works very well with libuv. I am quite sure of that, because the logger also uses libuv, and log messages are correctly emitted.

The server tests also got better. A small test client has already been factored out and will be the basis for the following message tests.

Program for next week

More acrobatics.

I want to look at that multi-user stuff.

The basic thinking is: I would not want anybody just connecting and creating a user, now would I? So creating a user must be a privileged operation, performed by an authenticated and authorized… user…

There are a few possibilities: using PAM, or having an administrator as seed. Maybe pre-creating and filling the database.

Some thinking must happen here.

Merry hacking!

Monday, January 25, 2016

The marquee is up.

Ladies and gentlemen, welcome and take a seat. Free red noses for everybody!

Great news: the circus server process runs!

It does nothing yet except listening on a network socket; but it means that the following pieces are in place:

  • configuration (although the server is designed to work with zero conf)
  • libuv and zmq integration (I am quite happy with that one)
  • logging

The vault has already been designed too. It will follow the same concept that drove pwd's: one file. But its structure is vastly different; pwd switched from a home-grown format to JSON. Circus will use sqlite instead, which is the best tool for the job.

I also drew one lesson from the past days: one cannot code without a vision. I had to stop and think for a while. Now I know where I am going!

Here is what I want:
  • Contrary to pwd, the client expects the server to be running. It is not responsible for starting the server. All operating systems offer tools that are suited to process supervision. Those tools must be used.
  • That has one important impact: pwd made the assumption that each user would start their own server process at will. Since the server is now started by the system, it must be multi-user.
  • Minimal time/space in which the data transits in clear. That means zeroing data and mlock()ing it. Thanks to libcad the memory handling functions may be simply redefined.
  • Zero-conf by default. Things must work out of the box.
  • OSI-like network layering. Zmq as low-level, and above that message handlers that virtually talk to each other.
  • Keep the red nose. I want a light tone, some humor does no harm.
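The "zeroing data and mlock()ing it" item above could translate into a pair of allocation functions along these lines. This is a sketch under assumptions, not libcad's actual memory handler: secure_malloc, secure_free, secure_zero and the size header are hypothetical, and it relies on the POSIX mlock() and munlock() calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>

/* Overwrite memory through a volatile pointer so the compiler keeps the writes. */
void secure_zero(void *p, size_t n) {
    volatile unsigned char *b = p;
    while (n--) {
        *b++ = 0;
    }
}

/* Hypothetical header stored in front of each block to remember its size.
 * The union keeps the user data suitably aligned. */
typedef union {
    size_t size;
    max_align_t align;
} secure_header;

/* Allocate a block and ask the OS not to swap it out. */
void *secure_malloc(size_t size) {
    if (size > SIZE_MAX - sizeof(secure_header)) {
        return NULL;
    }
    secure_header *h = malloc(sizeof *h + size);
    if (h == NULL) {
        return NULL;
    }
    h->size = size;
    /* Best effort: mlock() can fail, e.g. when RLIMIT_MEMLOCK is exceeded. */
    (void)mlock(h, sizeof *h + size);
    return h + 1;
}

/* Wipe the block before handing it back to the allocator. */
void secure_free(void *p) {
    if (p == NULL) {
        return;
    }
    secure_header *h = (secure_header *)p - 1;
    size_t total = sizeof *h + h->size;
    secure_zero(p, h->size);
    (void)munlock(h, total);
    free(h);
}
```

Locking works on whole pages, so unlocking one block may also unlock a neighbour that shares its page; a real handler would more likely lock a dedicated page-aligned pool. The point of the sketch is only that, once malloc() and free() can be swapped, every component gets zeroing and locking without knowing about it.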

Now that the server starts, the next step is to make it do something. Dance, play the trumpet, maybe even make the lion jump through hoops?

Happy hacking!

Thursday, January 7, 2016

Fanfare!

Ladies and gentlemen,

Let the Ringmaster introduce his new soon-to-be famous Circus!

Circus is the successor of pwd; but it is written in C. I want to use standard and proven technologies!

It is very new for the moment. The code is being built, don't hesitate to come and participate. Red nose mandatory!

Happy new year!