Tag Archives: javascript

Node.js programming is different

So you are developing a web app using some Javascript technology on the front end, and you need to build some APIs to reference a data store.  You have no shortage of choices.

You can use PHP,  Java, Ruby, Python, Scala, Go.  With PHP you can use bare metal PHP or you can use a full service framework or a microframework.  With any of the others you have your choice of frameworks, ranging from basic (ie Servlets) to all-inclusive (J2EE, ROR, Django, etcetera)..  And so on.

Your Javascript front end will require some basic APIs to retrieve back end data … a rest API something like /mysite/api/users should retrieve all users, and /mysite/api/users/2 should retrieve user #2. A fairly standardized sort of microframework has emerged to serve up these APIs.  Ruby/Sinatra is the father of these ‘micro framework’ clones:  Python/Flask, Java/Spark, PHP/Slim,  Each is a small framework that provides a simple way to parse URLS, to trigger code in response to said URLS, and to insert ‘middleware’ filters to intercept and process requests.

Node/Express looks very similar, but after the initial familiarity, it presents some challenges that require considerably different solutions from any of the similar tools:  in particular, Node’s primary technical claim to fame is the fact that it handles I/O, including database calls, asynchronously.   This requires significant changes to how programming is done.

In this article I will demonstrate some of these changes by writing an API to back up a JQGrid data grid.

JQGrid is one of the better, if not the best, of the many browser based data grids  available on the internets.  Its demo page http://www.trirand.com/blog/jqgrid/jqgrid.html, gives many examples of its capabilities, along with some PHP examples of the backend code used to provide data to the grids.

JQGrid is a useful as a demo of how node.js programming differs from the other frameworks because displaying a grid requires two queries, one to count the grid rows and one to actually return the data.  Only after both queries return can the JSON used to populate the grid be returned.  In this blog we will demonstrate how to do that.

Out of the box, JQGrid expects JSON with the following layout. Notice that calculating the ‘total’ attribute requires knowing the total number of rows in the table.

{
    "total": "xxx",
    "page": "yyy",
    "records": "zzz",
    "rows" : [
        {"id" :"1", "cell" :["cell11", "cell12", "cell13"]},
        {"id" :"2", "cell":["cell21", "cell22", "cell23"]},
        ...
    ]
}

In the second example provided on the JQGrid demo site, Loading Data/JSON Data, the back end code looks like this:

<?php
...
$page = $_GET['page']; // get the requested page
$limit = $_GET['rows']; // get how many rows we want to have into the grid
$sidx = $_GET['sidx']; // get index row - i.e. user click to sort
$sord = $_GET['sord']; // get the direction
if (!$sidx) $sidx = 1;
// connect to the database
$db = mysql_connect($dbhost, $dbuser, $dbpassword)
or die("Connection Error: " . mysql_error());

mysql_select_db($database) or die("Error conecting to db.");
$result = mysql_query("SELECT COUNT(*) AS count FROM invheader a, clients b WHERE a.client_id=b.client_id");
$row = mysql_fetch_array($result, MYSQL_ASSOC);
$count = $row['count'];

if ($count > 0) {
    $total_pages = ceil($count / $limit);
} else {
    $total_pages = 0;
}
if ($page > $total_pages) $page = $total_pages;
$start = $limit * $page - $limit; // do not put $limit*($page - 1)
$SQL = "SELECT a.id, a.invdate, b.name, a.amount,a.tax,a.total,a.note FROM invheader a, clients b WHERE a.client_id=b.client_id ORDER BY $sidx $sord LIMIT $start , $limit";
$result = mysql_query($SQL) or die("Couldn t execute query." . mysql_error());

$responce->page = $page;
$responce->total = $total_pages;
$responce->records = $count;
$i = 0;
while ($row = mysql_fetch_array($result, MYSQL_ASSOC)) {
    $responce->rows[$i]['id'] = $row[id];
    $responce->rows[$i]['cell'] = array($row[id], $row[invdate], $row[name], $row[amount], $row[tax], $row[total], $row[note]);
    $i++;
}
echo json_encode($responce);

In short, here is what the code does:

 

  1. extracts the variables showing current page, rows per page, index column and sort order from the call
  2. runs an SQL to get total number of rows in dataset
  3. runsanother SQL to get the actual data
  4. create the default JSON string that JQGrid expects

This code is straightforward, and can be very easily translated into any language that can handle a web request. Translating this into a Java servlet, or any of the microframeworks above is very similar, and differ only in language syntax.

A simple translation  will not work in node.  The key difference is how database calls are made. In the examples above, the calls are all synchronous, so we can safely do code like:

  1. Call SQL statement 1
  2. Call SQL statement 2
  3. Process statement results

Because node is asynchronous a straight translation won’t work.  We have to do something to make sure that the statements have completed before we can process the statement results.

At a high level, each query needs to be passed the function to perform when it completes — this is the callback function. Node will run the query asynchronously, will go on to do other things until the query completes, and on completion will finally then execute the callback function. This all requires a different way of programming.

Here are four ways we can program node.js to generate the JQuery JSON:

  1. Force the steps to run serially, pass each step’s data to the next step
  2. Use simple callbacks to run the queries concurrently and process results when they are finished
  3. Use the async library
  4. Use promises (my favorite)

Approach 1: Serial execution through callbacks

One approach — the one the behaves most similarly to the above, is to have the each call back invoke the next call back when it finishes, and pass the result sets through the chain until it is ready to be processed.  Here’s what this looks like:

/* GET users listing. Run queries serially */
router.get('/cpsSerial', function (req, res, next) {
    let page = req.query.page ? req.query.page : 1;
    let rows = req.query.rows ? req.query.rows : 10;
    let sidx = req.query.sidx ? req.query.sidx : 'id';
    let sord = req.query.sord ? req.query.sord : 'asc';
    let count = 0;

    let first = req.query.first ? req.query.first : '%';
    let last = req.query.last ? req.query.last : '%';

    function countQuery(err, first, last, page, rows) {
        if (err) return dataQuery(err, null, null, null, null);
        db.get().query('SELECT count(*) as count ' +
            '           FROM Users' +
            '           WHERE first like ? and last like ?',
            [first, last],
            (err, result) => {
                if (err) {
                    return dataQuery(err, null, null, null, null);
                }
                ;
                dataQuery(err, first, last, result, page, rows)
            });
    }

    function dataQuery(err, first, last, countData, page, rows) {
        if (err) return returnData(err, null, null, null, null);
        db.get().query('SELECT * ' +
            '           FROM Users' +
            '           WHERE first like ? and last like ?',
            [first, last],
            (err, result) => {
                if (err) {
                    return returnData(err, null, null, null, null);
                }

                returnData(err, countData, result, page, rows)
            });
    }

    function returnData(err, countData, recordData, page, rows) {
        if (err) {
            res.send(err);
            return;
        }

        console.log(countData);
        console.log(recordData);
        console.log(`page: ${page}`)
        console.log(`rows: ${rows}`)
        let count = countData[0].count;
        console.log(`count: ${count}`)
        let retval = {};
        retval.page = page;
        retval.total = Math.floor(count / rows) + 1;
        retval.records = count;
        retval.rows = recordData;
        res.send(JSON.stringify(retval));
    }


    let result = countQuery(null, first, last, page, rows);
})

The mainline starts by calling countQuery, which queries for the count and  passes the resuolt to dataQuery.  dataQuery queries the actual data and passes that, along with the count result, to returnData, which builds the JSON and returns the value to the caller.

This approach works but it is not a good use of node.js.  Besides all the hardcoding to control execution sequence, this is forcing a sequence on the queries which we don’t need:  we should be able to run the count and the data queries in parallel, and process only when all results are available.

There are at least three ways to accomplish it:

  1. Code it by hand in Javascript,
  2. Use the async library
  3. Use promises.

Approach 2: Parallel execution using Javascript and callbacks

One way is to use callbacks.  This would work something like this:

  1. Create a queue of the statements to be run
  2. Run the statements
  3. Have each statement do a callback which removes itself from queue, checks if queue is empty
  4. If queue is empty, process statement results

Beyond the concept of callbacks (which is central to node), this approach is initially attractive because it doesn’t require any dependencies, but it can easily explode in terms of complexity as callbacks beget callbacks beget callbacks.  This has its own name: “callback hell”.

This approach also presents challenges in terms of error handling:  because the asynchronous calls have their own call stacks, throwing an error doesn’t work.  Errors have to be passed along.  The most common approach is to pass them along until the final callback function, which essentially does the work of a catch block.

Another challenge in this particular example, is that we have to provide some sort of scope where the results from the different queries can be stored and aggregated.  Since an asynchronous call does not return a value to the calling function, we need somewhere to hold the results of each query.  The most obvious place would be a global variable, but that has problems of its own.

In the code below, we will be using Javascript closures as a means of storing the value.   As each query is processed, it requests a new callback function, which is generated by the makeCallback() function within the fork() function.  Because of Javascript’s use of closures, the all_results variable, which is defined in fork(), will be accessible to the generated callback functions, which will be able to use it to store the intermediate query values.

/* GET users listing. Uses the async module */
router.get('/cpsParallel', function (req, res, next) {
    let page = req.query.page ? req.query.page : 1;
    let rows = req.query.rows ? req.query.rows : 10;
    let sidx = req.query.sidx ? req.query.sidx : 'id';
    let sord = req.query.sord ? req.query.sord : 'asc';
    let count = 0;

    let first = req.query.first ? req.query.first : '%';
    let last = req.query.last ? req.query.last : '%';

    // thanks to slebetman from http://stackoverflow.com/questions/4631774/coordinating-parallel-execution-in-node-js
    function fork(async_calls, shared_callback) {
        var counter = async_calls.length;
        var all_results = [];

        function makeCallback(index) {
            return function () {
                counter--;
                var results = [];

                for (var i = 1; i < arguments.length - 1; i++) {
                    console.log(`i: ${i}.  arguments: ${JSON.stringify(arguments[i])}`)
                    results.push(arguments[i]);
                }

                all_results[index] = results;
                if (counter == 0) {
                    shared_callback(all_results);
                }
            }
        }

        for (var i = 0; i < async_calls.length; i++) {
            async_calls[i](makeCallback(i));
        }
    }

    function query1(callback) {
        let first = '%';
        let last = '%';
        db.get().query('SELECT count(*) as count ' +
            '           FROM Users' +
            '           WHERE first like ? and last like ?',
            [first, last],
            callback,
            (err, result) => {
                if (err) console.log("Error in query2");
            });
    }

    function query2(callback) {
        let first = '%';
        let last = '%';
        db.get().query('SELECT * ' +
            '           FROM Users' +
            '           WHERE first like ? and last like ?',
            [first, last],
            callback,
            (err, result) => {
                if (err) console.log("Error in query2");
            });
    }

    function returnJSON(result) {
                console.log(`result: ${JSON.stringify(result)}`)
                let count = result[0][0][0].count;
                let data = result[1][0];
                let retval = {};
                retval.page = page;
                retval.total = Math.floor(count / rows) + 1;
                retval.records = count;
                retval.rows = data;
                res.send(JSON.stringify(retval));
    }
})

For someone coming the Javascript, this use of closures is novel and perhaps a little difficult to follow.  And the whole book keeping process of tracking which functions have completed is something that could be, and has been encapsulated.  We will see an approach to that next.

 

Approach 3: Using the Async module

The Async module was written to resolve many of the challenges of asynchronous node.js programming.  We can use it to simplify the prior example like so:

/* GET users listing. Uses the async module */
router.get('/async', function (req, res, next) {
    let locals = {};

    let page = req.query.page ? req.query.page : 1;
    let rows = req.query.rows ? req.query.rows : 10;
    let sidx = req.query.sidx ? req.query.sidx : 'id';
    let sord = req.query.sord ? req.query.sord : 'asc';
    let count = 0;

    let first = req.query.first ? req.query.first : '%';
    let last = req.query.last ? req.query.last : '%';

    async.parallel([

            function query1(callback) {
                let first = '%';
                let last = '%';
                db.get().query('SELECT count(*) as count ' +
                    '           FROM Users' +
                    '           WHERE first like ? and last like ?',
                    [first, last],
                    callback,
                    (err, result) => {
                        if (err) return console.log("Error in query2");
                        locals.count = result;
                    });
            },

            function query2(callback) {
                let first = '%';
                let last = '%';
                db.get().query('SELECT * ' +
                    '           FROM Users' +
                    '           WHERE first like ? and last like ?',
                    [first, last],
                    callback,
                    (err, result) => {
                        if (err) return console.log("Error in query2");
                        locals.data = result;
                    });
            }
        ],

        function returnJSON(result) {
            console.log(`result: ${JSON.stringify(result)}`)
            let count = result[0][0][0].count;
            let data = result[1][0];
            let retval = {};
            retval.page = page;
            retval.total = Math.floor(count / rows) + 1;
            retval.records = count;
            retval.rows = data;
            res.send(JSON.stringify(retval));
        });

})

Approach 4: Using Promises

The async library works well, and resolves many of the issues, but there is another approach as well.  We can keep promises.

Promises are essentially placeholders for results.

So, in our example, we want to say:

  1. We promise we will get a result for statement 1
  2. We promise we will get a result for statement 2
  3. When both of these promises are kept, we will process the statementsThe actual code cleanly reflects this approach:
/* GET users listing. Uses promises */
router.get('/', function (req, res, next) {
    let page = req.query.page ? req.query.page : 1;
    let rows = req.query.rows ? req.query.rows : 10;
    let sidx = req.query.sidx ? req.query.sidx : 'id';
    let sord = req.query.sord ? req.query.sord : 'asc';
    let count = 0;

    let first = req.query.first ? req.query.first : '%';
    let last = req.query.last ? req.query.last : '%';

    // get total number of rows in dataset
    let prm1 = new Promise((resolve, reject) => {
        db.get().query('SELECT count(*) as count ' +
            '           FROM Users' +
            '           WHERE first like ? and last like ?',
            [first, last],
            (err, result) => {
                if (err) {
                    reject(err)
                }
                resolve(result)
            })
    })

    // get user data
    let prm2 = new Promise((resolve, reject) => {
        db.get().query('SELECT * ' +
            '           FROM Users' +
            '           WHERE first like ? and last like ?',
            [first, last],
            (err, result) => {
                if (err) {
                    reject(err)
                }
                resolve(result)
            })
    })

    Promise.all([prm1, prm2])
        .then((values) => {
            let count = values[0][0].count;
            let retval = {};
            retval.page = page;
            retval.total = Math.floor(count / rows) + 1;
            retval.records = count;
            retval.rows = values[1];
            res.send(JSON.stringify(retval));
        })
        .catch(err => {
            console.log(`Stuff broke ${err}`)
        })

}

Some useful references:

Node.js Design Patterns – Second Edition Kindle Edition by Mario Casciaro