Category Archives: Open source

Machine reading of crossword grid

Having spent some time over the weekend entering clues, solutions and breakdowns of some sample crosswords for the hints database web application that I’m occasionally working on, and really struggling with it taking 30 – 45 minutes to add a complete grid, I hit upon the idea of writing a program that could scan a photo of the grid, determine the grid layout (i.e., identify the positions of the across and down clues) and then use OCR to detect the solution.

Just as a diversion, you understand. Probably won’t get very far. And I’ve still got a bunch of more relevant feature requests to be working on.

But it does mean that I can do a few blog posts with some cool sounding transforms and other mathematical wizardry without really knowing what I’m talking about (perhaps, I know people that have done this as a career, perhaps not).

OpenCV has Python bindings and some good tutorials, so it might be worth exploring for a while. And probably not as daft as trying to use image recognition in the garden for a smart scarecrowmagpie.

Python Flask pagination

Flask pagination

The crossword hints Flask application I have been writing has been expanding, and I have taken some time to add more clues and solutions, such that there is now some need to break up the index list into more manageable chunks using pagination.

I’ve been considering a couple of options:

  • the flask-paginate package, which is modelled on Rails’ will_paginate (which I have used in the past);
  • a Flask snippet from Armin Ronacher, which on the face of it looks like it will be harder to implement;
  • a hand-rolled solution, but why re-invent the wheel?


Flask-paginate should be the preferred solution because it reduces the amount of coding required in the application, but I had a serious problem when it came to rendering the pagination: it’s missing much of the necessary CSS to display the page listing.

At first, the page listing just appeared as a bulleted list. It turns out that Flask-paginate uses a CSS framework called Bootstrap, but the documentation doesn’t include any real mention of it, let alone describe it as essential.

StackOverflow responses to Flask-paginate styling problems just said to install the Bootstrap CSS files, but when I tried this it completely mangled much of the other navigation styling I already had. I hate messing around with CSS at the best of times so there’s no way I’m wasting time fiddling around with minimised 3rd-party dependent CSS. Time to ‘git checkout -- ’ the files I’d been working on.

Snippet 44

I was wary of attempting to implement the suggestion in the snippet, not because I thought it wouldn’t be any good (total regard for Armin’s code), but because of my lack of confidence working with what looked like more advanced Python and Flask, and doubts over whether my ropey application could support it being added without a complete rewrite. I need not have worried.

Anyway, the task breaks down into the following stages:

  • a pagination class,
  • a view helper,
  • routing and rendering: URLs and views,
  • CSS to support a reasonable page layout.

Pagination class

First off we need a class to contain properties and methods needed to describe the pagination:

  • total number of records,
  • items per page,
  • number of pages,
  • whether the current page needs next and/or previous links,
  • an iterator over the pages; I don’t think I would have been able to write this so concisely, so definitely good value here.

from math import ceil


class Pagination(object):
    def __init__(self, page, per_page, total_count):
        self.page = page
        self.per_page = per_page
        self.total_count = total_count

    @property
    def pages(self):
        # Total number of pages, rounding up for a partial last page
        return int(ceil(self.total_count / float(self.per_page)))

    @property
    def has_prev(self):
        return self.page > 1

    @property
    def has_next(self):
        return self.page < self.pages

    def iter_pages(self, left_edge=2, left_current=2,
                   right_current=5, right_edge=2):
        # Yield page numbers to link to; None marks a skipped run (ellipsis)
        last = 0
        for num in range(1, self.pages + 1):
            if num <= left_edge or \
               (num > self.page - left_current - 1 and
                num < self.page + right_current) or \
               num > self.pages - right_edge:
                if last + 1 != num:
                    yield None
                yield num
                last = num

I had to make a tiny change to what appeared in the snippet, replacing xrange with just range.
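
To get a feel for what the iterator produces, here is a minimal standalone sketch of the same windowing logic. The function name page_window and the record counts are my own, purely for illustration; the default edge sizes are the ones used in the class above.

```python
from math import ceil


def page_window(page, per_page, total_count,
                left_edge=2, left_current=2,
                right_current=5, right_edge=2):
    """Yield page numbers to link to, with None marking a gap (ellipsis)."""
    pages = int(ceil(total_count / float(per_page)))
    last = 0
    for num in range(1, pages + 1):
        near_start = num <= left_edge
        near_current = page - left_current - 1 < num < page + right_current
        near_end = num > pages - right_edge
        if near_start or near_current or near_end:
            if last + 1 != num:
                yield None  # a run of pages was skipped
            yield num
            last = num


# 400 records at 20 per page gives 20 pages; we are viewing page 10
window = list(page_window(page=10, per_page=20, total_count=400))
print(window)
# [1, 2, None, 8, 9, 10, 11, 12, 13, 14, None, 19, 20]
```

Each None is where the template would render an ellipsis instead of a link.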

View helper

There needs to be a view helper that can be used in a template macro to generate the link to the other pages.

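from flask import request, url_for

def url_for_other_page(page):
    # Rebuild the current URL with only the page argument changed
    args = request.view_args.copy()
    args['page'] = page
    return url_for(request.endpoint, **args)

# Make the helper callable from any Jinja template
application.jinja_env.globals['url_for_other_page'] = url_for_other_page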

Although this has been added to my main application file it is something that can be relocated somewhere more sensible when refactoring work starts; registering the helper with Jinja this way is certainly something I can try with future projects so this snippet is teaching more than just pagination. Win-win.

Routing and rendering

The trickiest part of the process is probably changing the index route to support displaying a particular page of content.

Default and paged indexes

The easiest way to drive the pagination routing is to associate a page number with the route:

@application.route("/crossword-solutions/", defaults={'page': 1})
def crossword_solution_index(page):

The index routing needs to include:

  • a count of the number of items,
  • a query that only selects the necessary records for display,
  • a check to capture out-of-range page requests,
  • a pagination instance, created and passed to the template to render.

This gives an index router like:

    count =
    offset = ((int(page)-1) * PER_PAGE)
    solutions = crossword_solutions.raw("""
        SELECT as setter, cs2.solution AS solution, cs2.rowid AS csid, AS soltype
        FROM crossword_setters cs1
        INNER JOIN crossword_solutions cs2
        ON cs1.rowid = cs2.crossword_setter_id
        INNER JOIN solution_types st1
        ON cs2.solution_type_id = st1.rowid
        ORDER BY cs2.solution
        LIMIT %s, %s""" % (offset, PER_PAGE))
    # Return an error page (status 409) for an out-of-bounds request
    if not solutions and page != 1:
        return render_template('errors/409.html', errmsg="Requested page out of bounds"), 409
    pagination = Pagination(page, PER_PAGE, count)

Although I’m using Peewee as the ORM, I’ve found that queries like this one work better as plain SQL, particularly with the offset and limit. What we notice here is that the pagination has no connection at all with the database result set, unlike, say, will_paginate, but still doesn’t lose any functionality.
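
The offset arithmetic and the out-of-range check are easy to try in isolation. The following is only a sketch using the standard library sqlite3 module with an in-memory database; the solutions table, the fetch_page function and the sample rows are made up for the example and are not the application’s schema. It also binds the limit and offset as query parameters rather than formatting them into the SQL string, which is a swap on my part.

```python
import sqlite3

PER_PAGE = 3  # illustrative page size


def fetch_page(conn, page, per_page=PER_PAGE):
    """Return (rows, total_count) for a 1-based page, or None if out of range."""
    total = conn.execute("SELECT COUNT(*) FROM solutions").fetchone()[0]
    offset = (int(page) - 1) * per_page
    rows = conn.execute(
        "SELECT solution FROM solutions ORDER BY solution LIMIT ? OFFSET ?",
        (per_page, offset),
    ).fetchall()
    if not rows and page != 1:
        return None  # the caller can render its error page here
    return [r[0] for r in rows], total


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE solutions (solution TEXT)")
words = ["ALPHA", "BRAVO", "CHARLIE", "DELTA", "ECHO", "FOXTROT", "GOLF"]
conn.executemany("INSERT INTO solutions VALUES (?)", [(w,) for w in words])

print(fetch_page(conn, 1))  # (['ALPHA', 'BRAVO', 'CHARLIE'], 7)
print(fetch_page(conn, 3))  # (['GOLF'], 7)
print(fetch_page(conn, 4))  # None
```

The total returned alongside the rows is what would be handed to Pagination as total_count.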

Pagination macro

I’m still not very experienced with Jinja2 macros so I really appreciated the sample provided by the snippet. I modified it slightly to include a previous page link and to display next and previous as greyed-out boxes on the last and first pages respectively.

{% macro render_pagination(pagination) %}

<br />
{% endmacro %}

This is added to an existing macros.html template file.

Template rendering

The macros file is imported into a high-level layout template as

{%- import "macros.html" as f -%}

and child templates apply the pagination with,

{{ f.render_pagination(pagination) }}

Pagination stylesheet

I dislike HTML layout in general and loathe CSS in particular, so I will settle for whatever gives a reasonable look despite any ‘obvious’ inefficiencies; any prolonged CSS development will inevitably provoke much cussing and cursing. I have based the styling for my current project around ‘digg_pagination’ styles I came across for old Rails projects with as few tweaks as possible; I really don’t know why some settings get included from some sections and not others.

The styling boils down to the following sections (I include it only as an example and not a recommendation; no-one in their right mind would trust me to design CSS):

.pagination {
    margin: 10px 0;
    margin-top: 0;
    margin-bottom: 0;
    font-size: 0.7em;
}
.pagination ul li {
    border-color: #105cb6;
    border-width: 0 0 1px;
    border-radius: 0;
    float: left;
    list-style-type: none;
}
.pagination ul li span.nolink {
    padding: 2px 10px 2px 10px;
    display: block;
    /* float: left; */
    margin-right: 1px;
    font-size: 8pt;
    font-weight: bold;
    border: 1px solid #9aafe5;
    color: #D5D5D5;
}
.pagination a, .pagination ul li span {
    padding: 2px 10px 2px 10px;
    display: block;
    /* float: left; */
    margin-right: 1px;
    font-weight: bold;
    color: #105cb6;
    border: 1px solid #9aafe5;
}
.pagination ul li span.ellipsis {
    /* font-size: 10pt; */
    font-weight: normal;
    padding: 2px;
    margin: 1px;
    border-color: #fff;
}

Which I think gives a reasonable look to the pagination with greyed out boxes for the prev and next links on the first and last pages and clearly indicates the active page.


I highly recommend the snippet as a great place to start if looking at pagination for a Flask application; in general, Armin’s posts here are top-notch and very informative.

htpasswd without Apache

If you want to restrict access to website content under nginx but don’t want to install Apache, use the following to generate an htpasswd file:

printf "USER:$(openssl passwd -apr1 P@55w0rd)\n" > /etc/nginx/auth/htpasswd

Then use the following nginx rules to protect the location:

location /api {
    satisfy all;    

    deny  all;

    auth_basic           "Administrator’s Area";
    auth_basic_user_file /etc/nginx/auth/htpasswd;
}


Git: stop tracking a tracked file

There are times when a project needs to include the default version of a file in the git repository that will subsequently change to support development (e.g., secret application key or sqlite3 database).

After committing the safe default copy, any subsequent changes to the files will appear as modified in ‘git status’ reports and will also prevent any git flow feature finish operations if the files are not staged for commit.

Adding the files to .gitignore makes no difference.

There is a way, however, to tell git to stop tracking the file (it is removed from the index but left on disk, so the .gitignore entry then takes effect),

$ git rm --cached file1 file2

Taken from a StackOverflow posting.

Troubleshooting lifecycle

I seem to have hit on a troubleshooting pattern for trying to get new services up and running: this time RabbitMQ.

Most of the application stuff I have been working with recently has been a disaster (and I know I’m of the opinion that most software – even, particularly, my own – is rubbish), having to abandon development with Rails and Django because of basic stuff that just doesn’t work (or fails silently). I’ve been following some message queue posts and, rather than sign up for a CloudMQTT account, I thought I’d install RabbitMQ locally. It can’t be that hard.

The web pages for RabbitMQ aren’t terribly inviting and I immediately suspect that there will be some winging it.

I installed the packages, started the service and tried the rabbitmqctl command to see how things were going:

# rabbitmqctl status
Status of node rabbit@fnunbob ...
Error: unable to connect to node rabbit@fnunbob: nodedown


attempted to contact: [rabbit@fnunbob]

 * connected to epmd (port 4369) on fnunbob
 * epmd reports: node 'rabbit' not running at all
 no other nodes on fnunbob
 * suggestion: start the node

current node details:
- node name: 'rabbitmq-cli-97@fnunbob'
- home dir: /root
- cookie hash: 63+eNTCkMJ1cMwrAcJ88rg==

Now, first off, ‘* suggestion: start the node’. What’s all that about! I thought I had started the node; there’s nothing I could find (easily) on the website to suggest anything else. Perhaps provide a clue on how to start the node!

Okay, so let’s try starting the node:

# rabbitmqctl start_app
Starting node rabbit@fnunbob ...
Error: unable to connect to node rabbit@fnunbob: nodedown


attempted to contact: [rabbit@fnunbob]

 * connected to epmd (port 4369) on fnunbob
 * epmd reports node 'rabbit' running on port 25672
 * TCP connection succeeded but Erlang distribution failed

* Authentication failed (rejected by the remote node), please check the Erlang cookie

current node details:
- node name: 'rabbitmq-cli-75@fnunbob'
- home dir: /root
- cookie hash: 63+eNTCkMJ1cMwrAcJ88rg==

Huh? I’m beginning to think that RabbitMQ is yet another piece of crap software that just doesn’t work out of the box: netstat and ps show plenty of rabbitmq processes running. But I’ll try some troubleshooting to figure out what I’ve done wrong.

Now, my troubleshooting pattern is to search (Google – expect Facebook adverts for message queue services in a few days) for the application and problem and ignore the links to the application vendor and start with the first StackOverflow page.

This leads me to a StackOverflow answer, and even the link is promising. And sure enough, it mentions the rabbitmq-server command, so we give it a go.

# rabbitmq-server
RabbitMQ 3.6.9. Copyright (C) 2007-2016 Pivotal Software, Inc.
 ## ## Licensed under the MPL. See
 ## ##
 ########## Logs: /var/log/rabbitmq/rabbit@fnunbob.log
 ###### ## /var/log/rabbitmq/rabbit@fnunbob-sasl.log
 Starting broker...
 completed with 0 plugins.

Maybe something’s happening, maybe not: CTRL-C; man rabbitmq-server. There’s a ‘-detached’ option. Ah, okay, let’s try that.

# rabbitmq-server -detached
Warning: PID file not written; -detached was passed.

So, let’s see if that makes a difference.

# rabbitmqctl status
Status of node rabbit@fnunbob ...
 {ranch,"Socket acceptor pool for TCP protocols.","1.3.0"},
 {ssl,"Erlang/OTP SSL application","8.2"},
 {public_key,"Public key infrastructure","1.4.1"},
 {asn1,"The Erlang ASN1 compiler version 5.0","5.0"},

That’s more like it.

The important thing here is that StackOverflow is more useful for working with applications than the application documentation itself. Because there are so many things that go wrong with modern software, and they likely won’t be mentioned in the official docs, SO catches all the efforts to fix things.

P.S. This is one of the most annoying things about Puppet. They must have a deal with Google to make sure only official documentation is returned, it links to the most recent puppet version (and enterprise to boot) and selecting from the version dropdown takes you to a 404 page: Just show the SO pages where people have fixed stuff and be done with it.

Fedora 23: gem install mysql2

I’m having another blast at preparing PDF documents from a database repository of infrastructure assets which requires the use of the mysql2 gem on Fedora 23.

Now, I have had many years’ experience building this stuff from source and even though Ruby has a reputation for being difficult to work with, this time it’s definitely Fedora that’s being unbearable.

Installing the mysql2 gem should be a simple matter of,

gem install mysql2

But not when you get this error:

checking for ruby/thread.h... *** extconf.rb failed ***

With extra advice about probably missing developer tools or libraries. I have the compiler, mariadb-devel and ruby-devel packages installed, everything that’s required to build the gem, but still no good.

I eventually found the mkmf.log record mentioned in the error output which contained something I’d not seen before:

error: /usr/lib/rpm/redhat/redhat-hardened-cc1: No such file or directory

Searching online for this turned up the simple solution, which is to run,

dnf install redhat-rpm-config
gem install mysql2 -v '0.3.16'

And we’re done.

Who’d have thought that an rpm-config package would be a pre-requisite for installing Ruby gems? And yet another example of having to spend an hour fixing numerous tedious problems and sub-problems introduced by system developers rather than being able to get on with the task in hand.

Arch Linux MySQL (MariaDB) startup

Posted as a quick note-to-self following a delay getting a database up and running for a rehash of an old Rails project.

After installing the mariadb package on Arch, the database needs to be initialised before starting,

# mysql_install_db --user=mysql --basedir=/usr --datadir=/var/lib/mysql

After this, the database server can be started with,

# systemctl start mysqld.service

(Still can’t say that I’m happy with systemd; I just can’t see what problem it’s trying to solve.) An online guide provided the much-appreciated help and guidance in this case.

Gramofile reborn

Was intending to spend the day having fun with Docker on my Arch desktop – the r-pi version is still on hold until I can get a 16GB image file on to the SD card; 24 hours is too long when trying to copy the file over the wireless network – but I got distracted trying to sort out some old clutter from early this century (seriously).

I found a copy of a program I used many moons ago to process some of the digital recordings I had done of some of my old vinyl (prior to processing using Audacity).

Gramofile is a curses-based program that attempts to split up a recording into different tracks by looking for blocks of silence; it does a reasonable job and its estimates can easily be tweaked.

A quick recompile and it was starting up but was having problems recognising the WAV files I have. I knew it was Gramofile’s problem because I was able to use an old program I started years ago to convert WAV files to ZX Spectrum tzx format (yes, I still have plenty of speccy tapes) and it was able to identify the relevant header records.

So, I have spent all day hacking some test programs to get a reasonable header processor going for WAV files. The main problem, I suspect, is that the code is 32-bit and some of the buffer manipulation looked a wee bit odd. But because the header record sizes are fixed I decided to do an explicit byte-for-byte copy from the buffer to the header struct,

 memcpy ( &wavhd.main_chunk, hd_buf, 4 );
 memcpy ( &wavhd.length, hd_buf+4, 4 );
 memcpy ( &wavhd.chunk_type, hd_buf+8, 4 );
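
For comparison, the same fixed-size read can be sketched in Python with the struct module. This is only an illustration and not part of Gramofile: the function name parse_riff_header is my own, the field names follow the C struct above, and the sample header is built in memory rather than read from a real recording.

```python
import struct

# 4-byte tag, 4-byte little-endian unsigned length, 4-byte type: 12 bytes
RIFF_HEADER = struct.Struct("<4sI4s")


def parse_riff_header(buf):
    """Unpack the first 12 bytes into (main_chunk, length, chunk_type)."""
    if len(buf) < RIFF_HEADER.size:
        raise ValueError("need at least %d bytes" % RIFF_HEADER.size)
    main_chunk, length, chunk_type = RIFF_HEADER.unpack_from(buf, 0)
    if main_chunk != b"RIFF" or chunk_type != b"WAVE":
        raise ValueError("not a RIFF/WAVE header")
    return main_chunk, length, chunk_type


# Build a fake 12-byte header for the demonstration
sample = b"RIFF" + struct.pack("<I", 36) + b"WAVE"
print(parse_riff_header(sample))
# (b'RIFF', 36, b'WAVE')
```

Spelling out the byte order and field widths in the format string avoids depending on how the compiler or platform lays out the struct, which is the same idea as the explicit memcpy offsets.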

Another quick recompile and we’re in business big-time! A run through the first side of Psychocandy picked up 5 out of 7 tracks, just missing a couple of short silences between tracks; the missing starts and ends can easily be added to the .tracks file; just remember to adjust the ‘Number_of_tracks’ setting!

Now, I just wish there was a quick way to generate the CD text data when ripping the tracks to a CD as this is a right pain with the burning tools, but all in all a good day’s work and I have a fair few album recordings to catch up on.

A WAV format reference was really useful in helping me make sure that the correct fields and their sizes were being used in the WAV header.

More Desktop Linux Loathing

Have found an old 80GB iPod classic that has about 31GB of orphaned tracks on it that can be copied to the local disk.

I’m using the guide at

But when it comes to syncing the music back to the iPod, things get nasty. So far I have tried Banshee (crashes with content sync); Rhythmbox (possibly the worst designed interface of any application, sorry, but it’s awful); Amarok (really confusing interface, can’t get past the empty ‘transcode’ dropdown list before initializing the iPod, and hard to tell whether it’s actually doing anything); gtkpod (the less said the better, can’t detect the iPod); floola and yamipod (packages don’t exist).

Of the bunch, Banshee is the only application that actually makes an attempt to sync content to the device but appears to have just too many errors before the segfault.

Even with all these problems, at least the device is actually detected; my Android phone is a complete blank in Antergos despite lots of MTP shenanigans. Given the number of options listed on the help pages it seems fairly obvious that this is all wing-and-a-prayer stuff.

I’m not saying that all this should be easy, but it shouldn’t be this difficult.

Update: by one means or another, gtkpod managed to copy 74 songs to the iPod. And having made it writeable, Amarok has a sync option to just copy the files rather than transcode them (which is what does for Banshee); all looking quite promising. But I really wish I knew what it is that I have done to get to this point: it’s all well and good Amarok grabbing the lyrics to songs, but I do wish it had a progress bar telling me how far through copying the 3100 songs it has got.

Enabling webcam on Fedora 20

Decided to try and be brave with grabbing a photo to upload to the address book.

dmesg was reporting the following error:

[ 12.101118] uvcvideo: Found UVC 1.00 device <unnamed> (05ca:1839)
[ 12.101567] uvcvideo: UVC non compliance - GET_DEF(PROBE) not supported. Enabling workaround.
[ 12.101942] uvcvideo: Failed to query (129) UVC probe control : -32 (exp. 26).
[ 12.101945] uvcvideo: Failed to initialize the device (-5).

A quick G-search for ‘uvcvideo sony’ turned up a fix: installing the libusb-devel.i686 and glib-devel.i686 packages (I’m on a 32-bit laptop) followed by,

r5u87x-loader --reload

does the trick. Installed cheese to grab the image from the webcam.

Note, however, that the following would probably have been a bit simpler,

# yum search uvcvideo
Loaded plugins: langpacks
============================ N/S matched: uvcvideo =============================
libwebcam.i686 : A library for user-space configuration of the uvcvideo driver