Introduction to time-boxed programming
I'd like to take a brief moment to introduce the reader to the concept of time-boxed programming. It's not a hard concept to understand or use, but it is an interesting one. Basically, given some end goal or implementation, you either set the number of hours you want to spend on a project, or you are handed a fixed amount of time to complete it, leaving you with a very real deadline and time as the utmost constraint. Where do you start with a greenfield project, with a time constraint, and no clear direction?
Most people will start in the familiar, and in that same vein, this website was born of Python, Flask, and markdown.
The time-boxed development in this instance came about because my partner and I were heading out to PyCon in under two weeks, and I wanted to have something out on the web for people to visit. I wanted to give more context about my creative and engineering abilities, since my personal GitHub is somewhat sparse (most of my commits are to an enterprise GitLab instance, on side projects there).
Beyond my selfishly silly desires for exposure on the web, aside from LinkedIn, Facebook, Twitter, or Google+, I had been reading a fantastic book, Soft Skills, by John Sonmez who runs Simple Programmer. After hearing John on Michael Kennedy's podcast, Talk Python to Me, I picked up John's book and went to town parsing through its contents with each chapter echoing many of my own beliefs, eventually getting through nearly half the book in the first week. One thing John highly recommended was to start a blog, and get to writing, no matter what. This was again reinforced by Jesse Davis, who also spoke on Talk Python, advocating for the same idea - developers should have a blog and share their ideas (like this one).
All the evidence was pointing to me needing a blog, and so I went to creating it. Python, Flask, and markdown were the language, framework, and data persistence concepts I wanted to work with, given my time constraints.
For those interested in the technical details, I'm going to fly through these points somewhat, linking to the relevant libraries and references as necessary. If the pace seems a bit rapid, just know I spent considerable time upfront (about 3 hours, out of 10) deciding to go the route I did: markdown, with a static flat file as a NoSQL-style data store.
This may cause a stir among those using PostgreSQL, or any other SQL server, for their persistence layer, but to me, having the physical server right next to me, and being able to SSH over to it and write content easily into the site, sounded pretty cool. Also of importance, backing up a flat JSON-encoded file isn't too bad! The format is highly portable, and a quick 'scp' gets me a copy of the dataset. So what all did it take?
First, I'd like to thank Google for building Material Design Lite, a CSS library for building modern, auto-resizing websites with few barriers to entry and use. I've used this library in the past on some USAA internal tooling, just to get it live fast and not look ugly as sin (and trust me, some of our tools are grade-A ugly). Once I knew I wanted to use MDL, I started creating Jinja templates to incorporate the pieces and parts of a layout, with a base template holding most of the static content links, script loads, and stylesheet loads, while the specific pages had their own CSS classes injected to give the appearance of interactive divs, akin to buttons or the like. Some folks may find Jinja templating, and templating in general, to be mostly disgusting and a terrible implementation, but for something this small, templating is a great tool. If my site grows in the future, I definitely will gut the frontend and replace it with something like React or Angular.
So now that I've covered markdown, let's move to Flask. If you're unfamiliar with Flask, or haven't run pip install flask yet, I highly recommend you give it a go. Flask doesn't get in the way when it comes to programming a small API, having a succinct and easy-to-navigate syntax that cleans up the API for the underlying Werkzeug operations. One of the great things about Flask is that many talented individuals have written volumes on setting up websites with Flask, including Explore Flask, a high-level reference guide to Flask, funded by the community, as well as the Python Flask Mega-tutorial by Miguel Grinberg. Suffice it to say, Flask is awesome, and you should be using it in some capacity. Pyramid and Django are great frameworks too, though their feature set was far greater than I needed at the time of creation (and my familiarity too low, admittedly).
Finally, as for the choice of Python, this was really a no-brainer for me. The brevity of the syntax, and the 'batteries included' motto really works for most of my needs, especially for a blog site.
Something I've spent a long time wrestling with is data structures and their complexities. In the past I've coded up a few arbitrary data parsers for JSON and XML, in order to parse message entities off requests without the need for JAXB-like bindings, lightening the already-heavy class load in Java; however, I also learned about the insane number of edge cases that could occur given such a tool. For Java, I ended up preferring XML as a data format, even though JSON was more user friendly and easy to parse, so the option was always left open to the user which format they wanted.
For Python, the natural choice is JSON. It fits the data model well, and the facilities for serializing and deserializing are pretty good (again, the 'batteries included' mentality of Python is excellent here). Python will read in JSON data either from an open file with json.load() or from a string with json.loads(), and provides the mirror-image functions json.dump() to serialize a Python object as JSON to an open file, and json.dumps() to serialize a Python object into a JSON string. These conversions are all done according to the standard library's JSON conversion tables, as documented for JSONDecoder and JSONEncoder. Most often, the string convenience serializers and deserializers are my go-to, since the file constraint has other consequences when it comes to testing out code. 'Surely, with such great support in the language, it is natural I just use that as a data format', I thought to myself.
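As a quick illustration of those four functions, here is a minimal sketch. The post record and its field names are made up for the example and are not the actual data file from this site; io.StringIO stands in for an open file so nothing touches disk.

```python
import io
import json

# Hypothetical post record; the field names are assumptions for illustration.
post = {"id": 1, "type": "blog", "title": "Hello", "tags": ["python", "flask"]}

# dumps(): Python object -> JSON string
text = json.dumps(post, sort_keys=True)

# loads(): JSON string -> Python object
restored = json.loads(text)

# dump() and load() do the same work against a file-like object.
buffer = io.StringIO()
json.dump([post], buffer, indent=2)   # serialize a list of posts "to file"
buffer.seek(0)                        # rewind before reading it back
posts_from_file = json.load(buffer)   # deserialize "from file"
```

The string variants are handy in tests precisely because no file handle is involved.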
What does that mean for the data structure? Well, I'd need to bucket my posts according to what they actually correspond to, so far only having 'project' and 'blog' in the data file. This allows me to use standard practices such as list comprehensions over the dataset to aggregate whichever subset of data I am trying to retrieve, whether it be blog posts, or project posts, or whatever 'type' I want to add in later (maybe 'recipe', or 'group buy', etc). This makes my data highly flexible, and down the road, if the data structure becomes unwieldy and I end up loading this dataset into Couchbase, I'd already have a possible index to work off of.
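A minimal sketch of that bucketing, assuming each stored post is a dict with a 'type' field. The sample records and the helper name posts_of_type are hypothetical, not taken from the site's code.

```python
# Hypothetical dataset: one flat list, each entry tagged with a 'type'.
POSTS = [
    {"id": 1, "type": "blog", "title": "Time-boxed programming"},
    {"id": 2, "type": "project", "title": "Home lab"},
    {"id": 3, "type": "blog", "title": "Flat files as a data store"},
]


def posts_of_type(posts, post_type):
    """Return only the posts whose 'type' field matches post_type."""
    return [p for p in posts if p.get("type") == post_type]


blog_posts = posts_of_type(POSTS, "blog")
project_posts = posts_of_type(POSTS, "project")
recipes = posts_of_type(POSTS, "recipe")  # a type nobody has used yet
```

Adding a new type later needs no schema change: a new value in the 'type' field is enough, and an unused type simply yields an empty list.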
With the dataset kept unified, an ID was also necessary to keep things orderly. In this way, the dataset flows in one direction, with the ID incrementing as the dataset expands. Now, if a user tries to go to a blog post URL whose post ID has a 'type' of 'project', it will in fact return a silly error page, indicating you've gone somewhere you shouldn't have, 'No blah exists with blah ID blah!', which is a simple check on both the 'id' and 'type' fields for validation. I'm fine with managing these endpoints, as they are configured parameters on the data objects I'm storing - each is loaded dynamically into the template via this JSON object structure I've been describing.
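The check on both fields can be sketched roughly as follows. The function names, the sample data, and the exact wording of the error message are assumptions for illustration, and the real site returns an error page rather than raising an exception.

```python
# Hypothetical dataset sharing one ID sequence across all post types.
POSTS = [
    {"id": 1, "type": "blog", "title": "Time-boxed programming"},
    {"id": 2, "type": "project", "title": "Home lab"},
]


def find_post(posts, post_id, post_type):
    """Return the post matching BOTH id and type, or raise LookupError."""
    for post in posts:
        if post["id"] == post_id and post["type"] == post_type:
            return post
    raise LookupError(f"No {post_type} exists with ID {post_id}!")


def next_id(posts):
    """IDs only ever grow: one more than the largest ID in use."""
    return max((p["id"] for p in posts), default=0) + 1


found = find_post(POSTS, 1, "blog")
try:
    find_post(POSTS, 2, "blog")  # ID 2 exists, but it is a project
    error = None
except LookupError as exc:
    error = str(exc)
```

Because the ID sequence is shared, a valid ID under the wrong type is treated the same as a missing post.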
The beauty and elegance of this data structure is that, say, I want to add some new field to the templates and feed it in on my JSON object. All that's needed for this expansion is an additional field on my stored object, to be picked up by Jinja. This is done by exposing the whole Python object to the Jinja templating engine when calling render_template() from an endpoint in Flask (like this), instead of passing just a piece of the object to the templates. Either approach is fine, but having the additional parameter on render_template() with the entire object is really not too harmful and extremely useful in these instances. For now, I am selectively exposing the full object to Jinja, depending on the needs of the page being rendered.
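Here is a small sketch of handing the whole object to Jinja, assuming Flask is installed. The route, the sample post, and the inline template are hypothetical; render_template_string() is used in place of render_template() only so the example fits in one file without a templates/ folder.

```python
from flask import Flask, abort, render_template_string

app = Flask(__name__)

# Hypothetical stored object; 'subtitle' plays the role of a newly added field.
POSTS = [
    {"id": 1, "type": "blog", "title": "Hello, world",
     "subtitle": "A first post", "body": "Some content."},
]

# Inline stand-in for a template file. Jinja resolves post.title against
# the dict's keys, so new fields are available without touching the view.
POST_TEMPLATE = (
    "<h1>{{ post.title }}</h1>"
    "{% if post.subtitle %}<h2>{{ post.subtitle }}</h2>{% endif %}"
    "<p>{{ post.body }}</p>"
)


@app.route("/blog/<int:post_id>")
def blog_post(post_id):
    post = next(
        (p for p in POSTS if p["id"] == post_id and p["type"] == "blog"),
        None,
    )
    if post is None:
        abort(404)
    # Pass the entire object, not individual fields.
    return render_template_string(POST_TEMPLATE, post=post)
```

With this shape, adding a field means editing the stored object and the template; the view function stays as it is.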
Other data considerations
When I set out making the website, I didn't really consider one fact, that, in hindsight, is actually a huge benefit. If I go about loading up data in this way, in a static-content-like structure, my website will be incrementally faster to the user as they navigate the pages, given the right expiration on content is set in the HTTP responses. After publishing a post, it is highly unlikely that content will change much over time.
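To make 'the right expiration' concrete, here is a minimal standard-library sketch of the response headers involved. The helper name cache_headers and the one-day lifetime are assumptions; how the headers get attached to a response depends on the server or framework in use.

```python
import time
from email.utils import formatdate


def cache_headers(max_age_seconds, now=None):
    """Build HTTP caching headers for content that rarely changes.

    'now' is a Unix timestamp; it defaults to the current time and is
    a parameter only so the result can be checked deterministically.
    """
    now = time.time() if now is None else now
    return {
        # Modern clients honor max-age; Expires is the older equivalent.
        "Cache-Control": f"public, max-age={max_age_seconds}",
        "Expires": formatdate(now + max_age_seconds, usegmt=True),
    }


one_day = cache_headers(24 * 60 * 60)
```

A published post could carry a long lifetime like this, while an index page that changes with every new post would want a much shorter one.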
For the same performance concerns, the MDL libraries and fonts are self-hosted next to the content (I claim no ownership), so that the client isn't waiting on requests from Google's server for the page to load, reducing 'drag' on the client's computer (or smartphone, or tablet). This also helps me 'freeze' the library against any changes that I'm not prepared for, though the library has been very stable for as long as I've used it (poor Angular users... those major releases in 2017/2018).
Nuts and bolts
Now, I wanted to touch on this topic a bit because, at my current employer, I've seen some skepticism from my coworkers about how this software stuff works at the hardware level, as if a memory leak is some strange one-off they can throw out as an anomaly.
To get my self-hosting off the ground, I looked into multiple options. My first attempt was a server-like desktop build that ended up failing horribly (however, it turned out to be a great computer for my partner). None of the hardware drivers would work easily with the VMware ESXi virtualization software I wanted to put on top of the server, and I'm telling you this so you don't make the same mistake: ALWAYS CHECK THE SUPPORTED HARDWARE LIST AND DON'T BE A HERO! Those lists are meticulously curated by vendors for enterprise-grade systems, and should be adhered to, or else you're in for a wild ride of hand-rolling your own driver files and ESXi bundle - not too fun I'm afraid!
What I ended up going with was an HP Gen8 ProLiant Microserver, a server with a low physical footprint, quiet operation, and the nice tools that come with enterprise-grade servers, including remote server management. I chose HP over Dell or Lenovo because their line of servers seemed to be the most unified in architecture and design principles, with the Gen8 server line being extremely modular and hot-swappable. My intent, too, is to eventually procure a larger HP ProLiant blade in order to stand up a more flexible home lab environment, with self-hosted OpenShift Origin being the main motivation (CPU and memory consumption on the microserver was too high for anything feasible to be done on the platform, on 1 CPU die / 16 GB RAM).
If you go the physical hardware route, with a Linux distro, be sure to set up your user permissions correctly, limiting the WSGI processes to read/write access on only the appropriate folders. Failing to be thorough in this setup process could lead to bad things down the road, for the web is dark and full of terrors.
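One way to sanity-check that kind of setup from Python is sketched below, assuming a POSIX system. The helper names and the 0o750 mode are illustrative choices, not the configuration used on this server; ownership (which user and group the WSGI process runs as) still has to be set separately, for example with chown.

```python
import os
import stat


def world_writable(path):
    """Return True if users outside the owner and group can write to path."""
    return bool(os.stat(path).st_mode & stat.S_IWOTH)


def lock_down(path):
    """Owner: read/write/execute. Group: read/execute. Others: nothing."""
    os.chmod(path, 0o750)


def current_mode(path):
    """Permission bits only, e.g. 0o750, without the file-type bits."""
    return stat.S_IMODE(os.stat(path).st_mode)
```

Running world_writable() over the application and data folders before going live is a cheap way to catch a directory that was left wide open during setup.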
This setup process, for me, was not entirely smooth. Historically I'd gone with Apache and mod_wsgi for setting up Flask, and for this site, I wanted to give nginx and uWSGI a go. To get this set up, I'd need to first learn the new directive structure for nginx, which, in hindsight, really wasn't all that bad. The tricky bit was that I had limited knowledge of administering my own server, setting up system services, delegating access and roles, and getting the DNS set up correctly. I'll spare this post the details, but it is quite a process, and each flavor of Linux seems to have its own pitfalls, so be sure to do the proper research. Nothing will truly prepare you for implementation, however, until you fiddle around with actually doing the setup yourself.
Onward and upward
So here you are, reading this post, consisting of a giant blob of markdown, ram-jammed into a Jinja2 template after parsing. The site, in total, consists of fewer than ten templates, with one base, one for each main page, and one for each type of 'post', totaling really only 5 or 6. This means that going forward, extending this is pretty straightforward, and writing posts is even easier. Getting Flask up and running with this markdown approach was easy, while getting the server set up was the biggest chore, which might be a message to the sysadmins of the world, and the system designers of the world - why'd you go and make so many different decisions apart from each other? The fact that PaaS organizations have made this setup process easy and scalable with tools such as Puppet and Ansible is a huge feat and a nod to some great automation engineering, and that'll be my next step - automating the boring stuff, saving a 'pure' image out to my VMware store, and having that just in case something happens to the image I'm currently using. Now that the basics are set up - a way to construct the nginx files, a way to add uWSGI configuration files with ease using the uWSGI Emperor, and this basic application as a prototype - creating new web apps will also be straightforward.
Thanks for reading this blurb, and I hope it was insightful! If you're a programmer without a blog, I encourage you, just as John Sonmez encouraged me, to get going on building your site and sharing your development experiences.