Alas, A Website

Automate the Docker DNS Pain Away

2023-07-31T00:00:00-04:00

Like many other nerds, I have somewhat of a homelab at home. These days it’s not as complicated as it used to be, consisting largely of a big “NAS”¹, a Home Assistant box, and a couple other small things.

The NAS, being a beefy server machine, runs a bunch of Docker containers for various things — Octoprint, Docspell (which runs 2 of its own containers + Solr), etc. It also runs NixOS², and all of these containers are fronted by the Let’s Encrypt and Nginx infrastructure that it provides. To avoid exposing ports on the NAS’s “host” network, I point Nginx virtual hosts directly at container IPs, like so:

services.nginx.virtualHosts."octoprint.coderinserepeat.com" = {
  enableACME = true;
  acmeRoot = null;
  forceSSL = true;
  locations."/" = {
    proxyPass = "http://10.0.1.1:80";
    proxyWebsockets = true;
    extraConfig = "client_max_body_size 0;";
  };
};

This generally works, except…one of the things that NixOS does as part of a nixos-rebuild switch when using the declarative Nginx configuration is an Nginx configuration check. Normally, this is great: if I screw up the configuration somehow (e.g. injecting some bad configuration), it won’t take down Nginx. However, it has a big downside: if containers are restarted or their configuration changes, the assigned IPs are not stable³, and the Nginx configuration will fail to validate.

Previously, I’d tried a number of things that purported to provide a Docker <-> DNS translation, subscribing to Docker daemon events and running a DNS server that I could point other things at. In practice, this never worked quite right: despite telling systemd-resolved that the dns-proxy-server container should be used for DNS, rebuilds (and thus Nginx config checks) would frequently fail because the upstreams would fail to respond on the proxyPass ports.

I was about to embark on a “stupid scratch a homelab itch” project and write something that connects to Docker, listens for events, and updates Route53⁴, when Pete Keen suggested that I check out the docker-gen project, and then pointed me at dnscontrol as well. Sensing an opportunity to hit a Pareto optimal⁵, I set about hacking up some systemd services and a dnsconfig.js.tmpl file, and an hour or so later, had something extremely feasible.

For the purposes of this writeup, I’m going to assume that you already have the dnscontrol and docker-gen binaries somewhere on your system. In my case, they’re in /nas/homes/brajkovic/bin. I also assume that you’re using Nix/NixOS, because I didn’t write the units manually, but hopefully these declarations for the systemd units are simple enough to manually write the full unit.

The `systemd` units

`docker-gen`

First, the docker-gen unit — docker-gen knows how to run as a daemon and listen to events, so we can run it as a normal systemd service:

systemd.services."docker-gen-dns" = {
  path = [
    "/nas/homes/brajkovic/bin"
  ];

  script = ''
    docker-gen -config docker-gen.cfg
  '';

  serviceConfig.WorkingDirectory = "/nas/homes/brajkovic/.config/dns";
  wantedBy = [ "multi-user.target" ];
};

The working directory is where the docker-gen.cfg file lives; its contents are covered in the next section.

`dnscontrol`

Next, the dnscontrol unit — in this case, we register it as a oneshot unit, because docker-gen will run systemctl to start it.

systemd.services."dnscontrol-apply-docker.coderinserepeat.com" = {
  path = [
    "/nas/homes/brajkovic/bin"
  ];

  script = ''
    dnscontrol version
    dnscontrol preview
    dnscontrol push
  '';

  serviceConfig.Type = "oneshot";
  serviceConfig.WorkingDirectory = "/nas/homes/brajkovic/.config/dns";
  after = [ "multi-user.target" ];
};

The config files

`docker-gen.cfg`

This file configures docker-gen’s behavior, and is super simple:

[[config]]
dest = "dnsconfig.js"
notifycmd = "systemctl start dnscontrol-apply-docker.coderinserepeat.com"
template = "dnsconfig.js.tmpl"
watch = true
wait = "500ms:2s"

It tells docker-gen to source the template from dnsconfig.js.tmpl, write the rendered result to dnsconfig.js, and then run our dnscontrol unit as the “notify” command after it’s done writing the output (and only if the contents actually changed). Setting watch to true puts docker-gen in daemon mode, and wait configures the debounce window: it will wait at least 500ms, at most 2 seconds, before regenerating.
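To make the `wait = "500ms:2s"` setting more concrete, here is a small JavaScript model of min/max debouncing: every event restarts the minimum timer, while the maximum timer caps the total delay measured from the first event of a burst. This is an illustrative simulation under that assumption, not docker-gen's actual code; the function name `debounceFireTimes` and the event timestamps are made up.

```javascript
// Simulate min/max debouncing over a sorted list of event times (ms).
// Returns the times at which regeneration would be triggered.
function debounceFireTimes(eventTimes, minMs, maxMs) {
  const fires = [];
  let burstStart = null; // time of the first event in the current burst
  let deadline = null;   // when the current burst is due to fire

  for (const t of eventTimes) {
    // If the pending deadline passed before this event, that burst fired.
    if (deadline !== null && t >= deadline) {
      fires.push(deadline);
      burstStart = null;
      deadline = null;
    }
    if (burstStart === null) burstStart = t;
    // Min timer restarts on every event; max timer is fixed from burst start.
    deadline = Math.min(t + minMs, burstStart + maxMs);
  }
  if (deadline !== null) fires.push(deadline);
  return fires;
}

// Two hypothetical events 400ms apart collapse into a single run.
console.log(debounceFireTimes([0, 400], 500, 2000)); // [ 900 ]
```

With a steady stream of events arriving faster than the minimum, the maximum is what eventually forces a run, so a chatty Docker daemon cannot starve the DNS update indefinitely.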

`dnsconfig.js.tmpl`

The dnscontrol template, also deceptively simple:

var REG_NONE = NewRegistrar("none");
var DSP_R53 = NewDnsProvider("r53_main");

D("docker.coderinserepeat.com", REG_NONE, DnsProvider(DSP_R53),
{{range $key, $value := .}}
    {{if $value.IP}}
    // {{ $value.Name }} ({{$value.ID}} from {{$value.Image.Repository}})
    A("{{ $value.Name }}", "{{$value.IP}}"),
    {{end}}
{{end}}
    // Allow letsencrypt to issue certificate for this domain
    CAA("@", "issue", "letsencrypt.org"),
    // Allow ACM to issue certificates for this domain
    CAA("@", "issue", "amazon.com"),
    // Allow no CA to issue wildcard certificate for this domain
    CAA("@", "issuewild", ";"),
    // Report all violations to caa@coderinserepeat.com. If the CA does not
    // support this record, it must refuse to issue any certificate
    CAA("@", "iodef", "mailto:caa@coderinserepeat.com", CAA_CRITICAL)
)

This is mostly basic JavaScript, plus some Go template language — we emit all the A records for the Docker containers, and some really basic CAA records so that we can issue certs if we need to for those DNS names⁶.
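As a rough illustration of what the `range` loop in the template does, the following JavaScript mirrors its logic: emit one `A(...)` line per container that has an IP, and skip the rest. The container objects and their values are invented for the example, and `renderARecords` is a hypothetical helper; the real data comes from docker-gen's container context.

```javascript
// Mirror the template's loop: one A record per container with an IP.
// `containers` stands in for the list docker-gen passes to the template.
function renderARecords(containers) {
  return containers
    .filter((c) => c.IP) // same effect as {{if $value.IP}}
    .map((c) => `    A("${c.Name}", "${c.IP}"),`)
    .join("\n");
}

// Hypothetical containers; the second has no IP and is skipped.
const exampleContainers = [
  { Name: "octoprint", IP: "10.0.1.1" },
  { Name: "no-ip-container", IP: "" },
];
console.log(renderARecords(exampleContainers));
```

The trailing comma on each line matters: the generated records are followed by the static CAA records inside the same `D(...)` call.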

`creds.json`

The basic “credentials” file for dnscontrol:

{
  "r53_main": {
    "TYPE": "ROUTE53"
  }
}

This doesn’t actually have any credentials, because those are provided by the standard AWS SDK credentials mechanism — I should probably do something better with those secrets, but if you can either log into or physically steal my NAS, you’ve earned my AWS creds.

Wrapping It All Up

Putting all that together, we’re done. The docker-gen daemon runs, supervised by its systemd unit. When it needs to, it spawns dnscontrol, but it mostly just sits there idly — I had to restart it to get any recent output, and it said:

Jul 31 22:32:58 hagal docker-gen-dns-start[3691698]: 2023/07/31 22:32:58 Watching docker events
Jul 31 22:32:58 hagal docker-gen-dns-start[3691698]: 2023/07/31 22:32:58 Contents of dnsconfig.js did not change. Skipping notification 'systemctl start dnscontrol-apply-docker.coderinserepeat.com'

When I manually killed a container⁷, the expected output appeared, showing what happens when there is actually something to do:

Jul 31 22:35:09 hagal docker-gen-dns-start[3691698]: 2023/07/31 22:35:09 Received event die for container 236c84adea38
Jul 31 22:35:10 hagal docker-gen-dns-start[3691698]: 2023/07/31 22:35:10 Debounce minTimer fired
Jul 31 22:35:10 hagal docker-gen-dns-start[3691698]: 2023/07/31 22:35:10 Received event stop for container 236c84adea38
Jul 31 22:35:10 hagal docker-gen-dns-start[3691698]: 2023/07/31 22:35:10 Generated 'dnsconfig.js' from 7 containers
Jul 31 22:35:10 hagal docker-gen-dns-start[3691698]: 2023/07/31 22:35:10 Running 'systemctl start dnscontrol-apply-docker.coderinserepeat.com'

The unit was indeed started, and you can see dnscontrol applies the changes:

Jul 31 22:35:11 hagal dnscontrol-apply-docker.coderinserepeat.com-start[3712349]: [INFO: Diff2 algorithm in use. Welcome to the future!]
Jul 31 22:35:11 hagal dnscontrol-apply-docker.coderinserepeat.com-start[3712349]: ******************** Domain: docker.coderinserepeat.com
Jul 31 22:35:12 hagal dnscontrol-apply-docker.coderinserepeat.com-start[3712349]: 1 correction (r53_main)
Jul 31 22:35:12 hagal dnscontrol-apply-docker.coderinserepeat.com-start[3712349]: #1: - DELETE dns-proxy-server.docker.coderinserepeat.com A 10.0.0.4 ttl=300
Jul 31 22:35:12 hagal dnscontrol-apply-docker.coderinserepeat.com-start[3712349]: SUCCESS!
Jul 31 22:35:12 hagal dnscontrol-apply-docker.coderinserepeat.com-start[3712349]: Done. 1 corrections.

Overall, really simple, and like I said, hits a strong Pareto optimal: an all-in-one solution would be cool, but bodging together some existing tools and a few systemd services provided satisfying short-term relief.

¹ A Supermicro 6028U-TR4T+, with dual Xeon E5-2650L v3 processors, 128 GB of RAM, and all the disks I could cram in. I expect it to last me…a damn long time.
² Which was in some ways a mistake, and in some ways really speeds things along.
³ Manually managing IPAM for Docker containers is not my idea of a good time: see the aforementioned mild regret of using NixOS — I don’t actually want to spend that much sysadmin time.
⁴ Where my domain is hosted, but likely it would have ended up supporting pluggable providers, because I can’t build anything without overbuilding it.
⁵ 80% of the desired outcome, 20% of the work.
⁶ Not that we need to — the day-to-day records that I use live on the root domain.
⁷ Just for fun, but I do need to cut this container out of the configuration for good.

Knowledge management in the modern era

2021-03-20T00:00:00-04:00

A few things have happened recently that have gotten me thinking about what the broader scope of knowledge management (going beyond blogs and personal wikis, casting a wider net into all of the data generated by a person’s life) looks like in the 2020s:

This thread with my friend Joel, in which I riff on what I think a modern version of Vannevar Bush’s Memex would look like, and how it compares to some existing tools out there (Microsoft’s ToDo/OneNote suite + the general internet). I was originally introduced to the Memex by Charlie Stross’s Laundry Files series, and was instantly fascinated by its presentation as an electromechanical contrivance that when imbued with sufficient magic granted the user’s “data” access/perusal speeds almost equivalent to “main memory.” Since then, Vannevar Bush’s “As We May Think,” which introduced the Memex (as well as several other interesting ideas about how we capture and access our life’s data), has been an influence on both my thinking and on this writing.
I think a lot about knowledge management at work. It’s not a matter of secrecy that the bank I work at is currently in the midst of a CORE transformation¹ project, nor is it a matter of secrecy that it constitutes about 80% of my work product these days. In particular, a project like this requires, as a matter of course, an enormous amount of documentation to be produced, and knowledge to be transferred: knowledge about the workings of the existing system, knowledge about how the new system is configured, and documentation about how the business processes translate from one system to the other. All of this has to be produced, organized, made accessible, and made searchable. More importantly, though, it needs to be consistent (all in one place) and versioned. The ability to visualize business logic change is paramount when dealing with such fundamental systems; every change at the core banking system level has 2nd, 3rd, and maybe even 4th order ripples. This is especially true when business process revision takes place at the same time as development work.
I scanned another batch of paper documents. These currently end up in a folder with hundreds of other scans, only loosely organized via filenames and some directory structure. The structure is mostly imposed by important events that generate significant paperwork that needs organization: real estate transactions (2 house purchases, 1 house sale, 1 refinance), each year’s tax season, etc. The filenames alone are helpful, but my ability to look up a document is still nowhere near matching my ability to recall metadata about the document. Even when I find a scan that I’m looking for, I’m still missing the full context at my fingertips: I can’t find related documents easily, I can’t find related email easily, etc. Email is particularly important, especially with “paperwork generator” type events—much of the context behind a document still lives in that format: “Why did we have the lawyers draft this agreement to time out after 90 days versus 60 days?”, for example.

The common thread between all of these is actually fundamentally simple: in each occurrence, in each data retrieval context, the ability of my brain (and other brains…probably) to categorize (“I want all vet bills from 2019,” “I want all checks paid to my wife from June 2016 through July 2018”) and to conjure up the “metadata” that I want to index on far outstrips the ability of software systems (and physical systems—it’d be nearly impossible to do cross-reference to the level I desire with physical documents). Bush’s Memex, and its influence on early hypertext systems (Engelbart’s MOAD, for example), seemed to predict a future where our memory, with its prodigious capability to categorize and cross-reference, would be supplemented with computer systems that were organized the same way. Yet here we are in 2021, and all the systems I’ve tried recently still focus on the sterile familiarity of a filesystem-like layout, mired in the land of directed acyclic graphs. Why?

Was this really predicted in the 1940s?

Yes. No, seriously. Go read “As We May Think”², and notice that in 1945, Bush says this, at the beginning of the 6th major section³:

THE HUMAN BRAIN FILES BY ASSOCIATION-THE MEMEX COULD DO THIS MECHANICALLY

Notice then that he has condensed my common thread from before into a single, pithy sentence: “The human brain files by association.” For the rest of the 6th section of “As We May Think,” Bush takes the opportunity to point out that we’ve created artificial sorting and hierarchy to place data in storage, when in reality, our mind does not work that way—our ability to binary search sorted, hierarchical organizations of data to find a single document pales in comparison to the speed at which a person is able to free-associate metadata⁴:

Yet the speed of action, the intricacy of trails, the detail of mental pictures, is awe-inspiring beyond all else in nature.

It is in this section that Bush lays out the “memex”—predating common use of the word “meme” by 30 years, he chooses this as the name of his memory-enhancing devices. The memex sounds an awful lot like a “battlestation” setup of the 2020s: multiple screens, multiple control planes for various tasks, etc. However, its biggest strength? In addition to allowing its user to enter an endless stream of un- or lightly-organized data, it offers the ability to cross-reference, to page and skim through data at amazing rates w/ mechanical assistance (and probably some mental training, in skim-reading and categorization) and most of all, it supports the sort of serendipitous interaction that is a key part of searching a pile of documents, putting together threads as you go.

How do wikis fit into this?

It didn’t take long after Ward Cunningham’s 1995 WikiWikiWeb for wikis to emerge as the leading approach to knowledge management—early third party implementations of Cunningham’s wiki concept emerged in the few years following (TWiki in 1998, MoinMoin in 2000), and the concept gained ground quickly⁵. MediaWiki’s initial release in 2002 and the success of the Wikipedia project, and the release of Confluence in 2004 into the enterprise space (as well as its subsequent success), really seemed to solidify the idea of the wiki as something that was here to stay.

Over time, wikis have built up a rich set of features that looks like it corresponds quite neatly with our stated desires: good abilities to categorize and tag content, a rich markup for capturing notes and snippets of text, an ability to contain more-or-less any kind of content (with limited degrees of usefulness), and a flexible storage approach that isn’t explicitly tied up in outdated filesystem notions. However, despite a seemingly solid foundation, even the latest installments of wikis fall short when it comes to being an augmentation of the human intellect.

Failing #1: Poor support for non-plaintext knowledge

Wikis are, by nature, built for plaintext knowledge—they’re intended to present textual documents, and serve as a knowledge repository for data that could be principally described as documentation. Their evolution has led to primary use as encyclopedia-alikes, whether that’s the public Wikipedia, enterprise/corporate wikis (I’ve seen MediaWiki, Confluence, home-grown wikis, even GitHub’s Gollum wiki in this space), or personal wikis (most frequently used for stuff like “The plumber is John Doe, his number is 555-123-4567,” which I like to refer to as “operational knowledge” for one’s life).

However, by focusing on text, wikis have more-or-less no document pipeline to speak of. Attachments are possible, but there’s no processing or indexing pipeline. If you want to reference the content of, say, a bill PDF, you need to:

Extract the content into a page
Design a format for it (bills are generally easy and tabular, but not everything will be)
Do the transcription

This is pretty time consuming, and the UX still isn’t great in cases where the content of the attachment is not trivially transcribable to plaintext. This falls down just as hard when the data is textual, but not plaintext—for example, if you want to store and index an email thread (see the first section of this essay, where I talk about “paperwork-generating events”), you have to at a minimum copy all the contents out. At a maximum, you must separate the messages, build a hierarchy for the thread, come up with the appropriate categorization, etc. The depth of the process, however, is directly proportional to its final utility: a single page w/ a copy of all the contents is less broadly useful than each email as a discrete, indexable, and referenceable document.

A system more attuned to the needs of different document forms would recognize an email thread and categorize appropriately, recognize a PDF and apply OCR/text extraction as needed, perhaps even recognize images and attempt to make guesses at their contents, and use those to enrich the search index.
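A minimal sketch of what such type-aware ingestion could look like, assuming a hypothetical `ingest` function and stub extractors keyed by MIME type (none of these names come from an existing system). The extractors only return placeholder strings; a real pipeline would call an OCR engine, a mail parser, and so on.

```javascript
// Stub extractors keyed by MIME type. Each returns text for the search index.
const extractors = new Map([
  ["text/plain", (doc) => doc.body],
  ["message/rfc822", (doc) => `[parsed email thread] ${doc.body}`],
  ["application/pdf", (doc) => `[OCR/text extraction] ${doc.body}`],
]);

// Pick an extractor by type; fall back to a user-supplied description
// when the system has no way to read the content itself.
function ingest(doc) {
  const extract = extractors.get(doc.mimeType);
  const text = extract ? extract(doc) : doc.description || "";
  return { name: doc.name, mimeType: doc.mimeType, indexedText: text };
}

// Hypothetical document, for illustration only.
console.log(
  ingest({ name: "bill.pdf", mimeType: "application/pdf", body: "Total due: $42" })
);
```

The point of the dispatch table is that adding a new document form means adding one extractor, rather than asking the user to transcribe content by hand.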

Failing #2: Poor taxonomy/organizational UX

Another place where wikis fall down is in their taxonomy & organization UX. While most of them are able to be expressive in their taxonomy, the actual presentation of the taxonomy always seems cast by the wayside. Rarely is the presentation up front—generally speaking, the wiki expects you to enter via a landing page, which itself links to other pages. Rarely is the organization of pages up-front—you’re expected to maintain category pages, rather than being able to use the category/tag itself as an entry point.

Consider the following case: I know I have a note/document somewhere that has information on the plumber that I use, and I want to look up the plumber, and also see how much I’ve paid for various services via invoices. In most wikis, I have to look up the “Professional Services” page, which then has the list of documents, find the plumber, and look up the bills from there. But the professional services page has to be linked from the landing page, and maybe has to be maintained manually. A system with a more up-front focus on taxonomy would give me an entry point to explore categorization, or even give me a direct input into filtering the knowledge set using the predeclared taxonomy.

Organization-wise, wikis also tend to fall down because of their semi-global namespacing—Confluence in particular has this problem, where if you want to express the same structure multiple times within a single “space,” you are completely barred from doing so! You have to either do contortions like [Repeated Name]: Foo and [Repeated Name]: Bar, or create multiple spaces. MediaWiki has more or less the same issue—namespacing is possible, but ugly (Foo:Bar), and still doesn’t solve the repeating structure issue in a meaningful way.

How do document management/enterprise information systems fit in?

As part of exploring the space, I’ve also looked at more pure document management/enterprise information systems. I’m passingly familiar with Perkeep (f/k/a Camlistore), which I ran briefly at home, and Hyland’s OnBase product, which my employer uses. I’ve also looked at others, like DocumentCloud, PaperSave, etc. These systems have an area of strength, and that’s focusing on archival and document lineage—you get a data lineage, some classification, integration with business workflow software in the “enterprise” cases. These features have value as an overall component of knowledge management—nobody actually enjoys losing data, and knowing the history of where things came from is useful.

However, their utility as overall knowledge management systems is lacking. They tend to focus on hierarchical organization, and have limited cross-referencing/tagging/categorization capabilities. As well, they tend to be focused on discrete pieces of data—scanned documents, images, etc. While this is useful for probably 80% of what knowledge management should be (especially in enterprise settings, where executed contracts, etc. are a major generator of paperwork and need to be available for rapid lookup), it still doesn’t help capture “flotsam” that is generated by day-to-day activity. Even more importantly, they don’t capture the dynamic, living notation that a wiki does, and are minimally helpful in cross-referencing the two. SharePoint is really a shining illustration of this—the document upload functionality, the associated versioning, and search within those documents are all pretty good. However, it really falls down because it’s pure hierarchy, and its support for “wikis” is one of the most mediocre pieces of software I’ve ever seen.

One place enterprise-targeted systems do deliver is search. SharePoint excels in this space (I’ve seen it extract useful, searchable information from Office format files, PDFs, etc.). A piece of software in this category that I test-drove while researching what I wanted was Kofax’s PaperPort. PaperPort was appealing to me because of the promise of scanning to PDF, capturing on-the-go, and “transforming paper documents into actionable digital information.” It turns out, it is actually quite good at some of these—its scanning integration is excellent, and it does a good job at OCRing and searching scanned documents (or extracting text from pre-OCRed documents). I loaded a few documents from Tax Season a few years ago, and was able to quickly find them among the preloaded demo library by searching for keywords that I knew would be tax specific. However, PaperPort fails in the same way that most document management systems do: its categorization and organization approach is catastrophic. It remains mired in hierarchical patterns—this is probably OK for enterprises, but doesn’t work with my approach of multi-categorization.

What do I actually want in a knowledge management system?

Naturally, as part of writing this, I’ve thought a lot about what I actually need in a knowledge management system. Much of this, I think, will be applicable to anyone looking to organize their knowledge. Some of it is probably idiosyncratic to me, or at least not broadly applicable to everyone needing knowledge management. Even those places, I think, offer readers a chance to maybe rethink the way they store, think about, and search their extended memory. Of course, I’m also coming at this from a “maker”’s perspective—I build software for a living, so naturally, I’ve spent a fair amount of time also thinking about how I would build this.

Storage is obviously critical

I mentioned above that many document management systems have a principal focus on archival-quality storage—Perkeep explicitly states in its mission that “Your data should be alive in 80 years, especially if you are.” I don’t know about 80 years, but I definitely don’t like losing things. The good news is, I think storage is largely a solved problem. There’s probably room to argue on this, and it’s going to be dependent in some part on where your data is held. People running their own storage infrastructure (NAS, backups, etc.) probably will disagree a little bit that reliable storage is a solved problem.

While I have network-attached storage at home, and make extensive use of it, I would probably choose to build this system on the public cloud. “Blob” stores are all capable of doing this job quite well, have reasonable costs, and provide value in the form of some of their default features on top of just storage. For example, most of them have a low-complexity versioning concept that would be sufficient to satisfy my desire to version knowledge where it makes sense. AWS S3, Azure Blob Storage, GCP Cloud Storage are all satisfactory. Backblaze’s B2 might be a dark horse candidate, due to its cost and Backblaze’s relative pedigree.

I’d probably use S3 anyway and eat the extra cost—I don’t have enough data to store for the cost to make a difference, and I’d probably build the rest of the system on AWS as well. If I were incentivized to make this distributable, or meaningfully multi-user, I’d either use the filesystem (with some tricks to reduce disk use and present data closer to the system’s “mental” model⁶), or automate separate cloud enclaves for each user of the system. Some sort of password-derived KMS keywrapping, or something. Honestly, I haven’t spent a lot of time needing to build systems w/ user-generated content that absolutely cannot be commingled, so I’d ask people smarter than me. I’m willing to hand-wave away some things—at the end of the day, I’m more focused on the taxonomic & UX portions than on storage.

Support for all types of media is another must

The system has to support all kinds of knowledge, and support them naturally. Free text (basically, wiki pages or other scratch notes), PDFs, emails, etc. should all garner the same level of text-search support. I should never be in the position where I’m having to build my own canonical representation of some piece of data, when I already have it on hand! The email thread case from above I think is especially illustrative, as it’s an area where existing systems probably fall down quite hard (except for maybe eDiscovery). I would make an exception for images, where I would expect (and prefer!) to enter descriptive text around the subject of the image—as far as AI/ML systems have come in identifying the contents of an image, they don’t capture any of the surrounding context.

I’m not sure about how to best deal with audio/video data in this context, to the point where I’m almost tempted to leave it out entirely, but that feels wrong. I think a first cut has to approach it in the same way we approach pictures: free tagging and free “description” entry, with the description entry being indexed for search in the same way all other text is. A possibility is to transcribe the audio track where available, but I’m not sure about the computational price of that.

Versioning some of this data might be a challenge as well, and I think for the sake of reasonable constraints, I might choose to only version user-generated, wiki-style content. Presenting “Track Changes” style version management of Word/Excel/etc. documents might also be a possibility, if that data can be easily extracted, but since those are self-contained within the document, I don’t think they need storage assistance. In general, I find Track Changes on anything other than Word-style docs hard to understand—versioned spreadsheets should probably be database tables, or at least wiki pages where a diff is a little easier to render and understand.

I think there’s really not much of a verdict to give here—all data is beautiful and deserving of our attention. As to tools, I’d probably reach for Apache Tika for text extraction from various formats, and maybe ffmpeg or similar tools for basic metadata extraction from audio/video. I’m not aware of any computationally inexpensive approaches to audio transcription to complement human-entered description for audio/video tracks, so that would be a good research area.

Organization and search

Obviously, this system needs a really good approach to taxonomy, a robust metadata capability, and a search capability that combines taxonomic searches, content searches, and metadata searches. I think the system needs to distinguish between a few different types of searchable data:

Well-structured metadata, intrinsic to the piece of knowledge. Things like when was it entered into the system, when was it created, its original file name (distinct from the “name” within the system!), etc. Depending on the input, this intrinsic metadata extraction might bleed into what would otherwise be extrinsic. For example, email-type input would probably generate some taxonomic data automatically.
Structured, but extrinsic data. This is really what I would call “user-entered” metadata. This includes things like “when was this piece of knowledge created,”⁷ the taxonomic classification, user-entered related document linkage, etc.
Unstructured data, both intrinsic and extrinsic. This is the actual contents where they can be extracted (text, OCRed text, transcribed audio, etc.), or the human-entered descriptions where they can’t (photos, videos, audio).
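To make the three categories a little more tangible, here is one way a single piece of knowledge and a combined query might be modeled. The field names (`intrinsic`, `extrinsic`, `content`), the sample values, and the `matchesQuery` helper are all assumptions made for this sketch, not a proposed schema.

```javascript
// One knowledge item, split along the three kinds of searchable data.
const exampleItem = {
  intrinsic: { originalFileName: "scan_0042.pdf", ingestedAt: "2021-03-20" },
  extrinsic: { kind: "invoice", tags: ["vendor:plumber"], related: [] },
  content: "Invoice for kitchen sink repair", // extracted or human-entered text
};

// A query may constrain any combination of the three; omitted parts match all.
function matchesQuery(item, query) {
  const intrinsicOk = Object.entries(query.intrinsic || {}).every(
    ([key, value]) => item.intrinsic[key] === value
  );
  const tagsOk = (query.tags || []).every((t) => item.extrinsic.tags.includes(t));
  const kindOk = !query.kind || item.extrinsic.kind === query.kind;
  const textOk =
    !query.text || item.content.toLowerCase().includes(query.text.toLowerCase());
  return intrinsicOk && tagsOk && kindOk && textOk;
}

console.log(matchesQuery(exampleItem, { kind: "invoice", text: "sink" })); // true
```

Keeping the three kinds separate in the model is what lets a single search mix exact metadata matches with looser content matching.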

The taxonomic data itself has additional constraints. I know I’ve largely railed against hierarchies in prior sections, but I think there is a utility to them at one or two levels of depth, as a complement to other taxonomic approaches. I’m still firm on my belief that pure hierarchical organization does not work. Two things I would want to see in terms of taxonomic approaches:

At least one level of high/top-level kindedness. These should be free-form, though. A naive approach would leverage data “kind” (images, documents, etc.) at the top level, but I think that again, flexible is better here. Especially in a business setting (and even in a personal one), you likely want top-level categories for things like check images, invoices, etc. so that you can separate quickly on broad-but-not-overbroad strokes. Nothing stops you from having multiple breadths of category, either—an intrinsic kindedness category based on the data kind, and an extrinsic category assigned by the user.
A free-tagging second layer of taxonomy, that allows for some natural hierarchicalization. I’m not yet sure whether this should require all layers to be represented, or simply infer them. The way I’m thinking about this is for, say, checks, you want to represent the from and the to as separate tags, so that you can express searches like “All checks from Pablo to me,” or “All checks from me to my general contractor.” The layers question boils down to whether foo:bar implies foo or not—my lean is that yes, it does.
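The "foo:bar implies foo" lean can be sketched in a few lines. The helper names `expandTag` and `hasAllTags` are hypothetical, as are the example tags; the only idea taken from the text is that a layered tag also counts as each of its prefixes.

```javascript
// Expand "from:pablo" into ["from", "from:pablo"], so foo:bar implies foo.
function expandTag(tag) {
  const parts = tag.split(":");
  return parts.map((_, i) => parts.slice(0, i + 1).join(":"));
}

// An item matches when every requested tag is present after expansion.
function hasAllTags(itemTags, wantedTags) {
  const expanded = new Set(itemTags.flatMap(expandTag));
  return wantedTags.every((t) => expanded.has(t));
}

// Hypothetical tags on a scanned check.
const checkTags = ["from:pablo", "to:me"];
console.log(expandTag("from:pablo"));                  // [ 'from', 'from:pablo' ]
console.log(hasAllTags(checkTags, ["from", "to:me"])); // true
```

Inferring the parent layers at query time, rather than storing them, means nobody has to remember to tag a check with both `from` and `from:pablo`.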

Technologically, I’d take a “progressive enhancement” approach to storage. The intrinsic, well-structured metadata I think could go into a relational database, along with notions of ownership, access control lists, etc. Structured extrinsic metadata goes here as well, I think—most searches over it will be “whole word”-type searches, that don’t need the sort of “nearest neighbor” search approaches that unstructured data needs. For the unstructured data, I think for small data volumes, something like PostgreSQL’s GIN indexes + full-text search will be sufficient. At larger data volumes, I’d probably reach for tools like Solr, Elastic, or custom work on top of Lucene. At sufficiently large volumes, I’d look for an exit via Google acquisition, so they can apply their search work.
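The core idea behind full-text indexes is an inverted index, which maps each word to the documents that contain it. The toy version below shows that idea in plain JavaScript. It is only a stand-in to illustrate the concept, not a sketch of how PostgreSQL or Lucene work internally; the function names and sample documents are invented.

```javascript
// Build a map from lowercase word -> set of document ids containing it.
function buildIndex(docs) {
  const index = new Map();
  for (const [id, text] of Object.entries(docs)) {
    for (const word of text.toLowerCase().match(/[a-z0-9]+/g) || []) {
      if (!index.has(word)) index.set(word, new Set());
      index.get(word).add(id);
    }
  }
  return index;
}

// Whole-word AND search: ids of documents containing every query word.
function searchIndex(index, query) {
  const words = query.toLowerCase().match(/[a-z0-9]+/g) || [];
  if (words.length === 0) return [];
  const sets = words.map((w) => index.get(w) || new Set());
  return [...sets[0]].filter((id) => sets.every((s) => s.has(id))).sort();
}

// Hypothetical documents, for illustration only.
const sampleDocs = {
  a: "Vet bill for annual checkup, 2019",
  b: "Plumber invoice for sink repair",
  c: "Vet bill for dental cleaning, 2020",
};
console.log(searchIndex(buildIndex(sampleDocs), "vet 2019")); // [ 'a' ]
```

Real engines add stemming, ranking, and fuzzy matching on top, which is where the dedicated tools start to earn their keep at larger volumes.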

Presentation & UX

Presentation and UX of the system are going to be just as important as the storage and taxonomic layers. There’s a number of problems with existing systems, but the two key ones for me are the process of loading data, and the “views” into the data. This will probably be the longest section of “what I want to see,” because it’s by necessity the most complex—without the benefit of a user experience, the “academic” storage and taxonomy qualities of the underlying system are largely moot.

Loading data into the system

Probably the most important part of the day to day interaction, actually bringing data into the system is an area that has to be more fluid than it is today. My biggest issue with the systems I’ve tried so far is that I have to enter each piece of data individually, and no system I’ve used has ever shown me what I’ve just entered. This means that anytime you need to batch-load data into the system, you have to inspect each piece by hand, upload it, enter the metadata, and then go back to the next item. This batch upload with previewing is a key UX point, because it makes the ingestion of data fast and natural.

Otherwise, there’s a few key entry points for new data:

Drag-and-drop/file select + free entry of text from a web UI. This is pretty standard, and probably the primary mechanism for a lot of entry. Certainly “paperless home/office” scans, DSLR photos, bills, etc. would probably come in that way.
Wiki style, free-text entry. Good for notes, research work, documentation, etc. Pretty foundational for both personal and business use—personal wikis are a great way to store tradespeople’s information, work histories (“plumber fixed blah on blah, refer to invoice blah”, etc.). Wiki pages need at least basic cross-linking functionality (to each other, and to other data), but otherwise I think should probably be fairly vanilla Markdown (or other structured text) documents.
Mobile apps are a definite nice-to-have. Since all of the “built in” frontend (file upload & wiki editing) should be API driven, building in mobile support should be possible. The biggest benefit of this would be “share sheet” integration in the mobile OSes.
A neat trick for ingesting email would be to support forwarding to an assigned email address. Most, if not all, of the popular SaaS products for sending email also support this use case for receiving it. Ingesting mail this way preserves as much of the original metadata as is possible, and is convenient to boot.
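As a rough illustration of the email forwarding idea, the sketch below uses Python's standard `email` package to pull out the metadata an ingestion endpoint might want to keep from a raw message. The `ingest_email` function, the field names it returns, and the sample addresses are all assumptions for illustration, not part of any particular product.

```python
from email import message_from_string, policy
from email.utils import parsedate_to_datetime

# Hypothetical forwarded message, as the ingestion address might receive it.
SAMPLE = """\
From: Plumber <plumber@example.com>
To: inbox@example.com
Subject: Invoice 42
Date: Mon, 06 Apr 2020 09:30:00 -0400

Thanks for your business.
"""


def ingest_email(raw: str) -> dict:
    """Extract intrinsic metadata and plain-text content from a raw email."""
    msg = message_from_string(raw, policy=policy.default)
    body = msg.get_body(preferencelist=("plain",))
    date_header = msg["Date"]
    return {
        "from": str(msg["From"]),
        "to": str(msg["To"]),
        "subject": str(msg["Subject"]),
        # The sent date doubles as a candidate "document date" for the taxonomy.
        "sent_at": parsedate_to_datetime(str(date_header)) if date_header else None,
        "text": body.get_content().strip() if body is not None else "",
        "attachments": [part.get_filename() for part in msg.iter_attachments()],
    }


record = ingest_email(SAMPLE)
print(record["subject"], record["sent_at"].isoformat())
```

Everything returned here comes straight from the message headers, which is the point of forwarding rather than copy-pasting: the sender, recipient, and date arrive as structured metadata for free.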

Viewing the loaded data

Because most systems are tied into their existing notions of hierarchies, they get this wrong when their first view is purely hierarchical. In reality, there needs to be a number of different entry points, that represent different elements of the taxonomy.

Going back to the description I gave above, I suggested two layers of taxonomy: categories, and tags. In this system, each of these taxonomic layers has its own entry point—tags and categories, as well as their associated hierarchies, each have their entry point. The knowledge/data-type hierarchy also has a distinct entry point—even though it’s not a principal point of organization for most of our purposes, it can be a useful entry point, especially combined with the extrinsic taxonomic layers. This means that a “home” view needs to present all of these different approaches to diving in, as well as a “search” entry point.

The search itself should aim to be natural, combining metadata searches with full text searches. The search language should, whenever possible, try to make use of natural thinking/speaking patterns to formulate searches. An interesting idea I had in this space is to utilize the natural spoken differences between “,” and “;” to determine disjunction vs. conjunction. Owing to the shorter pause afforded by “,”, it becomes conjunctive: “Invoices, 2018, tradespeople” gives you all invoices, paid in 2018, to tradespeople. Meanwhile, “;” becomes a disjunction: “Invoices; Checks” gives you all documents matching (invoice OR check). There still remains ambiguity on whether “;” is inclusive or exclusive. While a fascinating topic, it would be a major departure from this post, and best saved for other writing.
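Here is a minimal Python sketch of that comma/semicolon search language. It makes a few assumptions the post leaves open: the comma binds tighter than the semicolon, the semicolon is an inclusive OR, and terms are matched case-insensitively against a flat set of labels per document. The `parse_query` and `query_matches` names and the sample documents are hypothetical.

```python
from __future__ import annotations


def parse_query(query: str) -> list[list[str]]:
    """Parse into OR-of-ANDs: ';' separates alternatives, ',' separates required terms."""
    alternatives = []
    for group in query.split(";"):
        terms = [term.strip().lower() for term in group.split(",") if term.strip()]
        if terms:
            alternatives.append(terms)
    return alternatives


def query_matches(labels: set[str], query: str) -> bool:
    """True if any alternative has all of its terms present in the labels."""
    return any(all(term in labels for term in terms) for terms in parse_query(query))


# Hypothetical documents with their (lowercased) categories and tags.
documents = {
    "roof-invoice.pdf": {"invoices", "2018", "tradespeople"},
    "rent-check.png": {"checks", "2019"},
    "w2.pdf": {"tax", "2018"},
}

for query in ("Invoices, 2018, tradespeople", "Invoices; Checks"):
    hits = sorted(name for name, labels in documents.items() if query_matches(labels, query))
    print(query, "->", hits)
```

Switching to an exclusive reading of the semicolon would only change `query_matches`, which would then require exactly one alternative to match instead of at least one.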

Another potentially useful view on the loaded data would be a “sync” view to the local system, especially when paired with primary storage in the cloud. I mentioned this a while ago, and gave it a brief footnote, but I think it deserves a little bit of extra coverage. The idea is to synchronize to a filesystem, and give each element of the taxonomic hierarchy its own folder, and use links to reflect the fact that a document lives in multiple places, without duplicating storage use. This type of synchronization would be especially useful with “media”-type documents, “paperwork generating events,” or things like pay stubs, where there are times that you want to just get a copy of everything associated with a particular taxonomic element.
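A small Python sketch of that sync view, using hard links as described in the hardlink footnote. The `sync_view` function, the `.store` directory, and the taxonomy paths are hypothetical names for illustration, and this assumes the store and the view live on the same filesystem (hard links cannot cross filesystems).

```python
from __future__ import annotations

import os
import tempfile
from pathlib import Path


def sync_view(store: Path, view_root: Path, taxonomy: dict[str, list[str]]) -> None:
    """Hard-link each stored file into one folder per taxonomy path it belongs to."""
    for filename, tax_paths in taxonomy.items():
        for tax_path in tax_paths:
            folder = view_root / tax_path
            folder.mkdir(parents=True, exist_ok=True)
            target = folder / filename
            if not target.exists():
                # A hard link is a second name for the same data, so no extra space is used.
                os.link(store / filename, target)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    store = root / ".store"
    store.mkdir()
    (store / "paystub-2020-01.pdf").write_bytes(b"example contents")

    sync_view(
        store,
        root / "view",
        {"paystub-2020-01.pdf": ["categories/pay-stubs", "tags/employer/acme"]},
    )
    linked = root / "view" / "tags" / "employer" / "acme" / "paystub-2020-01.pdf"
    print(os.path.samefile(store / "paystub-2020-01.pdf", linked))
```

Symlinks would work for the same layout and can cross filesystems, at the cost of dangling links if the central store moves.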

Statistics and analysis

I’m actually not sure it makes much sense to do any sort of statistics or analysis on the stored data. None of the obvious candidates are particularly interesting: who cares about file size, word count, etc.? Because our data is massively heterogeneous, it’s hard to do much primary key analysis/subject analysis in coherent ways. Maybe with a consistent format for certain bills, you could do some analysis, but I don’t believe it’s worth it to do so in this medium. Another concept I’m going to introduce, “workspaces,” I think would provide a foundation for collecting the data in order to analyze it in a more suitable suite of programs.

“Workspaces”

The concept of a workspace here is one that’s really useful for research-type activities. Workspaces are intended to facilitate research threads, like someone might use for research/planning on a novel, or for genealogical research. They can be populated manually, via search, or via a saved search, like “smart search” folders in many email clients. The goal is to tie together cross-referencing ability, display, and note-taking in a way that existing tools like OneNote and Scrivener don’t.

They’re functionally not that different from a single taxonomic point view, but I see them as a potential extension on top of that view that provides some ability to multi-view, take notes and have them be automatically stored w/ relevant “related document” links, etc. A workspace is what I would have used while researching this post (for example, I re-read “As We May Think” and read most of Engelbart’s “Augmenting Human Intellect”), and I would have taken notes on each of the papers, maybe even kept the blog post draft in the same workspace (remember, “wiki” style pages are just Markdown!).

Mental models and techniques for this system

This is an area that I struggled to write, because there is an element of idiosyncrasy to this. Part of what makes this system suited to me is that it is inherently compatible with my mental models and approaches to memory. The biggest thing is that none of this is designed to require memorization. Quite to the contrary, it is designed to aid you in not having to memorize prodigious amounts of information—you shouldn’t feel the need to resort to any memorization techniques or practices in order to use this system. What you will find helpful is the ability to free-associate, and to explore drawing “conclusions” quickly. In general, my brain seems to be tuned for rapidly making connections between various pieces of information, and that would be principally helpful in composing searches in this system, as well as exploring related documents.

However, even that is probably limited in its utility, as the system is somewhat designed to help you build up that ability to rapidly associate between information that you already know. After all, the taxonomic systems, intrinsic metadata, and content extraction are designed to work with the little bits and pieces that you do remember, and help you find the entire document (and all the related documents). Fragmentary pieces such as “a tax document from 2019” should be enough to reduce the search space to something that you can brute-force. “a tax document from my employer” should be sufficient to narrow down to the exact document or handful of documents that you need.

In fact, the most useful technique and mental model for this system? Get yourself in the habit of committing every potentially useful piece of information/context to the taxonomy. The more richly you describe data when loading it into the system, the less of it you have to remember.

Conclusion

Whew. We’re a touch over six thousand words in, and I’ve finally reached the end. My broad conclusion here is that most existing systems for knowledge management fall short, in fairly predictable, consistent ways. Each of these ways is intrinsic to the “type” of system! You could almost say it’s each category’s nature to fail in one of these ways:

They’re too oriented toward “business process” integration, and don’t support the capture of evolving knowledge, except when it can be locked into their hierarchical, discretized model. These are your enterprise-y document management type systems (PaperPort, OnBase, DocumentCloud, etc.). They sometimes deliver on the full-text search aspect, because businesses tend to have needs for that, and they’re usually strong on storage (OnBase I know has a lot of flexibility here, that is genuinely useful to businesses), but taxonomic organization is not a concern for them.
They’re too oriented toward pure storage, data longevity, and archiving. This is Perkeep and similar “long-term” archival systems. It’s not that archiving and longevity are unimportant (like I said: nobody likes to lose data), but to me, it sort of misses the forest for the trees. I rarely want to store data merely for the sake of storing it, instead I want to derive insights, augment my memory, or have an “auditable”⁸ record of some event. Maybe there are systems out there that combine the two, but I haven’t seen one yet.
They’re too oriented toward plain text, and lack support for bringing non-plain text documents into their sphere. Wikis fall into this category, and they’re great if you’re able to fit everything you want to record in a wiki. They’re frequently imperfect in their UX, but for the key cases of capturing evolving data and capturing operational knowledge (about your business, about your life, etc.), they tend to be the best tool for the job.

Everyone who uses existing knowledge management systems suffers for this:

Enterprises end up with a slapdash house of cards, suffused with inconsistent process, and half-used features. I know I’ve seen enough Confluence pages with attached Excel spreadsheets, or PDFs, or Word documents, to last me a lifetime. Meanwhile, some older documents live on SharePoint, which does a passable job with search on PowerPoints, PDFs, etc., but butchers wikis so badly that I’m surprised there hasn’t been a class action lawsuit. They’ll never be migrated to Confluence, because to do so would actually be a functional regression. Users of these enterprise systems suffer because finding a canonical reference involves searching at least two, if not more, disparate systems.
“Regular” users suffer because they never have a system that integrates their operational knowledge with their snapshotted knowledge. Finding, say, “all the tradesperson invoices from 2020” ranges from “a few lookups” in the best case where you’ve already built a hierarchy around these concepts (Invoices → Tradespeople), to “manually trawl through everything trying to remember names” if you haven’t. And even if you’ve already built the hierarchy, you’re likely to have issues eventually: even highly organized people slip up without the benefit of a consistent, computer-aided process. Take commercial plane flight for example: their process is highly aided by checklists and computers, and as a result, flying is the safest way to travel (per mile traveled), and it isn’t close⁹.

Given the advancements in the technology (technological solutions exist for all the functionality I’ve outlined) and the theory (I don’t believe anything I’ve said here is a completely novel approach/idea), there’s no reason that a modern system should make so much distinction between operational (evolving) knowledge (in the form of wikis) and snapshotted or frozen knowledge (in the form of fixed non-plain text documents). Better and easier knowledge organization would allow people to operate more efficiently in their business lives and their personal lives. We can, and should, do better.

A CORE (hereafter, just “a core”) is basically a ledger and processing center for bank accounts. In our particular case, we’re replacing our “Deposits” core, which houses all our deposit accounts (checking, savings, etc.), and in an odd edge case, also houses residential mortgages and certain types of revolving/installment loans (like HELOCs). This is, in no uncertain terms, a big fucking deal that requires a lot of business and technology work and coordination to execute. The current system has been in service since the 80s, and is predictably deeply integrated. ↩
PDF form of scanned original article, with the section titles in place. ↩
Text link, missing section title, but much more readable. ↩
The word hadn’t been coined in 1945 yet, and wouldn’t be until 1968 by Phil Bagley. ↩
The author remembers wikis starting to be used for game guides in 2000/2001, and using Wikipedia for “research” in high school, probably around 2004. ↩
One of these clever tricks involves the gross abuse of hardlinks. I’ve in fact implemented this once before, and it works the following way: each file is stored in a single central directory, that’s normally-hidden from the user. Directories are created that represent each tag/category as needed—even hierarchy can be represented this way (see the section on hierarchy in tagging). Then, each file is hard-linked into all of the hierarchy locations it belongs in (because things can be in multiple hierarchies/tags/categories!). From the user’s perspective, everything is in the right places, but we reduce the amount of painful disk space use. ↩
This can, and might frequently be, different from the creation date of the file—think documents that are scanned months/years after their creation as paper/physical documents. ↩
In the sense of able to be trawled through, not any auditing like certificate transparency, blockchains, etc. ↩
0.2 deaths per 10 billion passenger-miles for air flight. 150 deaths per 10 billion passenger-miles for driving. ↩

Including collections in Jekyll archives

2020-04-05T11:42:47-04:00

Recently, I decided to upload all my recipes onto my blog, as a convenient way to share them, any modifications I made, and the original source. It also allows me to have a “backup” of them, as the tool that I wrote also exports Paprika’s “importable” format (really just gzip + JSON).

The best way to do this was as a Jekyll collection, which lets me neatly keep them separate from the actual blog posts. However, I wanted my recipe categories to be used by jekyll-archives as part of tag generation. Normally, this is not supported, but Ruby monkey-patching allows me to commit the following crime in a Jekyll plugin (appropriately called fixup-recipe-tags.rb):

require "jekyll-archives"
require "jekyll"

module Jekyll
    module Archives
        class Archives
            alias_method :old_tags, :tags

            def collection_tags(collection_name)
                hash = Hash.new { |h, key| h[key] = [] }
                @site.collections[collection_name].docs.each do |p|
                    p.data["tags"]&.each { |t| hash[t] << p }
                end
                hash.each_value { |posts| posts.sort! }
                hash
            end

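            def tags
                # Collections to fold into the tag archives, read from the
                # jekyll-archives section of _config.yml (none if unset).
                collections_to_tag = @config['collections'] || []

                merged_tags = @site.post_attr_hash("tags")
                collections_to_tag.each { |collection|
                    # When a tag exists in both, concatenate the two document lists.
                    merged_tags = merged_tags.merge(collection_tags(collection)) { |key, v1, v2| [v1,v2].flatten }
                }
                merged_tags
            end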
        end
    end
end

It’s probably not idiomatic Ruby, but it allows me to add the following to the jekyll-archives section of _config.yml:

collections:
  - recipes

With that, the recipes are used for tags, but they’re not emitted into the category pages or date-based archives. They’re also not emitted into my “tagged” page, because that only works with the posts collection.

Making money via sloppy record keeping

2019-10-08T21:27:58-04:00

TL;DR: We leased a car and transferred the lease to someone else within the allowed period. GM Financial, due to their sloppy record keeping, continued to think that we still held the lease and sent us a $1200 bill that I had to call them to get cancelled. My thesis: this sloppy record keeping probably generates a little bit of revenue, because if the bill is small enough, some folks will just pay it instead of contesting it.

Back in August 2016, we had earlier that year plunked down the $1000 to hold our spot in line for a Tesla Model 3, and figured we could drive our stoic old Camry until it hit the ground—we expected that would be at least a few years, since it had been running without any issues up until then.

Then, this happened. I was driving to work, with a trunkful of monitors and Hue doodads that a friend was buying off me, and got broadsided by an unlicensed driver who was driving a rental car they did not have permission to drive. The scene was sufficiently confusing to the responding officers (in a stroke of luck, this was down the street from Boston police headquarters) that they asked me to stick around just in case they needed to arrest the other driver (they did not).

Our venerable Blueberry unfortunately did not make it out—the damage was so severe that a repair would have cost nearly $12,000, and that was using the cheapest parts that the adjuster could find on the market (scrapyard parts, etc.). The true cost likely would have been closer to $15,000, and our insurance, rightfully, was not going to pay that. They paid us out the ~$7,000 value of the car, and went off to recover from the other involved parties. Part of me wishes we were still with them so I could ask how that all went—I imagine the attorneys earned their hourlies on that one.

We needed a new car PDQ: we were able to borrow from my parents for a few weeks, but we needed a long term ride. We tested a few things, did some research, and ended up with a Chevy Volt. We were big fans of most things about the car and ended up signing up for a 3 year lease, with a great rate thanks to some shrewd negotiating (read: we walked out on the original bullshit rate). We figured this would be a great way to ease into electric vehicles while having the ready backup of gas if we needed it, and that we would cross the bridge of getting rid of the lease when that time came.

That time came in July of 2018 when we priced out and paid for our Model 3. Luckily, we found someone to take over the lease, got everything done in record time, and got the lease assumption in under two wires:

We were leaving the country for a 3 week vacation
We were approaching <12 months left on the lease, at which point it could not be transferred anymore.

I continued to get mail from GM Financial/the dealer we leased through, but chalked it off as marketing. The bills were getting paid by the new lessee, and everyone was happy.

The lease ended about a month ago, and the new lessee returned the car—they were told that there would not be any additional fees for excess wear and tear/mileage, etc., and relayed that information to me as a courtesy. Imagine my surprise when last week, I receive a letter from GM thanking me for returning my vehicle, and asking me to pay $1200 in disposition fees, excess wear and tear fees, property taxes, and “other fees.” After a quick back-and-forth with the new lessee, I decided to call up GM Financial.

The good news is, once I got them on the phone, they were very pleasant! I was put on hold briefly while the Lease End specialist talked to the Lease Assumption specialist, and they came back with a “please disregard that bill, we see here that the lease was assumed.”

This got me thinking, though: how often does this happen for smaller dollar amounts, and the person receiving the bill does not contest it?

I suspect it’s not a lot—after all, the lease disposition fee was around $350, and the only small amount on the listing was the “other fees and taxes,” which dialed in around $50.

However, if this sort of sloppy record keeping is the standard, they must be sending out a ton of these letters. How many need to “hit” and be paid without question in order for the sloppy record to become a pathological target for the department to hit, because it makes money? Otherwise, what’s the incentive to not fix this? It certainly costs money when I call in to support (GM Financial’s help line has always been staffed by American staff, so they’re not paying low offshore costs), so why not fix the process so that they don’t ever have to deal with this again?

Mounting old Synology volumes in new hardware

2017-05-13T00:00:00-04:00

I recently upgraded my Synology NAS from a DS 214play that I’ve had for a few years to a DS 1515+, and bought two additional drives to go along with it. I also wanted to do a fresh start of the configuration and metadata, as I had been having some issues with my existing NAS (in addition to the performance issues that drove me to upgrade), so I did not want to just move the existing drives over and add the new ones to the same array.

I set up the 1515+, and started copying files over using SFTP mounts–however, I was getting abysmal speeds, 10-11 MB/s at best. Both pieces of hardware were connected to a gigabit network, neither was doing anything else at the time, but transfers were incredibly slow. Not wanting to wait a few days to transfer 3 TB of data, I set out to find a better way to transfer.

I knew Synology’s “Hybrid RAID” was just a Linux software RAID, which meant I should be able to mount it in the new Synology as well. I started by doing some exploring with mdadm, checking that the array was not degraded for some reason, etc. However, I couldn’t simply assemble it and mount it—under the software RAID is an LVM volume group. I started by dumping some state about the volume groups:

root@Hagal:/mnt# vgdisplay
  WARNING: Duplicate VG name vg1000: Existing zkpaaW-zNCs-xB1u-E0nW-afok-Up0N-uAhtB7 (created here) takes precedence over ijfPFm-3l2P-55UC-YzAt-Ps3h-K45B-TdTLHT
  WARNING: Duplicate VG name vg1000: Existing zkpaaW-zNCs-xB1u-E0nW-afok-Up0N-uAhtB7 (created here) takes precedence over ijfPFm-3l2P-55UC-YzAt-Ps3h-K45B-TdTLHT
  WARNING: Duplicate VG name vg1000: zkpaaW-zNCs-xB1u-E0nW-afok-Up0N-uAhtB7 (created here) takes precedence over ijfPFm-3l2P-55UC-YzAt-Ps3h-K45B-TdTLHT
  --- Volume group ---
  VG Name               vg1000
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  2
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                1
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               3.63 TiB
  PE Size               4.00 MiB
  Total PE              952682
  Alloc PE / Size       952682 / 3.63 TiB
  Free  PE / Size       0 / 0
  VG UUID               ijfPFm-3l2P-55UC-YzAt-Ps3h-K45B-TdTLHT
  WARNING: Duplicate VG name vg1000: zkpaaW-zNCs-xB1u-E0nW-afok-Up0N-uAhtB7 (created here) takes precedence over ijfPFm-3l2P-55UC-YzAt-Ps3h-K45B-TdTLHT
  --- Volume group ---
  VG Name               vg1000
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  2
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                1
  Open LV               1
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               3.63 TiB
  PE Size               4.00 MiB
  Total PE              952682
  Alloc PE / Size       952682 / 3.63 TiB
  Free  PE / Size       0 / 0
  VG UUID               zkpaaW-zNCs-xB1u-E0nW-afok-Up0N-uAhtB7

Aha! vgdisplay is telling me what I want to know already: I have a duplicate volume group, and the existing one that was created here (the new NAS’s VG) is taking precedence over the old one. Armed with the UUID there, I can rename the old VG:
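Since the rename hinges on picking the right UUID, here is a small Python sketch that pulls the shadowed UUID out of those warnings instead of copying it by eye. The regular expression is written against the warning wording in the output above. Other LVM versions may phrase it differently, so treat the pattern (and the `shadowed_vgs` helper name) as an assumption.

```python
from __future__ import annotations

import re

# Matches both warning variants seen above, with and without the word "Existing".
WARNING_RE = re.compile(
    r"Duplicate VG name (?P<name>\S+): (?:Existing )?(?P<active>\S+) "
    r"\(created here\) takes precedence over (?P<shadowed>\S+)"
)


def shadowed_vgs(vgdisplay_output: str) -> set[tuple[str, str]]:
    """Return (vg_name, shadowed_uuid) pairs found in duplicate-name warnings."""
    return {
        (match.group("name"), match.group("shadowed"))
        for match in WARNING_RE.finditer(vgdisplay_output)
    }


sample = (
    "WARNING: Duplicate VG name vg1000: Existing zkpaaW-zNCs-xB1u-E0nW-afok-Up0N-uAhtB7 "
    "(created here) takes precedence over ijfPFm-3l2P-55UC-YzAt-Ps3h-K45B-TdTLHT\n"
)
for name, uuid in shadowed_vgs(sample):
    print(f"{name} is shadowed under UUID {uuid}")
```

The UUID it reports is the one to hand to vgrename, since that is the volume group that lost the name collision.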

root@Hagal:/mnt# lvm vgrename ijfPFm-3l2P-55UC-YzAt-Ps3h-K45B-TdTLHT vg1001
  WARNING: Duplicate VG name vg1000: Existing zkpaaW-zNCs-xB1u-E0nW-afok-Up0N-uAhtB7 (created here) takes precedence over ijfPFm-3l2P-55UC-YzAt-Ps3h-K45B-TdTLHT
  WARNING: Duplicate VG name vg1000: Existing zkpaaW-zNCs-xB1u-E0nW-afok-Up0N-uAhtB7 (created here) takes precedence over ijfPFm-3l2P-55UC-YzAt-Ps3h-K45B-TdTLHT
  WARNING: Duplicate VG name vg1000: zkpaaW-zNCs-xB1u-E0nW-afok-Up0N-uAhtB7 (created here) takes precedence over ijfPFm-3l2P-55UC-YzAt-Ps3h-K45B-TdTLHT
  Volume group "vg1000" successfully renamed to "vg1001"

Once it’s been renamed, the next step is to activate the VG so that it gets a /dev entry and becomes mountable:

Once we’ve activated it, we can mount it via its /dev entry, and we can see our entire main storage volume there:

root@Hagal:/mnt# mount /dev/vg1001/lv test/
root@Hagal:/mnt# ls test/
@appstore     @autoupdate  camera-upload  comix      downloads  homes           logs        music   Plex          @tmp   videos
aquota.group  backups      @cloudstation  @database  @eaDir     @img_bkp_cache  lost+found  photo   @S2S          tv     web
aquota.user   books        @cloudsync     @download  games      lightroom       movies      photos  synoquota.db  video

Once the volume group was mounted, I could copy files much faster than copying over the network allowed me—100+ MB/s vs. 10-11.
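For a sense of scale, here is the back-of-the-envelope arithmetic behind those numbers as a short Python sketch. It assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and sustained throughput, which real transfers rarely manage. The `transfer_hours` helper is a name made up for this example.

```python
def transfer_hours(total_bytes: float, bytes_per_second: float) -> float:
    """Hours needed to move total_bytes at a constant rate."""
    return total_bytes / bytes_per_second / 3600


TOTAL = 3e12  # roughly 3 TB of data, decimal units assumed

for label, rate in [
    ("SFTP over the network, 10 MB/s", 10e6),
    ("SFTP over the network, 11 MB/s", 11e6),
    ("local copy, 100 MB/s", 100e6),
]:
    hours = transfer_hours(TOTAL, rate)
    print(f"{label}: {hours:.1f} hours ({hours / 24:.1f} days)")
```

Under those assumptions the network copy works out to a bit over three days, while the local copy finishes in a little over eight hours.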

Important notes:

None of these operations should cause data loss, but I am not responsible for any data loss that may occur if you follow my instructions!
Be careful when copying the UUID for a rename.
More complex Synology setups may not work this easily—I did everything assuming you set up a single volume group, all the drives are in the same RAID array, etc.
- That said, you should be able to use these same tools on more complex setups, just with more care taken to find the right volume groups.
I had to reboot to get the drives back into a state where I could erase them and add them to an existing volume.
I would suggest not rebooting your Synology with the old drives plugged in—it is likely to pick up the old volume as a new volume and re-shuffle your volumes and shared folders.
These instructions should work on anything that supports LVM/mdraid and the filesystem on the drives (ext4 or btrfs).

Good luck!

Bringing Rust to C#: Oxide and Oxide.Http

2017-05-05T00:00:00-04:00

Rust is a language I’ve admired for a long time now, from a slight distance. I’ve read about the borrow checker, perused the standard crates, and read up on Cargo and the way that Rust applications and libraries are built, tested, and shipped. I appreciate its striving to be a systems-level language that also cares about safety and developer productivity.

I haven’t written as much Rust as I’d like to (though I did start a few small projects here and there), but that didn’t stop me from thinking that maybe some of its standard library features have a place in the C# world. I found myself particularly fond of the Option and Result types, and their ability to better the flow of my code. Option’s API is Nullable on steroids, and Result provides an elegant way to express an error that doesn’t require using out parameters or custom exceptions, while at the same time providing a delightful API that lets you build processing pipelines that preserve errors and lazily evaluate steps.

The adventure started one afternoon about a month ago, when I decided I wanted to see if I could implement Option in C#. I knew I wanted to preserve as much of the Rust API as made sense, including the simple construction of Some and None as function calls: Some(5), None(), etc. I opened up Workbooks (use what you know, right?) and started hacking away. After a little while, I had my first pass at the Option API surface, which I stuffed in a Gist. Within a few hours, I decided to make it into a library called Oxide (after all, what else is Rust?).

My initial commit brought in the API almost exactly as it was in the Gist. Over the rest of the day, I refined the API slightly, added a ton of tests (inspired by the Rust documentation’s example assertions), and wrapped up. A week later, I decided to add Result, which I implemented largely the same way (a base class, with derived Ok and Err classes).
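To show the shape of that design (a base class with derived Ok and Err classes, and pipelines that preserve errors), here is a minimal sketch. It is written in Python as a stand-in and is not Oxide's actual C# API: the method names `and_then` and `unwrap_or` are borrowed from Rust, and `parse_int` and `reciprocal` are hypothetical steps invented for the example.

```python
class Result:
    """Base class; every value is either an Ok or an Err."""

    def and_then(self, fn):
        raise NotImplementedError

    def unwrap_or(self, default):
        raise NotImplementedError


class Ok(Result):
    def __init__(self, value):
        self.value = value

    def and_then(self, fn):
        # Run the next step, which must itself return a Result.
        return fn(self.value)

    def unwrap_or(self, default):
        return self.value


class Err(Result):
    def __init__(self, error):
        self.error = error

    def and_then(self, fn):
        # Skip the step entirely and carry the original error forward.
        return self

    def unwrap_or(self, default):
        return default


def parse_int(text):
    try:
        return Ok(int(text))
    except ValueError:
        return Err(f"not an integer: {text!r}")


def reciprocal(n):
    return Ok(1 / n) if n != 0 else Err("division by zero")


print(parse_int("4").and_then(reciprocal).unwrap_or(None))     # 0.25
print(parse_int("four").and_then(reciprocal).unwrap_or(None))  # None
print(parse_int("0").and_then(reciprocal).unwrap_or(None))     # None
```

Because `Err.and_then` never calls its argument, later steps are skipped once something fails, and the first error survives to the end of the chain without out parameters or exceptions leaking through.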

Since then I’ve refined the API for both, added a priority queue implementation, added a small library of HTTP helper methods (Oxide.Http), received my first external contribution from Jérémie Laval who contributed a very nice set of convenience methods to enable async/await with Option, and finally published a NuGet (when I was forced to by wanting to use Oxide in another project but wanted to avoid submodules).

I hope to keep working on Oxide—there will probably be more APIs that I would like to borrow from Rust, or more functional/Rust-inspired API that would be useful for C# developers. Contributions of all kinds are welcome: bug reports, feature requests, documentation, etc. You can find Oxide on GitHub—please use the issue tracker there for everything. :)

iOS 10, CPBitmap, and you

2016-12-22T00:00:00-05:00

Editor’s note: this is an old post that I’ve published now. I’ve since found that CPBitmap files do contain a binary plist at the end, but it was not in the exact location described by most blog posts. I’ve got a bit of code written, but I’m not 100% happy with it yet, so it hasn’t been published!

For a long while, I’ve been using a photo of my cat Zooey as my iPhone’s background image. Recently, I wanted to replace it with a different one, but the picture of Zooey wasn’t anywhere on my phone. iOS doesn’t come with a way to save the background picture, but I figured it couldn’t be that difficult. It had to be somewhere on the phone, or in a backup, in some reasonable format—after all, my phone has to display it!

My first step was to start looking for where the file is on iOS—turns out it’s in /var/mobile/Library/Springboard/LockBackground.cpbitmap by default. If your phone is jailbroken, there are tools you can use to access the file, but my phone is not, so that was right out.

Luckily, with an unencrypted iTunes backup (iTunes backups preserve background images!), and a handy tool called iExplorer, I was able to find and extract LockBackground.cpbitmap. With this in hand, I set out to find what the format was, so that I could retrieve my image of Zooey.

The first thing I ran into was a reference to a converter service that someone had published years ago at http://cpbitmap.cleverbyte.com.au/. This is no longer up, but the same person had published the code to a CodeProject article. I downloaded the code, fired up Visual Studio, ran it, and attempted to run it on my file. It crashed, and the file format didn’t seem to match at all.

The next thing I found was many variations on a Python script that used the Python Imaging Library to extract the image, after skipping what the script claimed to be a binary plist header. None of these worked either: they almost all crashed after producing nonsensical image sizes (they were reporting image sizes of 40-60k pixels per side, while iPhone 7 background images are 750x1334).

After this, I started to look at the raw data itself, hoping to divine some patterns. The first thing I saw was that the file was not any container format. There was no magic number at the beginning—it was not a BMP, PNG, JPG, TIFF, binary plist, or anything that I or file(1) recognized. I started wondering if maybe this was not raw RGB data—in retrospect, I should have thought of this earlier: iOS would prefer to blit this file to GPU memory as fast as possible, and decoding a graphics format would just be a waste of time.

After some playing around with our Workbooks product, I discovered that what I had on my hands was RGB data—BGRA32 data, to be precise. Yet when I created images from it, they were…wrong. You can see the broken image here—it’s immediately obvious there’s some sort of “misread” in the pixels.

I’m not a seasoned graphics pro, so it wasn’t immediately obvious to me, but I tweeted about it and almost immediately got a message from Larry that my issue was likely a row stride mismatch somewhere (shortly followed by another Larry delivering the same message via tweet). After some discussion, Larry Ewing suggested that the image might be 8-byte aligned w/ some 0-padding for easy blitting to GPU/SIMD. I had been using a stride of 3000 (4*750)—I adjusted it to 3008 (the next multiple of 8), and got the correct image!

A sharp observer might point out now that the image was already 8-byte aligned before—after all, 750*4 is 375*8. My guess is that they’re padded not only for alignment purposes, but also because iOS may not always be storing 750-pixel wide images here. There may be a case where Apple is using the padding to both indicate the end of a row and to pad it for easy manipulation, with no visible changes (the extra 2 pixels won’t show up on screen).

I’m hoping to throw together a little bit of publishable code to decode known CPBitmap formats into something useful, so I would love to get my hands on more samples of CPBitmap files—it would be interesting to see if/how the format has changed with iOS version. If you happen to have an older iOS version installed and can dump the files, please upload them somewhere and send me the link!

Security Recipes in X

2016-07-04T20:11:52-04:00

A while ago, Barry Dorans (who works on ASP.NET security at Microsoft) tweeted that he was working with the Roslyn team to build security analyzers. In particular, security analyzers based on commonly seen mistakes on Stack Overflow, for example:

ServerCertificateValidationCallback always returning true
Cut-and-paste AES crypto with a deterministic IV

This got me thinking that what would be great is a collection of articles/code/blog posts that helps developers of all walks make good choices with regard to implementing security primitives. It would start from the basics (teaching how to hash, etc.) and move on to symmetric cryptography, asymmetric cryptography, TLS, etc. This could be a resource that could be linked to from Stack Overflow, referenced on Twitter, or used as a teaching tool.

To that end, I started exactly such a thing today! I’m structuring it roughly as a book right now, with chapters covering broad concepts (for example, chapter 1 is “Hashing” right now), and sections within that chapter covering subtopics (so far, I’ve only written one section, on using hashes to verify file content integrity).

The idea is to write in a conversational, approachable style, and provide each topic in digestible chunks, without going into excruciating detail about implementations. Developers do need to know which algorithms are recommended, but do not need to know about S-boxes, hash rounds, XOR shifts, and other details of how the algorithms are implemented.

I would like to eventually provide implementations in multiple languages. I started with C# because it’s what I’m most comfortable with, but eventually it would be great to have Ruby, Rust, C, Go, JavaScript, and others represented.

I’ve created a GitHub repository that has what I’ve done so far. The code is licensed under the MIT license, and non-code pieces (i.e., the prose that constitutes each chapter) are CC-BY-NC-SA 4.0. Any contributions are welcome: new languages, fixes for mistakes I’ve made (either in code or in prose), etc.

I’m still refining how the prose and code are structured. I like what I’ve done so far, with each section having code inline and a separate file containing all the code without the prose for easy digestibility, but it may not make as much sense to do that for languages that have better inline Markdown/code features (Jupyter notebooks, etc.).

Comments? Suggestions? Complaints? Find me on Twitter, or open up a GitHub issue on the repo!

Writing Hubot scripts using ES6+

2016-02-15T03:22:22-05:00

Since I discovered it shortly after moving into the world of Hipchat (and later Slack) from the world of IRC, Hubot has been one of my favorite tools to make my life better. I’ve always enjoyed ChatOps, from the early days when we simply called it “writing eggdrop scripts,” and Hubot brought ChatOps into the modern age with its infinite flexibility and common platform that everyone could build on top of.

Hubot itself is written in CoffeeScript, and traditionally, most scripts have also been written in CoffeeScript. Unfortunately, I don’t like CoffeeScript much—I’ve always found it to be an ill-fitting crutch for Ruby developers who didn’t want to learn JavaScript, lest they cut themselves on the sharp edges of braces. Meanwhile, ES6 (ES2015, really, but I’m set in my ways…) has brought some really nice things to JavaScript development. I’m not going to list any here, but take a look at kangax’s ES6 compatibility table for an exhaustive list of everything ES6 brings to the table.

I’ve been slowly converting the scripts we use internally at Xamarin to at least be written in JavaScript, if not ES6—there hasn’t really been any good guidance on how to plug ES6 scripts into Hubot until recently, and what there was seemed like it was only half of the story. Today I sat down and figured out what needed to be done to make ES6 scripts automatically work.

Step 1: Install a few packages

Install babel-register, babel-preset-es2015, and babel-plugin-add-module-exports. Visit the respective module sites to learn more about them, but the short story is that the first two will make sure Babel works, and the last package makes sure Babel exports CommonJS-style defaults so that simple require calls will work.

Step 2: Create a .babelrc

At the top-level of your Hubot repo, create a .babelrc file with the following contents:

{
  "presets": [ "es2015" ],
  "plugins": [ "add-module-exports" ]
}

This will enable the ES6 preset and the module.exports plugin you installed earlier.

Step 3: Make sure Babel gets loaded early

Create a script that will always be loaded first; I chose to name mine 000-import-es6.js. You can also make this a CoffeeScript script if you’d like, but I stuck with plain old JavaScript. The contents should look like this:

require("babel-register");
module.exports = function es6(robot) {};

The function export is not, in fact, required—it just makes Hubot shut up about expecting a function but receiving an object when checking module.exports.

Step 4: Profit!

You can now write scripts using ES6—all of the features are available to you to use. Put your scripts in the standard location for Hubot and they will happily be loaded and compiled at runtime—you’ll still get correct line numbers in stack traces though, for which I am infinitely thankful.
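For instance, a script along these lines should load once the steps above are in place. This is only a minimal sketch: the file name and the greet command are made up for illustration, and it sticks to module.exports for the final export so that it also runs under plain Node without Babel.

```javascript
// scripts/greet.js -- hypothetical example; the file name and the "greet"
// command are illustrative and do not come from a real Hubot install.

// ES6 default parameter, arrow function, and template literal.
const greeting = (name = "stranger") => `Hello, ${name}!`;

// Hubot calls the exported function with the robot instance.
const greet = (robot) => {
  robot.respond(/greet(?: (\w+))?/i, (res) => {
    // Array destructuring: skip the full match, take the first capture group.
    const [, name] = res.match;
    res.send(greeting(name));
  });
};

// With the .babelrc from step 2, `export default greet` would be the
// ES6-module way to write this line; module.exports keeps the sketch
// runnable without Babel.
module.exports = greet;
```

Asking the bot to "greet Ada" would reply with a greeting for Ada, and a bare "greet" falls back to the default name.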

For Module Authors

If you’re authoring a Hubot module outside of your Hubot source tree, the process is almost exactly the same—at step 3, instead of creating a 000-import-es6.js file, you can create an index.js in the root of your package, with contents similar to this:

require("babel-register");
var realDefault = require("./src/foo");
module.exports = realDefault;

A possible alternate solution is to require babel-register, then export a function that uses Hubot’s robot.loadFile method to load your actual script entry point—I haven’t tried this, so I don’t know how well it would work, but I suspect it would be just fine.

Sports For Nerds: Gridiron Football, Part 1

2015-11-02T04:43:40-05:00

In my original post, I covered some of the reasons why everyone should have a basic understanding of sports and introduced the first sport I’m going to cover: gridiron football, using the American rules.¹

In this post, I’ll cover some of the basics of the game: the field, the ball, the number of players per team, and the procedure of the game. However, first we need to define some units, for non-imperial readers:

1 yard = 3 feet = 36 inches = 0.9144 meters
1 lb = 16 oz = 0.454 kg

Those should be the only relevant units through the entire series—if I find that others are useful, I’ll introduce definitions then.

The Field

A football field is 120 yards long by 53.3 yards wide. 100 yards of the length are taken up by the field of play, with a 10 yard end zone at each end. The end zones are the goal areas—in order to score a touchdown, an offensive player must reach the opposing team’s end zone.

A set of goal posts rises from the end of each end zone. The goal posts are 18 feet, 6 inches wide, and extend 35 feet² in the air above their 10 foot high gooseneck mount—field goal tries and point-after attempts following a touchdown must pass through these posts in order to score points.

The field surface is generally composed of either natural grass or artificial surface, depending on what a particular team wants. Currently, of 31 stadiums in use³, 14 use artificial surfaces, and 17 use natural grass.

The Ball

The ball is a prolate spheroid⁴, 11 to 11.25 inches in length, 28 to 28.5 inches in circumference around the long axis (end to end), and 21 to 21.25 inches in circumference around the middle of the ball. It weighs 14-15 ounces, and is inflated to an internal pressure between 12.5 and 13.5 pounds per square inch, gauge.

The ball is made of 4 panels of cow hide leather, and is tanned brown. Along one seam, laces are inserted in order to provide a grip for holding and throwing the ball.

Game Length

The game consists of 4 quarters of 15 minutes apiece. Each team has 3 timeouts per half. After 2 quarters, there is a 12 minute halftime period. Teams have 40 seconds to execute a play (this is known as the “play clock”), and the game clock generally continues to run contemporaneously with the play clock. However, there are many occasions that stop the game clock: incomplete passes, players out of bounds, certain fouls, etc. We’ll cover these in more detail as they come up in their respective sections.

The Players

Each team normally has 53 players on their day-to-day roster. Players are platooned into offense, defense, and special teams (kicking and kick returning units). In the early days of gridiron football, players often played both ways (offense and defense), but with the advent of player safety rules, this practice fell out of favor, except when necessary due to injury.⁵ Teams are required to choose 7 players to be inactive for every game, bringing teams down to 46 players to be dressed for game days.

Teams may allocate their players as they wish—we’ll cover the typical distribution of players according to position once we cover the positions themselves—not all positions are needed in equal depth.

Despite 46 players being active for any given game, only 11 may be on the field at any time, in any phase of the game. Having more than 11 (even by accident) is a penalty; when on the defense, it advances the offense 5 yards, and when on the offense, it sets them back 5 yards.

Game Procedure

The actual procedure of an NFL game is complex. This section easily doubles the length of all of the others—and that’s without getting into procedural complications like penalties, onside kicks, fake kicks, etc. We’ll cover all of those in due time—first, we need to establish the basics.

Each game starts with a coin toss. The visiting captain calls the toss as it is in the air, and the winner of the toss gets to choose one of the following:

Receive or kick
Which goal to defend
To defer the choice to the second half

The loser of the coin toss gets to make the other choice. Immediately before the start of the second half, each team’s captains must inform the referee of their choices for the second half. The loser of the coin toss gets the first choice, unless the winner elected to defer the choice.

An example may be illustrative. If the coin is tossed, and I win it, I may choose one of the following:

1. To receive the opening kickoff
2. To kick the opening kickoff
3. Which goal I would like to defend
4. To defer the choice to the 2nd half

In the event I choose 4, I simply choose first at the beginning of the 2nd half, rather than the loser choosing first. In the event I choose 3, the loser of the coin toss chooses whether to receive or kick. In the event I choose 1 or 2, the loser of the coin toss chooses which goal they will defend.

After the coin toss, the team which is kicking off first (which team this is depends on the result of the coin toss!) will kick off, and the game will begin. The kick is off a kicking tee, from the team’s own 35 yard line. Ten players surround the kicker, and run down the field as the ball is aloft to “cover” the kick. They may not cross the line of the ball until it is kicked; doing so is a 5 yard penalty⁶ and causes a re-kick. Kickoffs are also not permitted to go out of bounds: an out of bounds kickoff causes the opposing team to automatically be given the ball 30 yards from the spot of the kick⁷.

The team receiving the kick can either choose to attempt to return it⁸, or let the kick go out of the end zone for what is known as a touchback, in which case, they will start play from their own 20 yard line. The game clock will start as soon as the player exits their end zone.

After the kick return, play proceeds as follows: each team has 4 “downs” (attempts) to move the ball 10 yards downfield. Every time they are able to do so, the downs counter resets back to 1, and they are given another 4 attempts. If they are unable to “convert” after 4 attempts, this is known as a turnover on downs, and possession of the ball reverts to the defense. Thus, most teams will not attempt all 4 downs, choosing either a field goal try or a punt after the 3rd attempt, depending on their position on the field (more on this in the special teams and coaching sections).

If a team elects to punt, a designated punter will punt the ball to the opposing team, who may choose to return the punt⁹, elect to take the field position wherever the punt lands (a fair catch), or let it bounce, which can result in a good/bad bounce (toward or away from your goal), the punt going out of bounds¹⁰, or a touchback. Muffed¹¹ punts can also occur, but their results are subject to a high degree of randomness, and we’ll cover them more in-depth in the special teams posts.

If a team is able to score via a touchdown, they are awarded 6 points, and may choose either an extra point try from the 15 yard line¹² for 1 additional point, or may choose to run another play from the 2 yard line, good for 2 additional points if they are able to make it into the end zone. If a team is able to score via a field goal try, they are awarded 3 points, with no possibility for extra points after the try.

Regardless of how a team scores, if they are able to score, the game repeats, starting with the kickoff phase again, following all of the same rules as before. The two teams thus go back and forth until the quarter ends, at which point they switch sides for the next quarter.

After 4 quarters are in the books, the team with more points is declared the victor. In the event of a tie, the rules are as follows: during the regular season, 1 additional 15 minute quarter is played, with a semi-sudden death format. Both teams are guaranteed at least one possession during overtime, unless the team which possesses the ball first scores a touchdown, in which case, they win the game. After each team has possessed the ball once, overtime shifts to a true sudden death format: any points scored by either team will end the game. If after the 15 minute period, neither team has scored, the game is declared a tie. Playoff rules deviate only in that an infinite number of overtime periods may be played. Initial possession of the football in overtime is determined with another coin toss, like the game-opening one.

Conclusion

That should, more or less adequately, cover the very basics of an NFL football game. The field size, players, ball, and game length are more or less set, with only minor effects on the game length possible via penalties and other in-game occurrences. The procedure of the game is much more fluid: penalties and turnovers regularly have a massive effect on the down & distance changes between plays. I’ll cover these more in the next few posts, as I break down the 3 phases of the game (offense, defense, and special teams) more. As usual, feel free to reach out to me on Twitter with feedback/requests for clarification/etc.

¹ Canadian rules exist, but they differ significantly in key areas of the game.
² This was 30 feet, until a recently adopted rule change, originally proposed by Bill Belichick of the New England Patriots following a controversial loss to the Baltimore Ravens in 2012.
³ The New York Jets and New York Giants share a stadium, MetLife Stadium, in East Rutherford, New Jersey.
⁴ An egg.
⁵ The last full-time two-way player in the NFL was “Concrete” Chuck Bednarik, who played center on offense, and linebacker on defense. Bednarik was known as a ferocious tackler, and retired in 1962, after a 13 year career with the Philadelphia Eagles.
⁶ For example, the kick moves from the 35 to the 30.
⁷ A kick from the 35 yard line would then be placed on the opposing 35 yard line.
⁸ In this case, a designated “kick returner” will catch the kicked ball and attempt to run toward the goal with it.
⁹ Much like with kickoffs, a “punt returner” will catch the punted ball and attempt to run toward the goal with it.
¹⁰ An out of bounds punt grants field position wherever it went out of bounds to the team that was on defense.
¹¹ Dropped, or otherwise incorrectly handled.
¹² This is essentially a 33 yard field goal try. The rule previously was to attempt extra points from the 2 yard line, a 20 yard kick. This was changed for the 2015-2016 season to make the extra point try not an automatic conversion. At the time of writing, we’ve seen more missed extra point attempts in 7 weeks of the season this year than we saw in all 17 of last season. The rule is clearly working.
